The X-Robots-Tag is an HTTP response header that site owners can set to tell search engine crawlers how to handle a web resource. Whereas the meta robots tag must be embedded in an HTML page, the X-Robots-Tag is delivered in the HTTP response headers, so it can also control indexing of non-HTML files such as PDFs, images, and videos.
Key Directives of the X-Robots-Tag
- noindex tells search engines not to show the resource in search results.
- nofollow tells crawlers not to follow the links in the resource.
- noarchive prevents search engines from showing a cached copy of the resource.
- nosnippet forbids showing a text or video snippet for the resource in search results.
- noimageindex prevents images on the page from being indexed.
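Multiple directives can be combined in one header value, separated by commas. As a minimal sketch, the helper below (the function name `build_x_robots_tag` and the directive whitelist are our own, not part of any library) assembles such a header value:

```python
# Directives recognized in this sketch; real crawlers support a few more.
VALID_DIRECTIVES = {"noindex", "nofollow", "noarchive", "nosnippet",
                    "noimageindex", "none", "all"}

def build_x_robots_tag(directives):
    """Return a (header-name, header-value) tuple, rejecting unknown directives."""
    for d in directives:
        if d not in VALID_DIRECTIVES:
            raise ValueError(f"unknown directive: {d}")
    return ("X-Robots-Tag", ", ".join(directives))

print(build_x_robots_tag(["noindex", "nofollow"]))
# → ('X-Robots-Tag', 'noindex, nofollow')
```

A server would send the resulting pair as a literal response header, e.g. `X-Robots-Tag: noindex, nofollow`.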
Implementation Example for the X-Robots-Tag
If users want to tell search engine spiders (like Googlebot) not to index an entire page on their site, they can send an X-Robots-Tag: noindex header with that page's HTTP response.
Case 1: To prevent search engines from indexing all PDF files on the server, add the following to the .htaccess file.
<FilesMatch "\.pdf$">
Header set X-Robots-Tag "noindex"
</FilesMatch>
With this configuration, search engines that honor the header will not index any PDF file served from the server.
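A crawler (or a quick audit script) decides indexability by parsing the header value from the response. A minimal sketch, assuming headers arrive as a plain dict (the function name `is_indexable` is our own):

```python
def is_indexable(headers):
    """Return False if the X-Robots-Tag header forbids indexing."""
    value = headers.get("X-Robots-Tag", "")
    # Split the comma-separated directive list, normalizing case and whitespace.
    directives = {d.strip().lower() for d in value.split(",") if d.strip()}
    return not ({"noindex", "none"} & directives)

print(is_indexable({"X-Robots-Tag": "noindex, nofollow"}))  # False
print(is_indexable({"Content-Type": "application/pdf"}))    # True
```

A real audit would fetch the URL and read the live response headers; this sketch only shows the decision logic.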
Just ensure the X-Robots-Tag does not conflict with existing meta robots tags or robots.txt rules, and keep crawler instructions consistent. In particular, if robots.txt blocks a URL from being crawled, the crawler never fetches it and therefore never sees its X-Robots-Tag header.
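This conflict can be checked with Python's standard-library robots.txt parser. A sketch with illustrative paths and an assumed example.com domain:

```python
from urllib.robotparser import RobotFileParser

# Parse an illustrative robots.txt that blocks the /private/ directory.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

url = "https://example.com/private/report.pdf"
if not rp.can_fetch("*", url):
    # A noindex X-Robots-Tag on this URL would never be read by the crawler.
    print("Blocked by robots.txt: the X-Robots-Tag header will never be seen.")
```

To have a noindex directive take effect, the URL must remain crawlable; use robots.txt to control crawling, and the X-Robots-Tag to control indexing.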