Create & add custom robots.txt file in Blogger – Crawl & index
Have you ever heard of robots.txt, or have you already added your own custom robots.txt file to your Blogger blog? Here we will discuss what robots.txt is and how to create and add a custom robots.txt file in Blogger. All of this deals with the crawling and indexing of your blog, which affects your blog’s SEO, so take note of this article.
In Blogger you have a section called search preferences that covers your blog’s meta tags, errors and redirections, and crawling and indexing. There you can manage how search engines see your blog, and to maintain a healthy blog you need to customize these search preferences effectively. A few posts back we made an article on how to use Blogger custom redirects. If you have broken links in your blog, or if you have changed a blog post URL, you can make use of Blogger custom redirects. Here let’s look at robots.txt and the use of adding a custom robots.txt file in Blogger.
What is robots.txt?
Search engines like Google send out spiders or crawlers, which are programs that travel all around the web. When these crawlers reach your site, they first read your robots.txt file to check for any rules under the robots exclusion protocol, before crawling and indexing your pages.
Robots.txt is a plain text file, served from the root of a website, that a webmaster uses to advise these crawlers about which pages on the site they may access. Well-behaved crawlers will not crawl the pages that are restricted in your robots.txt file, so those pages will normally not show up with their content in search results. Note that a blocked URL can still end up indexed if other sites link to it, and all of those pages remain publicly viewable to normal visitors.
Each Blogger blog has a robots.txt file that comes by default, and it looks something like the one below. You can check your blog’s robots.txt file by adding /robots.txt after your domain name (http://yoursite.blogspot.com/robots.txt).
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://yoursite.blogspot.com/feeds/posts/default?orderby=UPDATED
So what are these?
As you can see above, the default robots.txt file has a few things like User-agent, Mediapartners-Google, User-agent: *, Disallow, Allow and Sitemap. If you are not aware of these, here is the explanation.
First you need to know about the user agent, which is a software agent or client software that acts on behalf of someone. In robots.txt, a User-agent line names the crawler that the rules below it apply to.
Mediapartners-Google – Mediapartners-Google is the user agent of the Google AdSense crawler, which is used to serve more relevant ads on your site based on your content. The empty Disallow: below it means nothing is blocked for this crawler. If you disallow it, AdSense will not be able to read your blocked pages, and you may not see relevant ads on them.
User-agent: * – Now you know what a user agent is, so what is User-agent: *? A user agent marked with an asterisk (*) applies to all crawlers and robots that do not have their own User-agent section, whether that is a Bing robot, an affiliate crawler or any other client software.
Disallow: By adding Disallow you are telling robots not to crawl the matching pages. Below User-agent: * you can see Disallow: /search, which means your blog’s search results are disallowed by default. Crawlers are kept out of the /search directory that comes right after your domain name. That is, a search or label page like http://yoursite.blogspot.com/search/label/yourlabel will not be crawled, and so it will normally not be indexed.
Allow – Allow: / means you are specifically allowing search engines to crawl everything else on your blog, starting from the home page.
Sitemap: A sitemap helps crawlers find and index all your accessible pages, so the default robots.txt includes a Sitemap line that points crawlers to your blog’s sitemap. You can learn more about Blogger sitemap here. There is an issue with the default Blogger sitemap, so learn how to create a sitemap in Blogger and notify search engines.
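If you want to see these rules in action, you can test them with a small script. This is only a minimal sketch using Python’s standard urllib.robotparser module. The blog address is the placeholder used in this article, the post URL is made up for illustration, and Python’s parser is just an approximation of how Googlebot and other real crawlers read the file.

```python
from urllib.robotparser import RobotFileParser

# The default Blogger robots.txt from this article, one directive per line.
DEFAULT_ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://yoursite.blogspot.com/feeds/posts/default?orderby=UPDATED
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(DEFAULT_ROBOTS_TXT)
SITE = "http://yoursite.blogspot.com"  # placeholder blog address

# Hypothetical URLs, used only for illustration.
label_page = SITE + "/search/label/yourlabel"
post_page = SITE + "/2020/01/some-post.html"

for agent in ("Googlebot", "Mediapartners-Google"):
    for url in (label_page, post_page):
        verdict = "allowed" if rules.can_fetch(agent, url) else "blocked"
        print(f"{agent:22} {verdict:8} {url}")
```

With the default file, a general crawler is kept out of the label page under /search but may fetch the post, while the AdSense crawler may fetch both because its Disallow: line is empty.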
What pages should I disallow in Blogger?
This question is a little tricky, and we cannot predict which pages you should allow and which you should disallow in your blog. You can disallow pages like your privacy policy, terms & conditions, cloaked affiliate links, labels and search results; it all depends on you. However, since you can get some reasonable traffic from search results, it is not recommended that you disallow the labels pages, privacy policy page and TOS page.
How to disallow pages in Blogger using robots.txt
You can ask search engines not to crawl particular pages or posts in Blogger using your robots.txt file.
We don’t see a reason to block search engines from any particular post, but if you wish to do so, just add Disallow: /year/month/your-post-url.html to your robots.txt file. That is, copy the part of your post URL that comes after your domain name and add it to your robots.txt file.
You do the same to disallow any particular page. Copy the part of the page URL that comes after your domain name and add it like this: Disallow: /p/your-page.html in your robots.txt file.
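The copy-the-path step above can also be sketched in a few lines of Python. The helper names disallow_rule and build_robots_txt are made up for this example, and the URLs are the placeholders from this article. The extra Disallow lines are placed before Allow: / because Python’s parser stops at the first matching rule, which is an assumption about this parser only and not about every crawler.

```python
from typing import Iterable
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def disallow_rule(full_url: str) -> str:
    """Turn a full post or page URL into a robots.txt Disallow line."""
    path = urlsplit(full_url).path
    if not path or path == "/":
        raise ValueError("refusing to build a rule that blocks the whole blog")
    return f"Disallow: {path}"


def build_robots_txt(blocked_urls: Iterable[str]) -> str:
    """Build the 'User-agent: *' group with extra Disallow lines added."""
    lines = ["User-agent: *", "Disallow: /search"]
    lines += [disallow_rule(url) for url in blocked_urls]
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt([
    "http://yoursite.blogspot.com/year/month/your-post-url.html",
    "http://yoursite.blogspot.com/p/your-page.html",
])
print(robots_txt)

# Check the result locally before pasting it into Blogger.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("*", "http://yoursite.blogspot.com/p/your-page.html"))
print(parser.can_fetch("*", "http://yoursite.blogspot.com/p/another-page.html"))
```

The first check reports the blocked page and the second an untouched page, so you can confirm that only the URLs you listed are affected.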
The best and recommended robots.txt file for Blogger
Only use a custom robots.txt file if you are 100% sure of what you are doing. Improper use of a custom robots.txt can harm your site rankings. So for best results it is recommended that you use the default robots.txt in Blogger, which works well. The one change worth making is to replace the default sitemap in your robots.txt with your custom sitemap for Blogger.
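As a rough sketch of that recommendation, the snippet below keeps the default rules and only swaps the Sitemap line. The function name recommended_robots_txt and the sitemap address are placeholders invented for this example, so substitute the sitemap URL you actually created for your blog.

```python
def recommended_robots_txt(sitemap_url: str) -> str:
    """Return the default Blogger rules with only the Sitemap line replaced."""
    # A Sitemap line needs a full URL, not a path relative to the blog.
    if not sitemap_url.startswith(("http://", "https://")):
        raise ValueError("Sitemap must be a full URL, not a relative path")
    return (
        "User-agent: Mediapartners-Google\n"
        "Disallow:\n"
        "\n"
        "User-agent: *\n"
        "Disallow: /search\n"
        "Allow: /\n"
        "\n"
        f"Sitemap: {sitemap_url}\n"
    )


# Placeholder only: replace with the sitemap URL of your own blog.
CUSTOM_SITEMAP_URL = "http://yoursite.blogspot.com/your-custom-sitemap"

custom_robots_txt = recommended_robots_txt(CUSTOM_SITEMAP_URL)
print(custom_robots_txt)
```

The printed text is what you would paste into the custom robots.txt box in Blogger.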
How to create and add custom robots.txt file in Blogger
In WordPress we have to create a robots.txt file in Notepad and upload it to the web root directory. But in Blogger you can add the robots.txt file easily from your blog dashboard. To add a custom robots.txt, just log in to your Blogger profile and select your blog. Now head to dashboard >> settings >> search preferences, and you will see custom robots.txt in the crawling and indexing section. Click edit, enable custom robots.txt content and add your robots.txt file.
Once done, click save changes. Now to check your robots.txt, just add /robots.txt at the end of your blog URL and you will see your custom robots.txt file. After adding your custom robots.txt file you can submit your blog to search engines. Learn how to submit your blog to Google, Bing and Yahoo.
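If you prefer to check the saved file from a script instead of the browser, here is a minimal sketch using only Python’s standard library. The function names are made up for this example and the blog address is the placeholder from this article. The fetch function is defined but not called here, since it needs a real blog address and a network connection.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def robots_txt_url(blog_url: str) -> str:
    """Build the robots.txt address from any URL on the blog."""
    parts = urlsplit(blog_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("expected a full URL such as http://yoursite.blogspot.com")
    # robots.txt always sits directly after the domain name.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def fetch_robots_txt(blog_url: str) -> str:
    """Download the live robots.txt so it can be compared with what was saved."""
    with urlopen(robots_txt_url(blog_url), timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


print(robots_txt_url("http://yoursite.blogspot.com/p/your-page.html"))
# To download the file, call fetch_robots_txt() with your real blog address.
```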
We hope this post clearly explained robots.txt and how to create and add a custom robots.txt file in Blogger. Please share it, and if you have any other doubts about Blogger robots.txt, feel free to comment below.