Continued from page 1
For simplicity, let's assume that you are targeting only two keywords: "tourism in Australia" and "travel to Australia". Also, let's assume that you are targeting only three of the major search engines: AltaVista, HotBot and Google.
Now, suppose you have followed this convention for naming files: each page is named by joining the individual words of the keyword for which the page is being optimized with hyphens. To this is appended the first two letters of the name of the search engine for which the page is being optimized.
Hence, the files for AltaVista are:
tourism-in-australia-al.html
travel-to-australia-al.html
The files for HotBot are:
tourism-in-australia-ho.html
travel-to-australia-ho.html
The files for Google are:
tourism-in-australia-go.html
travel-to-australia-go.html
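The naming convention above can be sketched in a few lines of Python (the keyword and engine lists are the ones from this example; the two-letter suffix is taken from the engine's name):

```python
# Build page filenames: keyword words joined by hyphens,
# plus the first two letters of the target search engine's name.
keywords = ["tourism in Australia", "travel to Australia"]
engines = ["AltaVista", "HotBot", "Google"]

filenames = [
    kw.lower().replace(" ", "-") + "-" + engine.lower()[:2] + ".html"
    for engine in engines
    for kw in keywords
]

for name in filenames:
    print(name)  # e.g. tourism-in-australia-al.html
```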
As I noted earlier, AltaVista's spider is called Scooter and Google's spider is called Googlebot.
A list of spiders for major search engines can be found here.
Now, we know that HotBot uses Inktomi and from this list, we find that Inktomi's spider is called Slurp.
Using this knowledge, here's what the robots.txt file should contain:
User-Agent: Scooter
Disallow: /tourism-in-australia-ho.html
Disallow: /travel-to-australia-ho.html
Disallow: /tourism-in-australia-go.html
Disallow: /travel-to-australia-go.html

User-Agent: Slurp
Disallow: /tourism-in-australia-al.html
Disallow: /travel-to-australia-al.html
Disallow: /tourism-in-australia-go.html
Disallow: /travel-to-australia-go.html

User-Agent: Googlebot
Disallow: /tourism-in-australia-al.html
Disallow: /travel-to-australia-al.html
Disallow: /tourism-in-australia-ho.html
Disallow: /travel-to-australia-ho.html
When you put the above lines in your robots.txt file, you instruct each search engine not to spider the files meant for the other search engines.
When you have finished creating the robots.txt file, double-check it to ensure that you have not made any errors anywhere in it. A small error can have disastrous consequences: a search engine may spider files that are not meant for it, in which case it can penalize your site for spamming, or it may not spider any files at all, in which case you won't get top rankings in that search engine.
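Beyond reading the file carefully, you can also sanity-check its logic programmatically. Here is a sketch using Python's standard urllib.robotparser module, with the Googlebot record from the rules above (the Scooter and Slurp records can be checked the same way):

```python
from urllib.robotparser import RobotFileParser

# The Googlebot record from the robots.txt rules above.
robots_txt = """\
User-Agent: Googlebot
Disallow: /tourism-in-australia-al.html
Disallow: /travel-to-australia-al.html
Disallow: /tourism-in-australia-ho.html
Disallow: /travel-to-australia-ho.html
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Googlebot may fetch the page optimized for Google...
print(rp.can_fetch("Googlebot", "/tourism-in-australia-go.html"))  # True
# ...but not the pages meant for AltaVista or HotBot.
print(rp.can_fetch("Googlebot", "/tourism-in-australia-al.html"))  # False
print(rp.can_fetch("Googlebot", "/travel-to-australia-ho.html"))   # False
```

This catches logical mistakes, such as accidentally blocking a spider from its own page, before the real spiders ever see the file.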
A useful tool for checking the syntax of your robots.txt file can be found here. While it will help you correct syntactical errors in the robots.txt file, it won't help you correct logical errors, for which you will still need to go through the robots.txt file thoroughly, as mentioned above.
Article by Sumantra Roy. Sumantra is one of the most respected and recognized search engine positioning specialists on the Internet. For more articles on search engine placement, subscribe to his 1st Search Ranking Newsletter by going to: http://the-easy-way.com/newsletter.html