Along with identifying individual robots and counting the number of their visits, statistics can also reveal aggressive, bandwidth-grabbing robots or robots you may not want visiting your website. In the resources section at the end of this article, you will find sites that list the names and IP addresses of search engine robots to help you identify them.
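As a rough illustration of how such statistics can be gathered, the sketch below counts visits and bytes served per robot from raw access-log lines. It assumes the server writes the common "combined" log format, and the ROBOT_HINTS substrings and the summarize_robots name are illustrative choices, not a definitive list of robot signatures.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical substrings that often appear in robot user-agent strings.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")


def summarize_robots(lines):
    """Return (visits, bandwidth) Counters keyed by robot user-agent."""
    visits = Counter()
    bandwidth = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent")
        if not any(hint in agent.lower() for hint in ROBOT_HINTS):
            continue  # looks like an ordinary browser
        visits[agent] += 1
        if match.group("bytes") != "-":
            bandwidth[agent] += int(match.group("bytes"))
    return visits, bandwidth
```

Sorting the bandwidth counter with bandwidth.most_common() puts the heaviest robots first, which is one way to spot the aggressive ones. User-agent strings can be faked, so checking the IP address against a published list remains worthwhile.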
How Do They Read The Pages On Your Website?
When a search engine robot visits your page, it looks at the visible text on the page, the content of various tags in your page's source code (title tag, meta tags, etc.), and the hyperlinks on your page. From the words and links that the robot finds, the search engine decides what your page is about. Many factors are used to figure out what "matters," and each search engine has its own algorithm for evaluating and processing the information. Depending on how the robot is set up by the search engine, the information is indexed and then delivered to the search engine's database.
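To make the reading step concrete, here is a minimal sketch that pulls the same pieces out of a page: the title, named meta tags, hyperlinks, and visible words. It uses Python's standard html.parser module. The PageReader class and read_page function are hypothetical names, and real search engine robots are far more elaborate than this.

```python
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Collects the title, meta tags, links, and visible words of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}    # meta name -> content
        self.links = []   # href values of <a> tags
        self.words = []   # visible text, split into words
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            self.words.extend(data.split())


def read_page(html):
    """Parse an HTML string and return the populated PageReader."""
    reader = PageReader()
    reader.feed(html)
    reader.close()
    return reader
```

Running read_page on your own page source gives a quick view of what text and links are available to a robot. How a search engine weighs those pieces is up to its own algorithm and is not modeled here.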
The information delivered to the databases then becomes part of the search engine and directory ranking process. When a search engine visitor submits a query, the search engine digs through its database to produce the final listing that is displayed on the results page.
The search engine databases update at varying times. Once you are in the search engine databases, robots keep visiting you periodically to pick up any changes to your pages and to make sure they have the latest information. How often you are visited depends on how each search engine schedules its visits, which varies from one search engine to another.
Sometimes visiting robots are unable to access the website they are visiting. If your site is down, or you are experiencing huge amounts of traffic, the robot may not be able to access your site. When this happens, the website may not be re-indexed until a later visit, depending on how frequently the robot visits your website. In most cases, robots that cannot access your pages will try again later, in the hope that your site will be accessible then.
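The try-again-later behavior can be sketched as a simple retry loop. This is only an illustration under stated assumptions: fetch_with_retries is a hypothetical name, the fetch function is supplied by the caller, and the number of attempts and the one-hour wait are made-up values, since each search engine sets its own schedule.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, wait_seconds=3600, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, pausing between failures.

    Returns the fetched page, or None if every attempt failed.
    The default values are illustrative, not taken from any real robot.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError:
            # Covers connection errors and timeouts (both are OSError subclasses).
            if attempt < attempts:
                sleep(wait_seconds)  # wait, then try again later
    return None  # give up for now; the page is not re-indexed on this visit
```

Passing fetch and sleep in as parameters keeps the sketch independent of any particular HTTP library and makes it easy to try out without waiting.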
Resources
SpiderSpotting - Search Engine Watch
http://searchenginewatch.com/webmasters/spiders.html
Robotstxt.org
List of robots and protocols for setting up a robots.txt file. http://www.robotstxt.org/
Spider-Food
Tutorials, forums and articles about Search Engine spiders and Search Engine Marketing. http://spider-food.net/
Spiderhunter.com
Articles and resources about tracking Search Engine spiders. http://www.spiderhunter.com/
Sim Spider Search Engine Robot Simulator
Search Engine World has a spider that simulates what Search Engine robots read from your website. http://www.searchengineworld.com/cgi-bin/sim_spider.cgi
Daria Goetsch is the founder and Search Engine Marketing Consultant for Search Innovation Marketing (www.searchinnovation.com), a Search Engine Promotion company serving small businesses. She has specialized in search engine optimization since 1998, including three years as the Search Engine Specialist for O'Reilly & Associates, a technical book publishing company.