Along with identifying individual robots and counting the number of their visits, the statistics can also show you aggressive bandwidth-grabbing robots or robots you may not want visiting your website. In the resources section at the end of this article, you will find sites that list names and IP addresses of search engine robots to help you identify them.
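As a rough illustration of the kind of statistics described above, here is a minimal Python sketch that counts visits and bytes served per robot from server log lines. It assumes the log is in the Apache "combined" format; the sample lines, the addresses, the "ExampleGrabber" agent, and the list of robot hints are all invented for the example and are not taken from any real log.

```python
import re
from collections import defaultdict

# Hypothetical sample lines in Apache "combined" log format. The addresses,
# paths, sizes, and the "ExampleGrabber" agent are invented for illustration.
SAMPLE_LOG = """\
192.0.2.10 - - [10/Oct/2004:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
192.0.2.10 - - [10/Oct/2004:13:55:40 -0700] "GET /about.html HTTP/1.1" 200 2048 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
198.51.100.7 - - [10/Oct/2004:14:01:02 -0700] "GET /big.zip HTTP/1.1" 200 900000 "-" "ExampleGrabber/0.1"
203.0.113.5 - - [10/Oct/2004:14:02:11 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"
"""

# One combined-format line: host, two ignored fields, timestamp, request,
# status, response size, referrer, user agent.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"$'
)

# Assumption: a user agent containing one of these words is treated as a robot.
ROBOT_HINTS = ("bot", "spider", "crawler", "grabber")


def robot_stats(log_text):
    """Return {user_agent: {"visits": n, "bytes": n}} for robot-like agents."""
    stats = defaultdict(lambda: {"visits": 0, "bytes": 0})
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent")
        if not any(hint in agent.lower() for hint in ROBOT_HINTS):
            continue  # looks like an ordinary browser
        size = match.group("size")
        stats[agent]["visits"] += 1
        stats[agent]["bytes"] += 0 if size == "-" else int(size)
    return dict(stats)


if __name__ == "__main__":
    for agent, totals in robot_stats(SAMPLE_LOG).items():
        print(agent, totals["visits"], "visits,", totals["bytes"], "bytes")
```

A robot with few visits but a very large byte total is a candidate for the "bandwidth-grabbing" category. Matching on words in the user agent is only a heuristic; the robot lists in the resources section are a more reliable way to identify a visitor.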
How Do They Read The Pages On Your Website?
When a search engine robot visits your page, it looks at the visible text on the page, the content of the various tags in your page's source code (title tag, meta tags, etc.), and the hyperlinks on your page. From the words and the links that the robot finds, the search engine decides what your page is about. There are many factors used to figure out what "matters", and each search engine has its own algorithm to evaluate and process the information. Depending on how the robot is set up through the search engine, the information is indexed and then delivered to the search engine's database.
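To make the reading step more concrete, the following Python sketch pulls out the same pieces a robot looks at: the title tag, named meta tags, hyperlinks, and visible text. It uses the standard library's html.parser module. The PageReader class, the read_page helper, and the sample page are hypothetical names and data made up for this illustration; real robots are far more elaborate, and each search engine does this its own way.

```python
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Collects the title, named meta tags, link targets, and visible text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # greater than zero while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif not self._skip_depth:
            self.text_parts.append(text)  # text a visitor would actually see


# Invented sample page for the example.
SAMPLE_PAGE = """<html><head>
<title>Garden Tools</title>
<meta name="description" content="Hand tools for small gardens.">
<style>body { color: black; }</style>
</head><body>
<h1>Garden Tools</h1>
<p>We sell spades and <a href="/rakes.html">rakes</a>.</p>
</body></html>"""


def read_page(html):
    """Return what the sketch 'robot' found on one page."""
    reader = PageReader()
    reader.feed(html)
    reader.close()
    return {
        "title": reader.title,
        "meta": reader.meta,
        "links": reader.links,
        "text": " ".join(reader.text_parts),
    }


if __name__ == "__main__":
    print(read_page(SAMPLE_PAGE))
```

The links found this way are what a robot would follow to discover further pages, while the title, meta content, and visible text are the raw material for deciding what the page is about.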
The information delivered to the databases then becomes part of the search engine and directory ranking process. When a search engine visitor submits their query, the search engine digs through its database to give the final listing that is displayed on the results page.
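The lookup described above can be pictured with a toy "database" that maps each word to the pages containing it. This Python sketch is only an assumption about the general idea, not how any particular search engine stores or ranks its data; the build_index and search functions and the sample pages are hypothetical, and the result is a plain alphabetical match list with no ranking.

```python
from collections import defaultdict

PUNCTUATION = '.,!?";:()'


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for raw_word in text.lower().split():
            word = raw_word.strip(PUNCTUATION)
            if word:
                index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs of pages that contain every word in the query."""
    words = [w.strip(PUNCTUATION) for w in query.lower().split()]
    words = [w for w in words if w]
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


# Invented sample pages standing in for text delivered by a robot.
PAGES = {
    "/spades.html": "Steel spades for digging.",
    "/rakes.html": "Steel rakes for leaves.",
    "/about.html": "About our garden shop.",
}

if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, "steel rakes"))
```

A real search engine adds its own ranking on top of this matching step, which is where the many factors mentioned earlier come in.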
The search engine databases update at varying times. Once you are in the search engine databases, the robots keep visiting you periodically to pick up any changes to your pages and to make sure they have the latest info. The number of times you are visited depends on how the search engine sets up its visits, which can vary per search engine.
Sometimes visiting robots are unable to access the website they are visiting. If your site is down, or you are experiencing huge amounts of traffic, the robot may not be able to access your site. When this happens, the website may not be re-indexed, depending on the frequency of the robot visits to your website. In most cases, robots that cannot access your pages will try again later, hoping that your site will be accessible then.
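The "try again later" behavior can be sketched in a few lines of Python. This is a simplified assumption about how a robot might retry, not a description of any real robot: crawl_with_retries and make_flaky_fetch are hypothetical helpers, the fetch function is passed in so the example needs no network access, and the waiting period a real robot would leave between attempts is omitted.

```python
def crawl_with_retries(fetch, url, max_attempts=3):
    """Call fetch(url) up to max_attempts times.

    Returns (content, attempts_used). Content is None if every attempt failed,
    in which case the page would simply not be re-indexed on this visit.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url), attempt
        except OSError:
            # Site down or overloaded. A real robot would wait here, possibly
            # for hours or days, before trying again.
            continue
    return None, max_attempts


def make_flaky_fetch(failures):
    """Build a stand-in fetch function that fails a set number of times."""
    state = {"left": failures}

    def fetch(url):
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("site unavailable")
        return "<html>ok</html>"

    return fetch


if __name__ == "__main__":
    print(crawl_with_retries(make_flaky_fetch(2), "http://example.com/"))
    print(crawl_with_retries(make_flaky_fetch(5), "http://example.com/"))
```

In the first call the site recovers on the third attempt and the page is read; in the second the robot gives up for now, which corresponds to a missed re-indexing until its next scheduled visit.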
Resources
SpiderSpotting - Search Engine Watch
http://searchenginewatch.com/webmasters/spiders.html
Robotstxt.org
List of robots and protocols for setting up a robots.txt file. http://www.robotstxt.org/
Spider-Food
Tutorials, forums and articles about Search Engine spiders and Search Engine Marketing. http://spider-food.net/
Spiderhunter.com
Articles and resources about tracking Search Engine spiders. http://www.spiderhunter.com/
Sim Spider Search Engine Robot Simulator
Search Engine World has a spider that simulates what the Search Engine robots read from your website. http://www.searchengineworld.com/cgi-bin/sim_spider.cgi

Daria Goetsch is the founder and Search Engine Marketing Consultant for Search Innovation Marketing (www.searchinnovation.com), a Search Engine Promotion company serving small businesses. She has specialized in search engine optimization since 1998, including three years as the Search Engine Specialist for O'Reilly & Associates, a technical book publishing company.