Working with the robots.txt file

Written by Jagdeep.S.Pannu



Change file names:

Change the stats filename from index.htm to something different, such as stats-new.htm, so that your stats URL becomes www.domain.com/stats/stats-new.htm.

Place a simple text file containing the text "Sorry, you are not authorized to view this page" in your /stats/ directory and save it as index.htm.

This way the snooper cannot guess your actual filename and get to your banned content.

Use login passwords: Password-protect the sensitive content listed in your robots.txt file.

Optimization of the robots.txt file:

The right commands in robots.txt: Use correct commands. The most common errors include putting the value meant for the "User-agent" field in the "Disallow" field, and vice versa. Please also note that the original standard has no "Allow" command: content not blocked in a "Disallow" field is considered allowed. The original standard recognizes only two fields, "User-agent" and "Disallow". Experts are considering the addition of more robot-recognizable commands to make the robots.txt file more Webmaster and robot friendly.

Bad syntax: Do not put multiple paths in one Disallow line in the robots.txt file. Use a new Disallow line for every directory that you want to block access to. Incorrect robots.txt example:

User-agent: *
Disallow: /concepts/ /links/ /images/

Correct robots.txt example:

User-agent: *
Disallow: /concepts/
Disallow: /links/
Disallow: /images/
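The difference between the two examples can be checked with a short script. This is only a sketch using the robots.txt parser from Python's standard library; the robot name "examplebot" and the test paths are made up for illustration, and other robots may read a malformed line differently.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, path: str, agent: str = "examplebot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

# All three directories crammed into one Disallow line (incorrect).
incorrect = """\
User-agent: *
Disallow: /concepts/ /links/ /images/
"""

# One Disallow line per directory (correct).
correct = """\
User-agent: *
Disallow: /concepts/
Disallow: /links/
Disallow: /images/
"""

for label, text in (("incorrect", incorrect), ("correct", correct)):
    # "/links/page.html" is a hypothetical page inside a blocked directory.
    print(label, "allows /links/page.html:", is_allowed(text, "/links/page.html"))
```

With this particular parser, the single-line version is read as one long path, so /links/ is not blocked at all, while the corrected file blocks it as intended.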

Files and directories: If a specific file has to be disallowed, end the path with the file extension and without a forward slash at the end. Study the following robots.txt examples:

For a file:

User-agent: *
Disallow: /hilltop.html

For a directory:

User-agent: *
Disallow: /concepts/

Remember, if you have to block access to all files in the directory, you don't have to specify each and every file in robots.txt. You can simply block the directory as shown above. Another common error is leaving out the slashes altogether. This would send a very different message than intended.
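To see why a missing trailing slash changes the message, here is a small sketch using Python's standard-library parser, which treats each Disallow value as a path prefix. The file names /concepts.html and /concepts/intro.html are hypothetical, and not every robot is guaranteed to match paths the same way.

```python
from urllib.robotparser import RobotFileParser

def blocked_paths(disallow_value: str, paths: list[str]) -> list[str]:
    """Return the paths that a single Disallow rule blocks for all robots."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_value}"])
    return [p for p in paths if not parser.can_fetch("examplebot", p)]

# Hypothetical site layout used only for this illustration.
site = ["/concepts/intro.html", "/concepts.html", "/links/index.html"]

# With the trailing slash, only the directory's contents are blocked.
with_slash = blocked_paths("/concepts/", site)

# Without it, every path that merely starts with "/concepts" is blocked too.
without_slash = blocked_paths("/concepts", site)

print("Disallow: /concepts/ blocks", with_slash)
print("Disallow: /concepts  blocks", without_slash)
```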

The right location for the robots.txt file: No robot will access a badly placed robots.txt file. Make sure that the location is www.domain.com/robots.txt, that is, the root directory of the site.
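As a quick illustration of where robots look, the sketch below derives the robots.txt address from any page address on the same site. The function name is an assumption made for this example, and the URL reuses the placeholder domain from this article.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site hosting `page_url`."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_txt_url("http://www.domain.com/stats/stats-new.htm"))
```

A robots.txt file saved anywhere else, for example inside /stats/, would not be at the address this function returns and so would not be read.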

Capitalization in robots.txt: Never write your syntax commands in all capitals. Directory and file names are case sensitive on Unix platforms, so type them exactly as they appear on the server. The only capitals used per the standard are the initial letters of "User-agent" and "Disallow".

Correct order for robots.txt: If you want to block access to all robots except one or a few, then the specific ones should be mentioned first. Let's study this robots.txt example:

User-agent: *
Disallow: /

User-agent: googlebot
Disallow:

In the above case, a robot that stops at the first record that applies to it, Googlebot included, could simply leave the site without indexing it after reading the first command. The correct syntax is:

User-agent: googlebot
Disallow:

User-agent: *
Disallow: /
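The intended effect of the corrected file can be confirmed with a short check. This sketch uses Python's standard-library parser; "otherbot" and the page path are invented for the example. Note that this parser looks for a robot's own record before falling back to "*", so it does not reproduce the ordering problem itself; it only verifies what the corrected file is supposed to mean.

```python
from urllib.robotparser import RobotFileParser

# The corrected file: the specific robot first, then everyone else.
corrected = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(corrected.splitlines())

# "/hilltop.html" is just a sample page on the site.
googlebot_ok = parser.can_fetch("googlebot", "/hilltop.html")
otherbot_ok = parser.can_fetch("otherbot", "/hilltop.html")

print("googlebot allowed:", googlebot_ok)
print("otherbot allowed:", otherbot_ok)
```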

The robots.txt file: Not having a robots.txt file at all could generate a 404 error for search engine robots, which could redirect the robot to the default 404 error page or your customized 404 error page. If this happens seamlessly, it is up to the robot to decide whether the target file is a robots.txt file or an HTML file. Typically it would not cause many problems, but you may not want to risk it. It is always better to put the standard robots.txt file in the root directory than to have none at all.

The standard robots.txt file for allowing all robots to index all pages is:

User-agent: *
Disallow:

Using # carefully in the robots.txt file: Adding comments with "#" on the same line as a syntax command is not a good idea. Some robots might misinterpret the line, although it is acceptable as per the robots exclusion standard. Separate lines are always preferred for comments.
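If an existing file already has comments at the end of command lines, they can be moved onto their own lines mechanically. The following is a minimal sketch, not a full robots.txt processor; the function name and the sample file are assumptions made for this example.

```python
def move_comments_to_own_lines(robots_txt: str) -> str:
    """Put every trailing '#' comment on its own line above its command."""
    output = []
    for line in robots_txt.splitlines():
        rule, hash_mark, comment = line.partition("#")
        if hash_mark and rule.strip():
            # Command followed by a comment: emit the comment first.
            output.append("#" + comment)
            output.append(rule.rstrip())
        else:
            # Blank line, plain command, or a comment already on its own line.
            output.append(line)
    return "\n".join(output)

sample = "User-agent: * # all robots\nDisallow: /stats/ # private area\n# end of file"
cleaned = move_comments_to_own_lines(sample)
print(cleaned)
```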

Using the robots.txt file:

Robots are configured to read text. Too much graphic content could render your pages invisible to the search engine. Use the robots.txt file to block irrelevant and graphic-only content.

Indiscriminate access to all files, it is believed, can dilute the relevance of your site content once it has been indexed by robots. This could seriously affect your site's ranking with search engines. Use the robots.txt file to direct robots to content relevant to your site's theme by blocking the irrelevant files or directories.

The robots.txt file can be used on multilingual websites to direct robots to the relevant content for each language. It ultimately helps the search engines present relevant results for specific languages. It also helps the search engine in its advanced search options where language is a variable.

Some robots could cause severe server load problems by rapidly firing too many requests at peak hours. This could affect your business. By excluding in the robots.txt file some robots that might be irrelevant to your site, this problem can be taken care of. It is really not a good idea to let malevolent robots use up precious bandwidth to harvest your emails, images etc.
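Excluding particular robots means giving each of them its own record with "Disallow: /" and leaving the site open to everyone else. The sketch below generates such a file and checks it with Python's standard-library parser. The robot names "emailharvester" and "imagegrabber" are hypothetical, and the file only affects robots that choose to honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

def build_robots_txt(blocked_agents: list[str]) -> str:
    """Build a robots.txt that bans the named robots and allows all others."""
    # Specific robots come first, one record each, fully disallowed.
    records = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    # The catch-all record comes last and blocks nothing.
    records.append("User-agent: *\nDisallow:")
    return "\n\n".join(records) + "\n"

# Hypothetical robot names, used only for illustration.
robots_txt = build_robots_txt(["emailharvester", "imagegrabber"])
print(robots_txt)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
harvester_ok = parser.can_fetch("emailharvester", "/contact.html")
googlebot_ok = parser.can_fetch("googlebot", "/contact.html")
```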

Use the robots.txt file to block out folders with sensitive information, text content, demo areas or content yet to be approved by your editors before it goes live. The robots.txt file is an effective tool to address certain issues regarding website ranking. Used in conjunction with other SEO strategies, it can significantly enhance a website's presence on the net.

Related Reading:

A Standard for Robots Exclusion.

Guide to The Robots Exclusion Protocol

W3C Recommendations

Article last updated : 11th March 2004

(c) Copyright 2004 Jagdeep.S.Pannu, SEORank

This article is copyright protected. If you have comments, or would like to have this article republished on your site, please contact the author here: SEO Articles Feedback. We just require all due credits carried, and text, hyperlinks and headers unaltered. This article must not be used in unsolicited mail.

Jagdeep.S.Pannu is Manager-Online Marketing, at www.SEORank.com, a leading Search Engine Optimization Services Company.


The Basics of Search Engine Optimization (S.E.O.)

Written by Jeff McIntire-Strasburg, Ph.D.



It’s also important to use those keywords and phrases in the copy of the page, as spiders look at this too. Note, for instance, how often I use phrases like “search engine optimization” and “search engine” in this article. That’s deliberate. Keep in mind, though, that search engines have caught on to the practice of loading down copy with keywords to achieve higher rankings, and a high ratio of keywords to overall copy may actually hurt your placement.
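The ratio of keywords to overall copy is easy to measure for your own pages. Here is a rough sketch: the function name, the word-splitting rule, and the sample sentence are assumptions for illustration, and no particular threshold is implied, since search engines do not publish one.

```python
import re

def keyword_density(copy: str, phrase: str) -> float:
    """Return the share of words in `copy` that belong to occurrences of `phrase`."""
    words = re.findall(r"[a-z0-9']+", copy.lower())
    target = phrase.lower().split()
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears as consecutive words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)

sample = "Search engine optimization helps a search engine find your page"
density = keyword_density(sample, "search engine")
print(f"{density:.0%} of the words belong to the phrase")
```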

The third place to use keywords is in the site’s meta tags, which are part of the coding for the page. Again, it’s best to use only the words that will bring the best results; repetition or overuse of variations can hurt your optimization.

Finally, some consultants will suggest that regardless of your site’s content, you always include the most popular overall keywords in your title and meta tags. While this may bring more initial traffic to your site, you have to consider whether attracting searches on “Eminem” or “Spiderman” will benefit you in any way. Generally, it’s best to optimize your site so that you attract those searchers who would actually be interested in what you have to offer.

Link popularity

Another criterion that many search engines use for ranking is link popularity. Essentially, is your site linked from other sites, and are those sites ranked well in their engines? Building links can be a time-consuming process for a webmaster, as getting links on other pages generally involves contacting the owner of that page and asking for a listing or agreeing to a “link swap”: you put a link on your page to their site, and they do the same for you. Again, don’t believe that you can fool the search engines through shortcut methods such as FFA pages, as the engine administrators have caught on to these as well. Also keep in mind that links on other reputable pages account for a high percentage of traffic. You really can’t lose by having your site listed on other sites’ “Links” pages.

Some Final Thoughts

When optimizing your site for the search engines, it’s also important to remember that a high ranking in a search doesn’t necessarily mean more qualified traffic. As in many aspects of life, presentation is everything. If the listing on the search engine appears clear and professional, you’re more likely to receive more of the traffic that will benefit you. Boutin notes that the meta name field in the coding is important in this regard, as some engines use the information in this field for their site description. While you want to use keywords and phrases in this field, do so in a manner that will still make sense to a surfer looking for a site like yours.

Ultimately, search engine optimization involves using tried and true methods of design and writing to make your site user-friendly.

More information:

Boutin, Paul. “Search Engine Optimization FREE” http://hotwired.lycos.com/webmonkey/01/23/index1a.html

Jeff McIntire-Strasburg, Ph.D., is an English professor and freelance business writer. You may contact Jeff at mcintirj@lincolnu.edu

