Search Engine Cloaking - To Cloak or Not to Cloak
By Sumantra Roy
Cloaking can broadly be defined as a technique for delivering different versions of a web page depending on who is requesting it. There are two primary reasons people use page cloaking:
i) It allows them to create a separate optimized page for each search engine and another page which is aesthetically pleasing and designed for their human visitors. When a search engine spider visits the site, the page which has been optimized for that search engine is delivered to it. When a human visits the site, the page which was designed for human visitors is shown. The primary benefit of doing this is that human visitors never need to see the pages which have been optimized for the search engines, since those pages may not be aesthetically pleasing and may contain an over-repetition of keywords.
ii) It allows them to hide the source code of the optimized pages they have created, which prevents their competitors from copying it.
Page cloaking is implemented using specialized cloaking scripts. A cloaking script is installed on the server, where it detects whether it is a search engine or a human being that is requesting a page. If a search engine is requesting the page, the cloaking script delivers the page which has been optimized for that search engine. If a human being is requesting the page, the cloaking script delivers the page which was designed for humans.
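To make this concrete, here is a minimal sketch in Python of the dispatch logic such a script performs. The is_search_engine() check is left as a placeholder that the two detection methods described below would fill in, and the file names optimized_page.html and human_page.html are illustrative assumptions, not part of any particular cloaking product:

    # Minimal sketch of a cloaking script's dispatch logic (illustrative only).
    # is_search_engine() is a placeholder; the two detection methods
    # discussed below show how it might be implemented.

    def is_search_engine(request_headers, remote_ip):
        raise NotImplementedError  # filled in by one of the methods below

    def serve_page(request_headers, remote_ip):
        if is_search_engine(request_headers, remote_ip):
            # Deliver the page optimized for the search engine spider.
            with open("optimized_page.html") as f:
                return f.read()
        # Otherwise deliver the page designed for human visitors.
        with open("human_page.html") as f:
            return f.read()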
There are two primary ways by which the script can detect whether a search engine or a human being is visiting the site:
i) The first and simplest way is to check the User-Agent variable. Each time anyone (be it a search engine spider or a browser operated by a human) requests a page from a site, it reports a User-Agent name to the site. Generally, if a search engine spider requests a page, the User-Agent variable contains the name of that search engine. Hence, if the cloaking script detects the name of a search engine in the User-Agent variable, it delivers the page which has been optimized for that search engine. If it does not detect the name of a search engine, it assumes that the request has been made by a human being and delivers the page which was designed for human visitors.
However, while this is the simplest way to implement a cloaking script, it is also the least safe. The User-Agent variable is easy to fake, so anyone who wants to see the optimized pages being delivered to the different search engines can do so without much difficulty, as the sketch below illustrates.
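Here is one way the is_search_engine() placeholder from the earlier sketch might be filled in for this method, assuming a small hand-maintained tuple of spider names (the names shown are illustrative examples of historical spiders, not an exhaustive list), followed by a demonstration of how easily the header can be faked:

    import urllib.request

    # User-Agent based detection: match the reported User-Agent string
    # against a list of known spider names. The names are examples only.
    SPIDER_NAMES = ("googlebot", "slurp", "scooter")

    def is_search_engine(request_headers, remote_ip):
        user_agent = request_headers.get("User-Agent", "").lower()
        return any(name in user_agent for name in SPIDER_NAMES)

    # The weakness: a client can report any User-Agent it likes. A
    # competitor could fetch the page while pretending to be a spider:
    spoofed = urllib.request.Request(
        "http://example.com/",  # placeholder URL
        headers={"User-Agent": "Googlebot/2.1"},  # faked spider name
    )
    # urllib.request.urlopen(spoofed) would then receive the page that
    # was meant only for the search engine.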
ii) The second and more complicated way is to use I.P. (Internet Protocol) based cloaking. This involves the use of an I.P. database which contains the I.P. addresses of all known search engine spiders. When a visitor (a search engine or a human) requests a page, the cloaking script checks the visitor's I.P. address. If the address is present in the I.P. database, the script knows that the visitor is a search engine and delivers the page optimized for that search engine. If the address is not present in the database, the script assumes that a human has requested the page and delivers the page meant for human visitors.
Although more complicated than User-Agent based cloaking, I.P. based cloaking is more reliable and safer, because I.P. addresses are far more difficult to fake.
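A corresponding sketch of the I.P.-based check, again filling in the is_search_engine() placeholder, assuming the spider addresses are kept in a simple lookup table. The address ranges below are reserved documentation addresses used purely for illustration; a real cloaking database would map many addresses and ranges to specific search engines:

    import ipaddress

    # I.P. based detection: look the visitor's address up in a database
    # of known spider addresses. These networks are illustrative only.
    SPIDER_NETWORKS = {
        ipaddress.ip_network("192.0.2.0/24"): "ExampleEngine",
        ipaddress.ip_network("198.51.100.0/24"): "AnotherEngine",
    }

    def is_search_engine(request_headers, remote_ip):
        addr = ipaddress.ip_address(remote_ip)
        return any(addr in network for network in SPIDER_NETWORKS)

    def engine_for(remote_ip):
        # Return the name of the engine the address belongs to, or None,
        # so the script can pick the page optimized for that engine.
        addr = ipaddress.ip_address(remote_ip)
        for network, name in SPIDER_NETWORKS.items():
            if addr in network:
                return name
        return None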