Search Engine Cloaking - To Cloak or Not to Cloak
By Sumantra Roy
Cloaking can broadly be defined as a technique used to deliver different web pages under different circumstances. There are two primary reasons that people use page cloaking:
i) It allows them to create a separate optimized page for each search engine and another page which is aesthetically pleasing and designed for their human visitors. When a search engine spider visits a site, the page which has been optimized for that search engine is delivered to it. When a human visits the site, the page which was designed for human visitors is shown. The primary benefit of doing this is that human visitors don't need to be shown the pages which have been optimized for the search engines, since those pages may not be aesthetically pleasing and may contain an over-repetition of keywords.
ii) It allows them to hide the source code of the optimized pages that they have created, and hence prevents their competitors from being able to copy that source code.
Page cloaking is implemented by using specialized cloaking scripts. A cloaking script is installed on the server, and it detects whether it is a search engine or a human being that is requesting a page. If a search engine is requesting a page, the cloaking script delivers the page which has been optimized for that search engine. If a human being is requesting the page, the cloaking script delivers the page which has been designed for humans.
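To make the flow concrete, here is a minimal sketch of that dispatch logic in Python, using only the standard library's http.server module. The is_search_engine() check is a stub standing in for the two detection methods described below, and the page contents are placeholders, not a real implementation:

# A minimal sketch of a cloaking script's dispatch logic, using only
# Python's standard library. The detection check is a stub here; the
# two real detection methods are described below.
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder pages: one stuffed with keywords for spiders, one for people.
OPTIMIZED_PAGE = b"<html><body>keyword, keyword, keyword ...</body></html>"
HUMAN_PAGE = b"<html><body>A nicely designed page for visitors.</body></html>"

def is_search_engine(handler):
    # Stub: replace with the User-Agent or I.P. based check shown below.
    return False

class CloakingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the optimized page to spiders, the designed page to humans.
        body = OPTIMIZED_PAGE if is_search_engine(self) else HUMAN_PAGE
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), CloakingHandler).serve_forever()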
There are two primary ways by which the script can detect whether a search engine or a human being is visiting a site:
i) The first and simplest way is by checking the User-Agent variable. Each time anyone (be it a search engine spider or a browser being operated by a human) requests a page from a site, it reports a User-Agent name to the site. Generally, if a search engine spider requests a page, the User-Agent variable contains the name of the search engine. Hence, if the cloaking script detects that the User-Agent variable contains the name of a search engine, it delivers the page which has been optimized for that search engine. If the cloaking script does not detect the name of a search engine in the User-Agent variable, it assumes that the request has been made by a human being and delivers the page which was designed for human beings.
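A sketch of how such a User-Agent check might look, plugging into the handler from the earlier sketch. The spider names listed are just a few illustrative examples; a real cloaking script would keep a much longer, regularly updated list:

# Sketch of User-Agent based detection. The names below are illustrative
# examples of spider names; a real script keeps a far longer list.
SPIDER_NAMES = ("googlebot", "slurp", "bingbot", "baiduspider")

def is_search_engine(handler):
    # The requester reports its name in the User-Agent header.
    user_agent = handler.headers.get("User-Agent", "").lower()
    return any(name in user_agent for name in SPIDER_NAMES)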
However, while this is the simplest way to implement a cloaking script, it is also the least safe. It is pretty easy to fake the User-Agent variable, and hence someone who wants to see the optimized pages that are being delivered to the different search engines can easily do so.
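To illustrate how little effort that faking takes, this hypothetical snippet requests a page while claiming to be a well-known spider, using Python's urllib (example.com stands in for the cloaked site):

from urllib.request import Request, urlopen

# Claim to be Googlebot simply by setting the User-Agent header.
req = Request("http://example.com/", headers={"User-Agent": "Googlebot/2.1"})
print(urlopen(req).read()[:200])  # a User-Agent based script now serves its spider page

Anyone running something like this sees exactly what the cloaking script delivers to that spider.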
ii) The second and more complicated way is to use I.P. (Internet Protocol) based cloaking. This involves the use of an I.P. database which contains a list of the I.P. addresses of all known search engine spiders. When a visitor (a search engine or a human) requests a page, the cloaking script checks the I.P. address of the visitor. If the I.P. address is present in the I.P. database, the cloaking script knows that the visitor is a search engine and delivers the page optimized for that search engine. If the I.P. address is not present in the I.P. database, the cloaking script assumes that a human has requested the page, and delivers the page which is meant for human visitors.
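A sketch of the I.P.-based check, again fitting the earlier handler. The addresses shown are reserved documentation addresses standing in for real spider I.P.s, which an actual script would load from its I.P. database and keep current:

# Sketch of I.P. based detection. These are reserved documentation
# addresses (RFC 5737) standing in for a real spider I.P. database.
SPIDER_IPS = {"192.0.2.10", "192.0.2.11", "198.51.100.7"}

def is_search_engine(handler):
    # client_address is the (host, port) pair of the requesting visitor.
    visitor_ip = handler.client_address[0]
    return visitor_ip in SPIDER_IPS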