As many of you may realize, the debate over serving Web spiders unique content (aka "cloaking") has been a hot topic at SitePoint.com, and the forums have been crammed with posts. Some people support the practice, others shun it, and many warn of the damage it can do to your reputation with the search engines (banning is a common recourse against sites that "cloak").
I’m going to show you some advanced algorithms I’ve used to cloak my own pages in ASP, which will hopefully be of interest to both people who don’t believe in cloaking, and those who do.
Traditional cloaking methods involve detecting the USER_AGENT and serving an alternative page to the requester based on this value. You can obtain a list of common User Agents here. The page you serve would of course be rich in META Tags and the words you wish to optimize your site for. The code for achieving this is as follows, and should appear at the top of your page:
<%
Dim spiderCheck
spiderCheck = Request.ServerVariables("HTTP_USER_AGENT")
' Note: redirects in ASP are done with Response.Redirect (there is no Server.Redirect)
If spiderCheck = "Googlebot" Then Response.Redirect("spiderrichpage.asp")
%>
Those of you who are familiar with ASP will be able to extend this code to redirect when it encounters any of the known spider names. You can find a detailed list of names here.
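For example, one way to extend the check to several spider names is with a Select Case statement (a sketch only; the extra names here are placeholders for whatever you pull from that list):

<%
Dim spiderCheck
spiderCheck = Request.ServerVariables("HTTP_USER_AGENT")
' Redirect any of several known spider names to the spider page
Select Case spiderCheck
  Case "Googlebot", "Scooter", "Lycos"
    Response.Redirect("spiderrichpage.asp")
End Select
%>

This still relies on an exact match against the User Agent string, a limitation we'll deal with shortly.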
What do my Visitors See?
As all of this code runs on the server side, your visitors won't see any of it. The spiders will go in one direction, and your visitors will see the page the code resides on. However, remember how we sent the spider to spiderrichpage.asp? That page contains meta-tags and content that relate to your site. It is the page the spider will index and store in its database; therefore, it is also the page that will appear in the search results.
As we don’t want your potential visitors to see an ugly META Tag page containing spider-centric content, we also have to put code on that page to send your visitors to the page you want them to see.
So, at the top of spiderrichpage.asp we insert:
<%
Dim spiderCheck
spiderCheck = Request.ServerVariables("HTTP_USER_AGENT")
If spiderCheck <> "Googlebot" Then Response.Redirect("visitorpage.asp")
%>
Ok, so we’ve served up a page for a particular spider name. But what if the search engines are fighting back? Suppose they try to fool our scripts by using spider names which are nearly identical to the user agents used to identify browsers. Anyone who maintains Web logs will recognize the following user agents:
Mozilla/5.0 (compatible; MSIE 6.0; Windows 2000)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MSIECrawler)
Which Arachnid is Which?
Spot the spider? Yes, that's right! The one with MSIECrawler in its User Agent is the spider. Many spiders stay nice and close to the browser names so they don't get spotted and redirected. How can you combat this? Well, ASP has a handy function already built into the language to help us out: InStr. With it, we can detect whether words such as CRAWLER, BOT, or SPIDER appear anywhere within the User Agent. Another problem with my original code is that it only detects one spider at a time (unless you extend it with If...ElseIf statements). I'm now going to attempt to kill two spiders with one can of fly spray! The Dictionary object used below can easily be added to as more spider names appear in your site logs:
<%
Sub AddViolation(objDict, strWord)
  ' Adds a violation (a robot name, in this case)
  objDict.Add strWord, False
End Sub

Function CheckStringForViolations(strString, objDict)
  ' Determines if the string strString contains any violations
  Dim bolViolations
  bolViolations = False
  Dim strKey
  For Each strKey In objDict
    If InStr(1, strString, strKey, vbTextCompare) > 0 Then
      bolViolations = True
      objDict(strKey) = True
    End If
  Next
  CheckStringForViolations = bolViolations
End Function

Dim objDictViolations
Set objDictViolations = Server.CreateObject("Scripting.Dictionary")
AddViolation objDictViolations, "Googlebot"
AddViolation objDictViolations, "Lycos"
AddViolation objDictViolations, "Ultraseek"
AddViolation objDictViolations, "Sidewinder"
AddViolation objDictViolations, "InfoSeek"
AddViolation objDictViolations, "Scooter"
AddViolation objDictViolations, "WebCrawler"
AddViolation objDictViolations, "UTV"

Dim strCheck
strCheck = Request.ServerVariables("HTTP_USER_AGENT")
If Len(strCheck) > 0 Then
  If CheckStringForViolations(strCheck, objDictViolations) Then
    Response.Redirect("spiderrichpage.asp")
  Else
    Response.Redirect("userpage.asp")
  End If
End If
%>
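If all you need is the substring check itself, a stripped-down version of the same idea looks like this (a sketch: "crawler" is just an example substring, and spiderrichpage.asp is the spider page from earlier):

<%
Dim strUA
strUA = Request.ServerVariables("HTTP_USER_AGENT")
' InStr returns the position of the first match, or 0 if there is none;
' vbTextCompare makes the comparison case-insensitive
If InStr(1, strUA, "crawler", vbTextCompare) > 0 Then
  Response.Redirect("spiderrichpage.asp")
End If
%>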
Now we have a way to add spiders as they come along, and to serve them up a special page. You advanced coders out there will be able to send different pages for different spiders -- I'll leave that as a learning exercise.
Innovative Algorithm
This next algorithm is quite interesting. As all Web designers and developers know, META Tags still play a small role in some search engine placement systems. Well, what if we could change our meta-tags to match whatever visitors are searching for? Yes, this is possible. We can use the HTTP_REFERER server variable (note: the variable name really is spelled with a single "r") to determine what a visitor arriving from a search engine was looking for, and then write those terms into both our meta-tags and, if we wanted to, our main content.
<%
If InStr(Request.ServerVariables("HTTP_REFERER"), "google") > 0 Then
  KeyURL = Request.ServerVariables("HTTP_REFERER")
  ' e.g. KeyURL = "http://www.google.com/search?hl=en&ie=ISO-8859-1&q=spider+food+for+fun&btnG=Google%20Search"

  ' Remove everything up to and including "q="
  KeyLen = Len(KeyURL)
  kStart = InStr(KeyURL, "q=")
  kStart = kStart + 1
  KeyRight = KeyLen - kStart
  Keyword = Right(KeyURL, KeyRight)

  ' Check for a trailing query string and remove it
  If InStr(Keyword, "&") > 0 Then
    kEnd = InStr(Keyword, "&")
    kEnd = kEnd - 1
    Keyword = Left(Keyword, kEnd)
  End If

  ' Turn the URL encoding back into a text phrase
  Keyword = Replace(Keyword, "+", " ")
  Keyword = Replace(Keyword, ",", ", ")
  Keyword = "," & Keyword
End If
%>
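Once Keyword has been set, it can be dropped straight into the page. For example (a sketch; the existing keyword list in the tag is made up), the captured phrase could be appended to the keywords meta-tag:

<meta name="keywords" content="web design, site promotion<%= Keyword %>">

Because the code above prefixes Keyword with a comma, it tacks neatly onto the end of whatever list is already in the tag.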
How do I use this?
You can now write this keyword into your content using Response.Write(Keyword). This could be in the space for your meta-tags, or it could be in an HTML layer hidden off the page.
Tips and Tricks
The code provided in this article can easily be modified, and with a bit of creativity you could design some interesting algorithms. You could modify it to serve different pages for each spider, and perhaps keep track of them in a database, or design a statistics system for the Web community that displays the frequency of spider visits to your site. There are many great promotional tools for optimizing the pages you plan to serve up to the spiders. The links used in this article are listed below, along with a few extras that may be of interest to you in optimizing pages.