Antezeta Web Marketing

Reflections on search engine optimization, web analytics and web marketing

Tracking Search Engine Cache Page Views with Web Analytics

by sean

A small percentage of search engine users view a web site through the search engine’s saved copy of its pages, the cached version. The cached copy served to the user usually contains links to embedded objects on the original site: images, CSS stylesheets, JavaScript, etc. Organizations focusing on web marketing activities, such as search engine optimization, will want to track all search engine activity, including cached page views.

Because the browser fetches those embedded objects from the original site, requests whose referrer is the search engine’s cached copy show up in the site’s web server log files, including the keywords and keyword phrases used to find the cached copy. In some cases, the user clicks through from the cached copy to the original website; the web server then logs a real page view whose referrer is the cache URL.
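As a rough illustration of where these referrers appear, the sketch below pulls the referrer out of a single log line and checks whether it points at a cache URL. It assumes the server writes the Apache Combined Log Format; the sample line (visitor IP, timestamp, requested image, user agent) and the function name `cache_referrer` are made up for the example.

```python
import re

# Combined Log Format: host ident user [time] "request" status bytes "referrer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def cache_referrer(line):
    """Return the referrer if the log line was referred by a cache URL, else None."""
    match = LOG_LINE.match(line)
    if match and "q=cache:" in match.group("referrer"):
        return match.group("referrer")
    return None

# Hypothetical log line: everything except the referrer URL is invented.
sample = (
    '192.0.2.10 - - [24/Jun/2006:10:15:32 +0200] "GET /images/logo.png HTTP/1.1" 200 2326 '
    '"http://64.233.179.104/search?q=cache:l5D4yOKeZaYJ:www.antezeta.com/'
    'search-engines-site-localization-duplicate-content.html+google+dialect&hl=en&ct=clnk&cd=9" '
    '"Mozilla/5.0"'
)
print(cache_referrer(sample))
```

Note that the logged request here is for an image, not a page: a cached view may only be visible through the embedded objects it loads from the original site.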

Cache views are more difficult for Web Analytics software to recognize, but it can be done.

A Web Analytics tool must dissect the search engine referring URL as in this example:

http://64.233.179.104/search?q=cache:l5D4yOKeZaYJ:www.antezeta.com/search-engines-site-localization-duplicate-content.html+google+dialect&hl=en&ct=clnk&cd=9

Item                                                     Description
http://64.233.179.104/                                   A known Google IP address.
search                                                   The Google Service. Others you may see include translate_c
q=cache:l5D4yOKeZaYJ:                                    Indicates a query, made to an item in cache. The cache ID is a 12 character alphanumeric string.
www.antezeta.com                                         Domain containing item matching query terms
search-engines-site-localization-duplicate-content.html  Object matching query terms (html page, pdf…)
google dialect                                           Query words entered by user
hl=en                                                    Google Interface Human Language code (English)
ct=clnk                                                  Not needed
cd=9                                                     Not needed
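
The breakdown above can be sketched in a few lines of Python using the standard `urllib.parse` module. This is only an illustration, not the AWStats implementation (AWStats is written in Perl). The function name `parse_cache_referrer` and the returned field names are invented for the example, and the pattern assumes the cache ID is always 12 alphanumeric characters, as described above.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout of the q parameter: cache:<12-char id>:<domain>/<page> <search words>
CACHE_QUERY = re.compile(
    r"^cache:(?P<cache_id>[A-Za-z0-9]{12}):(?P<target>\S+)\s*(?P<words>.*)$"
)

def parse_cache_referrer(referrer):
    """Split a Google cache referrer into its parts, or return None if it does not match."""
    parts = urlsplit(referrer)
    query = parse_qs(parts.query)              # parse_qs turns '+' into spaces
    q = query.get("q", [""])[0]
    match = CACHE_QUERY.match(q)
    if match is None:
        return None
    domain, _, page = match.group("target").partition("/")
    return {
        "host": parts.netloc,                  # e.g. a known Google IP address
        "service": parts.path.lstrip("/"),     # "search", "translate_c", ...
        "cache_id": match.group("cache_id"),
        "domain": domain,
        "page": page,
        "keywords": match.group("words").split(),
        "language": query.get("hl", [""])[0],
    }

example = (
    "http://64.233.179.104/search?q=cache:l5D4yOKeZaYJ:www.antezeta.com/"
    "search-engines-site-localization-duplicate-content.html+google+dialect"
    "&hl=en&ct=clnk&cd=9"
)
print(parse_cache_referrer(example))
```

The `ct` and `cd` parameters are simply ignored, in line with the table.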

In some cases, a user may view a search engine’s cached copy of a page without entering search words in a search engine. How? Through a search engine browser toolbar. Such a referrer will look like this example:

http://72.14.207.104/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2005-50,GGLG:en&q=cache:http%3A%2F%2Fwww.antezeta.com%2Fawstats.html
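
A toolbar referrer of this kind carries a percent-encoded page URL after `cache:` and no search words. A minimal Python sketch of how it might be recognized follows; the function name `parse_toolbar_cache_referrer` and the returned field names are invented for the example, and treating any `cache:http…` value as a toolbar-style request is an assumption based only on the sample above.

```python
from urllib.parse import urlsplit, parse_qs

def parse_toolbar_cache_referrer(referrer):
    """Return the cached page URL and source id from a toolbar-style referrer, or None."""
    query = parse_qs(urlsplit(referrer).query)   # also decodes %3A, %2F and similar escapes
    q = query.get("q", [""])[0]
    # Assumption: a full URL directly after "cache:" means no search words were entered.
    if not q.startswith("cache:http"):
        return None
    return {
        "cached_url": q[len("cache:"):],
        "source": query.get("sourceid", [""])[0],
    }

example = (
    "http://72.14.207.104/search?sourceid=navclient&ie=UTF-8"
    "&rls=GGLG,GGLG:2005-50,GGLG:en"
    "&q=cache:http%3A%2F%2Fwww.antezeta.com%2Fawstats.html"
)
print(parse_toolbar_cache_referrer(example))
```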

We have added logic to the Search Engine Recognition Module of the AWStats Web Analytics application to better recognize search engine cache query terms, page views and click-throughs to a site.

  1. The list of Google service IPs has been extended. To do: find a definitive list.
  2. Logic to parse search keywords has been introduced. It currently only works for Google cache IDs without numbers; the main AWStats program will probably have to be modified to recognize alphanumeric cache IDs.
  3. Google Translate traffic is currently included in Google Cache traffic. Ideally, this would be separated out. Again, this appears to require a change to the main AWStats program.
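
To illustrate the separation described in the last point, here is a small Python sketch that classifies a referrer by its service path. It is not the AWStats code, which is written in Perl. The function name, the category labels and the `KNOWN_GOOGLE_IPS` set are invented for the example, and the set holds only the two addresses quoted in this article, not a definitive list.

```python
from urllib.parse import urlsplit, parse_qs

# Only the two addresses quoted in this article; a real list would be much longer.
KNOWN_GOOGLE_IPS = {"64.233.179.104", "72.14.207.104"}

def classify_google_referrer(referrer):
    """Label a referrer as Google cache, Google translate, other Google, or other."""
    parts = urlsplit(referrer)
    if parts.hostname not in KNOWN_GOOGLE_IPS:
        return "other"
    service = parts.path.strip("/")
    if service == "translate_c":
        return "google-translate"
    q = parse_qs(parts.query).get("q", [""])[0]
    if service == "search" and q.startswith("cache:"):
        return "google-cache"
    return "google-other"

print(classify_google_referrer(
    "http://64.233.179.104/search?q=cache:l5D4yOKeZaYJ:www.antezeta.com/+google+dialect"
))
```

Keeping the two categories apart at classification time would let a report count cache views and translated views separately.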

Yahoo!, Ask and MSN

Originally published June 24th, 2006

