Sebastian's Pamphlets

If you've read my articles somewhere on the Internet, expect something different here.


Saturday, July 28, 2007

Analyzing search engine rankings by human traffic

Recently I've discussed ranking checkers in several places, and I'm quite astonished that folks still see some value in ranking reports. Frankly, ranking reports are --in most cases-- a useless waste of paper and/or disk space. That does not mean that SERP positions per keyword phrase aren't interesting. They're just useless without context, that is, traffic data. Converting traffic pays the bills, not rankings alone. The truth is in your traffic data.

That said, I'd like to outline a method to extract one particularly useful piece of information from raw traffic data: underestimated search terms. That's not a new idea, and perhaps you have such reports already, but maybe you don't look at the information that is somewhat hidden in stats ordered by success, not failure. And you should be --or employ-- a programmer to implement it.


The first step is gathering data. Create a database table to record all hits; then, in a footer include or the like, once the complete page has been output, write all the data you have into that table. All data means URL, timestamp, and variables like referrer, user agent, IP, language and so on. Be a data rat, log everything you can get hold of. With dynamic sites it's easy to add the page title, (product) IDs etcetera; with static sites, write a tool to capture these attributes separately.
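
To make this concrete, here is a minimal sketch in Python with SQLite (table and column names are my own, purely for illustration; back then you'd more likely use PHP and MySQL, but the idea is the same):

import sqlite3
import time

# Raw log table: just a primary key and no other indexes, so inserts stay cheap.
RAW_SCHEMA = """
CREATE TABLE IF NOT EXISTS raw_hits (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    ts         INTEGER NOT NULL,   -- Unix timestamp of the request
    url        TEXT NOT NULL,      -- requested URL
    referrer   TEXT,               -- HTTP_REFERER, if any
    user_agent TEXT,
    ip         TEXT,
    language   TEXT                -- Accept-Language header
);
"""

def log_hit(conn, url, referrer=None, user_agent=None, ip=None, language=None):
    """Record one page view; call this from the footer include,
    after the complete page went out."""
    conn.execute(
        "INSERT INTO raw_hits (ts, url, referrer, user_agent, ip, language) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (int(time.time()), url, referrer, user_agent, ip, language))
    conn.commit()

conn = sqlite3.connect("traffic.db")
conn.executescript(RAW_SCHEMA)
log_hit(conn, "/some-page", referrer="http://www.google.com/search?q=keyword1+keyword2")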

For performance reasons it makes sense to work with a raw data table, which has just a primary key, to log the requests, plus normalized working tables with lots of indexes to allow aggregations, ad hoc queries, and fast reports from different perspectives. Also think of regularly purging the raw log table, and of historization. While transferring raw log data to the working tables during low-traffic hours, or on another machine, you can calculate interesting attributes and add data from other sources that were not available to the logging process.
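
A rough sketch of that transfer job, again with hypothetical table names (the parse_referrer() helper used here is sketched further below):

import sqlite3

# Working table: heavily indexed, fed from raw_hits during low-traffic hours.
WORKING_SCHEMA = """
CREATE TABLE IF NOT EXISTS hits (
    id            INTEGER PRIMARY KEY,
    ts            INTEGER NOT NULL,
    url           TEXT NOT NULL,
    referrer      TEXT,
    search_engine TEXT,   -- derived during transfer; NULL for non-SERP hits
    search_term   TEXT,
    serp_number   REAL    -- REAL, so averaged pages like 3.5 fit
);
CREATE INDEX IF NOT EXISTS idx_hits_engine ON hits (search_engine, serp_number);
CREATE INDEX IF NOT EXISTS idx_hits_term   ON hits (search_term);
"""

def transfer_raw_hits(conn, parse_referrer):
    """Move raw rows into the working table, deriving attributes on the way,
    then purge the raw log (archive it first if you need the history)."""
    conn.executescript(WORKING_SCHEMA)
    rows = conn.execute("SELECT id, ts, url, referrer FROM raw_hits").fetchall()
    for row_id, ts, url, referrer in rows:
        engine, term, serp = parse_referrer(referrer) if referrer else (None, None, None)
        conn.execute(
            "INSERT OR IGNORE INTO hits "
            "(id, ts, url, referrer, search_engine, search_term, serp_number) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (row_id, ts, url, referrer, engine, term, serp))
    conn.execute("DELETE FROM raw_hits")
    conn.commit()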

You'll need that traffic data collector anyway for a gazillion purposes where your analytics software fails, is not precise enough, or just can't deliver a particular evaluation perspective. It's a prerequisite for the method discussed here, but don't build a monster-sized cannon to chase a fly. You can gather search engine referrer data from logfiles too.


For example, one interesting piece of information is on which SERP the user clicked the link pointing to your site. Simplified, you need three attributes in your working tables to store this info: search engine, search term, and SERP number. You can extract these values from the HTTP_REFERER.

http://www.google.com/search?q=keyword1+keyword2~
&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

1. "google" in the server name tells you the search engine.
2. The "q" variable's value tells you the search term "keyword1 keyword2".
3. The lack of a "start" variable tells you that the result was placed on the first SERP. The lack of a "num" variable lets you assume that the user got 10 results per SERP, so it's quite safe to say that you rank in the top 10 for this term. Actually, the number of results per page is not always extractable from the URL because it's usually pulled from a cookie, but not that many surfers change their preferences (e.g. fewer than 0.5% surf with 100 results, according to JohnMu as well as my own data). If you've got a "num" value, then add 1 and divide the result by 10 to make the data comparable. If that's not precise enough you'll spot it afterwards, and you can always recalculate SERP numbers from the canned referrer.

http://www.google.co.uk/search?q=keyword1+keyword2~
&hl=en&start=10&sa=N

1. and 2. as above.
3. The "start" variable's value 10 tells you that you got a hit from the second SERP. When start=10 and there is no "num" variable, most probably the searcher got 10 results per page.

http://www.google.es/search?q=keyword1+keyword2~
&rls=com.microsoft:*&ie=UTF-8&oe=UTF-8&startIndex=~
&startPage=1

1. and 2. as above.
3. The empty "startIndex" variable and startPage=1 are useless, but the lack of "start" and "num" tells you that you've got a hit from the first Spanish SERP.

http://www.google.ca/search?q=keyword1+keyword2~
&hl=en&rls=GGGL,GGGL:2006-30,GGGL:en&start=20~
&num=20&sa=N

1. and 2. as above.
3. num=20 tells you that the searcher views 20 results per page, and start=20 indicates the second SERP, so you rank between #21 and #40; thus the (averaged) SERP# is 3.5 (provided you don't store SERP# as an integer in your database).

You get the idea. There is a cheat sheet and official documentation on Google's URL parameters. Analyze the URLs in your referrer logs and call them with cookies off, which disables your personal search preferences, then play with the values. Do that with other search engines too.
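
Put into code, a minimal parser for the rules above might look like this (Google-style URLs only; treating a missing "num" as 10 per page and averaging odd page sizes onto 10-per-page SERPs is one possible reading of the normalization described above):

from urllib.parse import urlparse, parse_qs
import math

def parse_referrer(referrer):
    """Extract (search engine, search term, SERP number) from a SERP referrer.
    Returns (None, None, None) for non-search referrers."""
    parsed = urlparse(referrer)
    if "google." not in parsed.netloc.lower():
        return (None, None, None)   # only Google handled in this sketch
    params = parse_qs(parsed.query)
    term = params.get("q", [None])[0]
    if term is None:
        return (None, None, None)
    # "start" = offset of the first result on the clicked page (missing = 0).
    start = int(params.get("start", ["0"])[0])
    # "num" = results per page; missing usually means the default of 10.
    num = int(params.get("num", ["10"])[0])
    if num == 10:
        serp = start / 10 + 1       # e.g. start=10 -> SERP 2
    else:
        # Normalize to 10-per-page SERPs and average over the page's positions,
        # e.g. start=20&num=20 covers #21-#40 -> SERPs 3 and 4 -> 3.5.
        first_serp = math.ceil((start + 1) / 10)
        last_serp = math.ceil((start + num) / 10)
        serp = (first_serp + last_serp) / 2
    return ("google", term, serp)

print(parse_referrer("http://www.google.ca/search?q=keyword1+keyword2&start=20&num=20"))
# -> ('google', 'keyword1 keyword2', 3.5)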


Now a subset of your traffic data has a value in "search engine". Aggregate the tuples where the search engine is not NULL, then select, for example, the results where the SERP number is lower than or equal to 3.99 (respectively 4), ordered by SERP number ascending, hits ascending, and keyword phrase, with a break by search engine. (Why not sorted by traffic descending? Because you have a report of your best performing keywords already.)
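
Against the hypothetical working table sketched above, that report could read:

REPORT_SQL = """
SELECT search_engine,
       search_term,
       AVG(serp_number) AS serp,
       COUNT(*)         AS hits
FROM   hits
WHERE  search_engine IS NOT NULL
GROUP  BY search_engine, search_term
HAVING AVG(serp_number) <= 3.99
ORDER  BY search_engine, serp ASC, hits ASC, search_term;
"""

def underestimated_terms(conn):
    """One row per (engine, term): average SERP number and hit count,
    weakest traffic first within each position."""
    return conn.execute(REPORT_SQL).fetchall()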

The result is a list of search terms you rank for on the first 4 SERPs, beginning with the keywords you've probably not optimized for. At the very least you didn't optimize the snippet to improve CTR, so the ranking doesn't generate a reasonable amount of traffic. Before you study the report, throw away your site owner hat and try to think like a consumer. Consumers sometimes use a vocabulary you didn't think of before.

Research promising keywords, and decide whether you want to push, bury, or ignore them. Why bury? Well, in some cases you just don't want to rank for a particular search term, [your product sucks] being just one example. If the ranking is fine, the search term smells somewhat lucrative, and just the snippet sucks in a particular search query's context, then enhance your SERP listing.

Every once in a while you'll discover a search term making a killing for your competitors which you never spotted because your stats package reports only the top 500 monthly referrers or so. Also, you'll get the most out of your rankings by optimizing their SERP CTRs.


Be creative; over time your traffic database becomes more and more valuable, allowing other unconventional and/or site-specific reports which off-the-shelf analytics software usually does not deliver. Most probably your competitors use standard analytics software, so individually developed algos and reports can make a difference. That does not mean you should throw away your analytics software to reinvent the wheel. However, once you're used to self-developed analytic tools, you'll think of more interesting methods (and not only for analyzing and monitoring rankings by human traffic) than you can implement in this century ;)


Bear in mind that the method outlined above does not and cannot replace serious keyword research.


Another --very popular-- approach to get this info would be automated ranking checks mashed up with hits by keyword phrase. Unfortunately, Google and other engines do not permit automated queries for the purpose of ranking checks, and this method works with preselected keywords, which means you won't find (all) search terms created by users. Even when you compile your ranking checker's keyword lists via various keyword research tools, your seed list will still miss some interesting keywords.


Related thoughts: Why regular and automated ranking checks are necessary when you operate seasonal sites by Donna



2 Comments:

  • At Thursday, August 09, 2007, Blogger Justin Goldberg said…

    You can use HitTail and save yourself a lot of pain hunting for those keywords.

  • At Monday, August 13, 2007, Anonymous Anonymous said…

    This is an excellent post, worth analyzing in depth. Impressive how much you can learn just by browsing the web and landing on ... your page. Thank you for sharing this.
