custom referrer tracking

i just finished working on a neat referrer tracking page that displays which pages people are visiting on my site–and from whence they came, even decoding the keywords of search engine queries. check it out!

how does referrer tracking work?

unless you’ve gone out of your way to disable this feature (usually for privacy reasons), every time you click a link on a webpage, your browser sends the url of that page to the webserver of the link you clicked, in an http request header named “referer”.

for example, if you click on a search result in google, the website you arrive at “knows” the url of the google query you came from–which contains the keywords you typed into google. there is even a blog plugin which highlights these keywords to make it easier for visitors to find what they were searching for.

in order to track the sites (like google) that refer visitors to your site, a service like extreme tracking takes advantage of the fact that javascript has access to the referrer url of the current webpage. it encodes this url as a query string in what becomes a separate request for a single pixel transparent gif located on extreme tracking’s servers.
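as a rough sketch of that trick (this is not extreme tracking’s actual code; the tracker url and the parameter names ref and page are made up for illustration), the javascript side might look something like this:

```javascript
// Build the URL of a tracking-pixel request that carries the referrer
// and the current page as query-string parameters.
// The parameter names "ref" and "page" are hypothetical.
function buildPixelUrl(base, referrer, page) {
  return base +
    "?ref=" + encodeURIComponent(referrer) +
    "&page=" + encodeURIComponent(page);
}

// In a browser, setting the image source is what fires the request.
// "https://tracker.example/hit.gif" is a placeholder address.
if (typeof document !== "undefined") {
  var img = new Image();
  img.src = buildPixelUrl(
    "https://tracker.example/hit.gif",
    document.referrer,
    location.href
  );
}
```

encodeURIComponent matters here: the referrer is itself a url with its own ? and & characters, so it has to be escaped before it can ride along as a single parameter.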

they happily provide the gif image while logging the encoded referrer url and other details of the “hit” in their database. they use this information primarily to provide a nice, free interface (with those “lucrative” google ads), detailing information about the last 20 referrers as well as cumulative hit counts over various intervals.

if you’re interested or curious to know what keywords people might be typing into google in order to find your website, that referrer information is golden. however, once you start getting more than 20 hits a day, extreme tracking’s free service loses its luster.

so what did you do exactly?

a while ago i wrote a custom apache logging program that stored the raw http request information in a database, but it turns out there are these evil referrer spamming programs (e.g. Reffy) which send fake requests to webservers with referrer urls for various viagra and texas hold’em sites. so all the data i collected was littered with hundreds of bogus requests. that was a dead end.

as it turns out, extreme tracking’s use of javascript means that hits are only logged when a webpage is rendered in a web browser, causing that little 1 pixel gif to be requested. so i modeled my second attempt on that principle. i put a line of javascript in all of my webpages which triggers an image request that happens to be a php script that logs the referrer information and returns a single pixel gif. easy as pie.
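the real script is php, but the logging half can be sketched in javascript (node) too. this assumes the hypothetical ref and page query parameters, leaves the actual database write out, and uses a widely circulated base64 string for a 1x1 transparent gif:

```javascript
// A 1x1 transparent GIF, base64-encoded (42 bytes once decoded).
const PIXEL_BASE64 =
  "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7";

// Turn the query string of a pixel request into a log record plus the
// response to send back. Storing the record is left to the caller.
// The parameter names "ref" and "page" are hypothetical.
function handleHit(queryString, now) {
  const params = new URLSearchParams(queryString);
  const record = {
    time: now.toISOString(),
    referrer: params.get("ref") || "",
    page: params.get("page") || "",
  };
  return {
    record: record,
    // no-store asks browsers not to cache the pixel, so repeat
    // visits still produce a request.
    headers: { "Content-Type": "image/gif", "Cache-Control": "no-store" },
    body: Buffer.from(PIXEL_BASE64, "base64"),
  };
}
```

a web server would call handleHit with the request’s query string, save the record somewhere, and write the headers and body back as the response.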

then i wrote a little bit of php to decode the search engine query strings (including google’s image search!) so that i could see what search terms were bringing people to which pages on my site. eventually i may add filters and most requested lists, but right now i’m just enjoying reloading the referrer page to get a real time picture of the requests coming to my site.
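the decoding step boils down to pulling one parameter out of the referrer url. here is a minimal javascript sketch of the idea (the original is php; the list of engines and parameter names below is an assumption for illustration, and image search referrers are not handled):

```javascript
// Host fragments of search engines and the query parameter that holds
// the search terms. This short list is illustrative, not complete.
const SEARCH_PARAMS = [
  { host: "google.", param: "q" },
  { host: "search.yahoo.", param: "p" },
];

// Return the search terms found in a referrer URL, or null if the
// referrer is empty, malformed, or not from a known search engine.
function searchTerms(referrer) {
  let url;
  try {
    url = new URL(referrer);
  } catch (err) {
    return null;
  }
  const engine = SEARCH_PARAMS.find((entry) =>
    url.hostname.includes(entry.host)
  );
  if (!engine) return null;
  // searchParams.get() decodes both %XX escapes and "+" as a space.
  const terms = url.searchParams.get(engine.param);
  return terms ? terms.trim() : null;
}
```

for example, searchTerms("http://www.google.com/search?q=ulnar+nerve&hl=en") gives back "ulnar nerve", while a referrer from an ordinary site gives back null.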

can i see the code?

sure! well, actually it’s in pretty rough shape. but why not?
15-Feb-2005 v0.1 custom referrer tracking

note to windows users: first save the tar.gz file to your desktop, then extract with winzip to view the files.


11 Comments

jackie

justinsomnia: the ultimate source for piggyback rides, ulnar nerve and chan marshall

Yah really, what’s with all the Chan Marshall?

corey

didn’t you spell referer wrong in this context?
http://en.wikipedia.org/wiki/Referer

re: chan marshall, a while ago i linked to a picture on another site of chan, and because google’s image search seems to update its index less frequently, they show that image, but refer to me as the “source.”

corey, damn, you’re probably right. i didn’t think to look it up in wikipedia. i just looked it up in the dictionary, found that “referer” was a misspelling, and went with referrer.

Nice. You’re too smart for me. I use Shaun Inman’s ShortStat. See it in use at mistersugar.com/shortstat.

Bill

Very interesting to say the least. I just got interested after I noticed how jib jab was tracking my movement.

Thanks for the insight.

This looks wonderful but appears a tad scary for me to try implementing. I wish there was a nice easy plugin available to make a referrers page because it’s interesting to see what folks are searching for. But hey, I’m a poet – the fact that I do anything technical at all is a mystery to my peers! >;-)

Thanks for the code. I’ve added this to my site as well – it is a nice simple way to track referrers.

I was playing around with this code, and made some minor adjustments: I use a DHTML slider to set the range of days over which to track, and I allow you to choose to get a count of the unique referrers or requests. Only the referrer php code has changed. You can see it in action at http://fugutabetai.com/referrer2.php and you can get the code from http://fugutabetai.com/referrer2.txt (putting the code up like that probably isn’t a good idea, since I don’t like how the database.php is included in the file so obviously, but whatever.) I don’t believe there are any possibilities for remote exploits since I make sure that the only user-settable parameter is a number. Anyway, thanks for the great JS code and the nice base for some fun referrer tracking. This does everything that I want for logging and is a hell of a lot easier than using a stats package for log parsing.

fugutabetai, you’re welcome. thanks for posting the links to your improvements.

man this is old code. I haven’t really even looked at it in ages, though it’s been running on my site for ages (since feb 05). and I still use it to track traffic stats and referrers. hmm, that means I have 2 years of data. sounds like a project to me.

The problem I have is that there are so many types of search engines (which happen to be the referrers I’m least interested in) that filtering them out is a never-ending process.

Just want to say thanks for the referrer tracking script. Was a great basis to start my work from!

Thx man!
