Sebastian's Pamphlets

If you've read my articles somewhere on the Internet, expect something different here.

MOVED TO SEBASTIANS-PAMPHLETS.COM

Please click the link above to read current posts; this archive will disappear soon!

Stay tuned...

Thursday, May 17, 2007

When your referrer stats turn into a porn TGP

If you wonder why your top referrers are porn galleries, make-you-rich-in-a-second scams, and other pages that don't carry your link but try to sell you something, read on.

Referrer spamming is done by bots that request pages from your site and leave a bogus HTTP_REFERER. These spam bots come from various IPs, change their user agents on the fly, and use other sneaky techniques to slip through spam protection. Some of them are somewhat clever: they scale the number of bogus requests to your site's Alexa stats, so that their "visits" show up on limited realtime referrer lists and other by-referrer stats. Some of them even pull whole pages from your server, and a few even follow redirects.

So what can you do? Not much. You can't really get rid of these log entries, because the server logs every request regardless of how your spam protection answers it. But you can reduce the waste of bandwidth and server resources: if you redirect these requests, your server sends only a short redirect response instead of the page's contents. Here is a way to accomplish that:

First of all, extract the bogus referrers from your logs or stats pages, and save them in a plain text file. Reduce this to a list of domains, stripping subdomains like "www" or "galleries", and add the .htaccess code:

SetEnvIf Referer \.collegefuckfest\.com GoFuckYourself=1
SetEnvIf Referer \.asstraffic\.com GoFuckYourself=1
SetEnvIf Referer \.allinternal\.com GoFuckYourself=1
SetEnvIf Referer \.mature-lessons\.com GoFuckYourself=1
SetEnvIf Referer \.wildpass\.com GoFuckYourself=1
SetEnvIf Referer \.promote-biz\.net GoFuckYourself=1
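
If the list of bogus referrers gets long, the conversion into SetEnvIf lines can be scripted. What follows is only a minimal Python sketch and not part of the recipe itself: the function names, the STRIP_LABELS set, and the sample URLs are my own assumptions, and it expects full referrer URLs (with "http://") as they appear in a log file.

```python
from urllib.parse import urlsplit

# Hypothetical set of subdomain labels to strip; extend it to fit your own logs.
STRIP_LABELS = {"www", "galleries"}

def referrer_domain(referrer):
    """Return the referrer's host name with leading 'www'-style labels removed."""
    host = (urlsplit(referrer).hostname or "").lower()
    labels = host.split(".")
    # Strip only while at least a two-label domain would remain.
    while len(labels) > 2 and labels[0] in STRIP_LABELS:
        labels = labels[1:]
    return ".".join(labels)

def setenvif_lines(referrers, variable="GoFuckYourself"):
    """Build one SetEnvIf line per unique domain, in the format used above."""
    domains = sorted({referrer_domain(r) for r in referrers} - {""})
    lines = []
    for domain in domains:
        pattern = domain.replace(".", r"\.")  # escape the dots for the regex
        lines.append(f"SetEnvIf Referer \\.{pattern} {variable}=1")
    return lines

# Sample input: referrer URLs as they might be copied from a log file.
bogus = [
    "http://www.promote-biz.net/",
    "http://galleries.wildpass.com/x",
    "http://www.wildpass.com/",
]
for line in setenvif_lines(bogus):
    print(line)
```

Note that the leading "\." in each pattern requires a dot in front of the domain, so a referrer like http://wildpass.com/ without any subdomain is not matched. If such referrers show up in your logs, drop the leading "\." from the pattern.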


The SetEnvIf lines create an environment variable "GoFuckYourself" with the value "1" for every request whose referrer matches one of the patterns. The following statements can then work with these marked requests:

RewriteEngine On
# Act only on requests marked by the SetEnvIf lines
RewriteCond %{ENV:GoFuckYourself} =1
RewriteRule .* %{HTTP_REFERER} [R=301,L]


This redirects the request to its referrer, so if the bogus bot follows redirects, it will request a page from the spammer's domain. Of course you can redirect to a static URL too:
RewriteRule .* http://www.example.com/gofuckyourself [R=301,L]

You could also use the environment variable in deny statements:
order allow,deny
allow from all
deny from env=GoFuckYourself

However, that serves a complete error page, and may produce an infinite loop. Deny, as well as the similar RewriteRule .* - [F], enforces a 403 Forbidden. If you have an ErrorDocument 403 /getthefuckouttahere.html directive, the request for the error page runs into the 403 itself; this process calls itself over and over until it gets terminated after 20 or so loops.


1 Comment:

  • At Thursday, May 17, 2007, Blogger softplus said…

    Hi Sebastian
    You might also want to give eKstreme's CrawlerController a look, he covers things like this as well. http://ekstreme.com/phplabs/crawlercontroller.php
