Nobody likes spam. Mostly it is just annoying, but sometimes it can be harmful. Referrer spam in Google Analytics is annoying because it distorts the data, making reports less useful and harder to interpret. When you want to know how much real traffic your site is getting, referrer spam can be a real problem.
What is Referrer Spam?
Referrer spam is fake referral traffic recorded in your Google Analytics reports. This fake traffic is created by spambots, not by actual visitors to your site.
What is a Spambot?
Programs designed to perform repetitive tasks automatically are called bots. Bots whose purpose is to spam are called spambots. Spambots often access thousands of websites a day, sending HTTP requests with a fake referrer header. This fake header contains the URL of the website that the spammer wants to promote and build backlinks for. When this occurs, the URL is recorded in your server log. This benefits the spammer when a site publishes its logs or referrer statistics: search engine crawlers such as Googlebot can then treat the logged URL as a backlink, which in turn can affect search rankings.
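To see where these fake referrers end up, here is a minimal Python sketch that pulls the referrer out of server log lines and counts requests per referring host. It assumes the server writes the Apache Combined Log Format; the function name `referrer_hosts` and the sample log lines are made up for illustration and are not taken from any real log.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Tail of a Combined Log Format line:
#   "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referrer_hosts(log_lines):
    """Count requests per referrer host; lines without a referrer are skipped."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # not a Combined Log Format line
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # no referrer was sent with this request
        host = urlparse(referer).hostname
        if host:
            counts[host.lower()] += 1
    return counts


# Invented sample lines (assumption: Combined Log Format).
SAMPLE_LINES = [
    '203.0.113.7 - - [10/Oct/2015:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "http://semalt.com/" "Mozilla/5.0"',
    '203.0.113.8 - - [10/Oct/2015:13:55:40 +0000] "GET / HTTP/1.1" 200 2326 "http://www.semalt.com/" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2015:13:56:01 +0000] "GET /about HTTP/1.1" 200 1543 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for host, hits in referrer_hosts(SAMPLE_LINES).most_common():
        print(host, hits)
```

A host that appears in this count but that no real visitor could plausibly have come from is a candidate for closer inspection.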
Good Bots Versus Bad Bots
There are good bots and bad bots. The difference is in their purpose. Good bots have a constructive purpose, whereas bad bots usually have a destructive purpose.
- Good Bots. Not all bots are bad. Googlebot is a good bot that crawls and indexes web pages. Google uses it to go through vast numbers of web pages and determine what should show up on the search engine results pages. Bots like this are good because they can search through vast amounts of information and turn it into usable information.
- Bad Bots. Bots with nefarious purposes are bad bots. Some are used to commit fraud, harvest email addresses, scrape website content, spread malware, and artificially inflate website traffic. They can also create fake accounts, send spam emails, form a botnet, and bypass CAPTCHAs.
- Data Integrity. Unfortunately, all bots threaten data integrity, no matter what their official purpose. Bots that are able to execute JavaScript can skew Google Analytics reports, making the data in direct traffic, referral traffic, bounce rate, or conversion reports inaccurate and less helpful.
- Botnet. A botnet is a network of infected computers that can be used to access your website through hundreds of different IP addresses. This makes it difficult to block spam bots by blocking particular IP addresses or limiting the rate of traffic sent or received. The larger the botnet, the more IP addresses the bot can use to access your website and bypass a firewall or other security. In some cases these spam bots will not leave referrer headers, making it difficult to detect them.
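The botnet point can be illustrated with a small Python sketch of per-IP rate limiting. It is only a toy model under stated assumptions: the function name `ips_over_limit`, the limit of 100 requests, and the IP addresses (taken from documentation-only ranges) are all hypothetical.

```python
from collections import Counter


def ips_over_limit(request_ips, limit):
    """Return the set of IP addresses that sent more than `limit` requests."""
    counts = Counter(request_ips)
    return {ip for ip, hits in counts.items() if hits > limit}


# A single bot sending 500 requests from one address is easy to spot.
single_source = ["203.0.113.5"] * 500

# The same 500 requests spread over 250 botnet addresses, 2 requests each.
botnet = [f"198.51.100.{i}" for i in range(1, 251) for _ in range(2)]

if __name__ == "__main__":
    print(ips_over_limit(single_source, limit=100))  # the single address is flagged
    print(ips_over_limit(botnet, limit=100))         # nothing is flagged
```

Both lists contain the same number of requests, but only the single-source traffic crosses the per-IP limit, which is why a large botnet can slip past this kind of rule.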
Suspicious Referrals
It is bad enough when spambots attempt to harvest email addresses or skew your analytics report, but often their purpose is even more nefarious: they want to infect your computer with malware or make your computer part of a botnet. Never click on suspicious addresses in a referral report, because your computer could get infected with malware. Once a computer is part of a botnet, it can be used to forward spam, viruses, and malicious programs. Blocking an entire botnet can also block real users, since botnets are made up of real computers with real users.
Vulnerable Websites
Not all websites get attacked. Spam bots look for vulnerable websites and exploit them. To defend against them, get rid of your website’s weaknesses. Sites that are most vulnerable include those with:
- Cheap Shared Hosting Platform
- Custom CMS/Shopping Cart
If you find that you are getting attacked frequently, you may want to look into changing your hosting provider.
Steps to Get Rid of Spam
Spam is bad, but is there a way to get rid of it? There are several things you can do to protect your website and get rid of spam.
- Detect and Fix Referrer Spam. Run a referral report in Google Analytics and sort by bounce rate. Look for referrers with a 100% or 0% bounce rate and more than 10 sessions; these are most likely spam referrers. Confirm your suspicion by checking published lists of known spam referrers or, only after making sure your antivirus software is up to date, by visiting the site. If a referrer turns out to be spam, block it with a custom advanced filter. Do not exclude it with the Referral Exclusion List: that does not solve the problem, because the spam hits then simply show up as direct traffic instead. Filters only apply to data collected from then on; there is no way to remove spam hits from a Google Analytics report once they have been recorded.
- Block Referrer from Spambots. Go to your .htaccess file and add code to block the referrer address. For example, to block the website semalt.com:
RewriteEngine On
Options +FollowSymLinks
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com [NC]
RewriteRule .* - [F]
- Block IP Addresses used by Spambots. Go to your .htaccess file and add code to block the IP address. For example, to block the IP address 234.45.12.33:
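RewriteEngine On
Options +FollowSymLinks
# Order/Deny is Apache 2.2 syntax; on Apache 2.4 it needs mod_access_compat, or use "Require not ip" inside <RequireAll>.
Order Deny,Allow
Deny from 234.45.12.33
- Block IP Address Ranges used by Spambots. A single rule for a range of addresses replaces many individual rules, which keeps your .htaccess file shorter and easier to maintain than blocking each address separately. For example, Deny from 234.45.12.0/24 covers all 256 addresses from 234.45.12.0 to 234.45.12.255.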
- Block Rogue User Agents. Go to your .htaccess file and add code to block the rogue user agent. For example, to block Baiduspider:
RewriteEngine On
Options +FollowSymLinks
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC]
RewriteRule .* - [F,L]
- Use Google Analytics Bot Filtering Feature. Go to the reporting view settings and check "Exclude all hits from known bots and spiders".
- Monitor Server Logs. Check your logs once a week. If you can stop the bots there, you won’t have to exclude them from your Google Analytics reports.
- Use a Firewall. The firewall will protect you by putting a sort of filter between your server and the internet.
- Contact System Administrator. Talk with your system administrator whenever you find a new bot. This is the person who should be dealing with invasions from bad bots.
- Use Google Chrome. Google Chrome detects malware more quickly than other browsers.
- Use Custom Alerts in Google Analytics. You can set your alerts to let you know about unusual spikes of activity. This can help you stay aware of potential threats from bot activity.
- Get Penetration Testing. If you regularly get bot traffic skewing your reports, it may be in your best interest to have your website tested for vulnerabilities through what is called penetration testing.
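The detection rule from the first step in this list (a 100% or 0% bounce rate combined with more than 10 sessions) can be sketched in Python as follows. This is a rough illustration, not a Google Analytics API call: the `ReferralRow` type, the `suspicious_referrers` function, and the sample report rows are all hypothetical, and the figures would in practice come from an exported referral report.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    source: str         # referring domain as shown in the referral report
    sessions: int       # number of sessions attributed to the referrer
    bounce_rate: float  # bounce rate in percent, 0.0 to 100.0


def suspicious_referrers(rows, min_sessions=10):
    """Flag referrers with a 0% or 100% bounce rate and more than min_sessions sessions."""
    flagged = [
        row for row in rows
        if row.sessions > min_sessions and row.bounce_rate in (0.0, 100.0)
    ]
    # Largest offenders first, so they can be checked against spam lists first.
    return sorted(flagged, key=lambda row: row.sessions, reverse=True)


# Invented report rows for illustration only.
SAMPLE_REPORT = [
    ReferralRow("semalt.com", 120, 100.0),
    ReferralRow("example-partner.com", 45, 38.5),
    ReferralRow("spam-example.test", 14, 0.0),
    ReferralRow("small-blog.example", 3, 100.0),
]

if __name__ == "__main__":
    for row in suspicious_referrers(SAMPLE_REPORT):
        print(row.source, row.sessions, row.bounce_rate)
```

A flagged referrer is only a candidate: it still needs to be confirmed against a list of known spam referrers before it is filtered out.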
Protecting your website from invasion is an ongoing process that should be a priority. With these measures in place, your Google Analytics reports can be a helpful way to learn about real activity on your site.