How to Filter Referral Spam in Google Analytics
Apr 12, 2016|Read time: 7 min.
Key Points
- Bloated and fake traffic hurts metrics and accurate reporting.
- The Analytics filter fix is the easiest and most effective.
- Ghost spam is a nuisance and will need continual monitoring.
Have you ever noticed a big traffic spike in your Google Analytics (GA) account for referral traffic? Well, it was probably bot or ghost traffic spam, and it’s a huge pain because it throws off your metrics. In this post I’ll explain how to filter referral spam in Google Analytics so you can focus on the traffic that actually converts.
Ghost spam is an old problem
It’s not happening to just you. Ghost spam has been plaguing webmasters for some time. Sadly, Google has done little to find a solution, other than to provide a checkbox to filter known bots and spiders.
But that doesn’t get rid of ghost bots because they never actually visit or request any information from your website. The fake hits are sent directly to your GA account, so you can’t block them with .htaccess rules on Apache. Here’s a screenshot of what an unnatural spike might look like:
Referral spam is like comment spam; it’s a robotic shortcut that exists to generate traffic, leads, and much more with malicious intent.
How to Remove Referral Spam in Google Analytics
While it may not present immediate harm, don’t ignore it. Referral spam causes many problems, including traffic inflation, inaccurate bounce rates and conversions, and skewed traffic medium breakdowns. Bots that actually crawl your site also add server load. You have to fight back and preserve your analytics data. It’s the only way to draw accurate conclusions and make informed business decisions.
Preserve Historical Data & Set Up A Copied View
This is very important. First, you must create a separate view to use for your new filtered data. This will allow you to preserve your existing historical view and unfiltered data for cross referencing. In Google Analytics, go to: Admin > View Settings > Copy View. A screenshot is provided below for reference:
We can use that newly copied view to filter out referral spam and bot traffic. When GA prompts you to name the new view, you can name it anything you’d like, but I recommend including your website URL and something like “Ghost Referral Spam Exclusion” in the title.
Before we go any further, it’s important to remember that not all bots are bad. Googlebot and other spiders that crawl and index your website according to your robots.txt file are among the good guys. Bots are only bad when they are used for malicious purposes.
How to Find Ghost Referral Spam in Analytics
Toggle your date range to 6 months or more (inception-to-present is even better) and navigate to the following: Reporting > Acquisition > All Traffic > Referrals. In the chart below, look at the bounce rate and session duration for any referrer with more than 10-15 sessions. A bounce rate near 100% combined with zero time on site is an easy indicator of fake traffic. You’ll quickly see how this fake data can severely skew your analytics metrics, especially if you’re a smaller brand that doesn’t receive millions of sessions per month.
Below is an example of real ghost spam, called “traffic2cash.xyz.” Have you seen this one in your Google Analytics profile? It resulted in 260 fake sessions with a very high bounce rate (100%) and zero time onsite. This is just one of the culprits I found on my personal website.
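If you export the Referrals report, the same check can be scripted. The snippet below is only a minimal sketch of that heuristic: the ReferralRow structure, the threshold defaults, and every row except traffic2cash.xyz are hypothetical, and a flagged source still deserves a manual look before you treat it as spam.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    source: str
    sessions: int
    bounce_rate: float           # percent, 0-100
    avg_session_duration: float  # seconds


def flag_suspect_referrers(rows, min_sessions=10, bounce_threshold=100.0, max_duration=0.0):
    """Return sources with enough sessions, a very high bounce rate, and no time on site."""
    return [
        row.source
        for row in rows
        if row.sessions > min_sessions
        and row.bounce_rate >= bounce_threshold
        and row.avg_session_duration <= max_duration
    ]


rows = [
    ReferralRow("traffic2cash.xyz", 260, 100.0, 0.0),      # figures from the example above
    ReferralRow("partner-blog.example", 42, 55.0, 95.0),   # hypothetical real referrer
    ReferralRow("tiny-referrer.example", 3, 100.0, 0.0),   # hypothetical, too few sessions to judge
]

suspects = flag_suspect_referrers(rows)
print(suspects)  # ['traffic2cash.xyz']
```

The thresholds are deliberately strict. Loosening them (for example, a bounce threshold of 95) catches more spam but also raises the odds of flagging a real referrer.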
Why does ghost spam exist?
Malicious webmasters use ghost spam to generate traffic. They hope that you will discover the domain in your analytics dashboard and navigate to the site to see what it is.
In short, it’s virtually free traffic to a landing page that is designed strictly for lead generation.
To satisfy your curiosity without rewarding that site with undue traffic, here’s what you would find:
If you’re unsure about the domain, try a quick Google search like the following: “traffic2cash.xyz + spam.” Chances are, there’s lots of documented chatter if it’s indeed ghost spam.
Step by Step Hostname Filter Setup
The easiest and most effective way to exclude future fake traffic in Google Analytics is to set up a filtered view by hostname. Other tutorials may guide you through the process of manually excluding fake referral websites as you uncover them, but I don’t recommend the Referral Exclusion List method. It’s tedious and less effective because ghost spam websites and TLDs constantly change.
Almost all legitimate visits to your website should report your hostname. One common exception is a website like YouTube, where you have given the site permission to use your Google Analytics tracking ID. If you see Google.com in the list of hostnames, it’s not legitimate, believe it or not.
1. Identify Hostnames
In your filtered view, select your date range as far back as you can to include all of your historical data. Then take the following path to get to the list of hostnames: Reporting > Audience > Technology > Network. Then, filter the dimension by Hostname and you should see a similar view to the one below, which is from my personal website.
Position one (1) should be your hostname. The rest of these hostnames are ghost spam as I have never configured my Google Analytics tracking ID for any of them.
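If the hostname list is long, sorting it by hand gets tedious. Here is a rough sketch of the same triage in Python, assuming you have exported the hostnames from the report. The hostnames and the split_hostnames helper are made up for illustration; the only rule it applies is “a hostname is valid if I configured my tracking ID there.”

```python
def split_hostnames(reported, valid):
    """Split reported hostnames into ones you recognize and ones you don't."""
    valid_lower = {host.lower() for host in valid}
    kept, suspect = [], []
    for host in reported:
        if host.lower() in valid_lower:
            kept.append(host)
        else:
            suspect.append(host)
    return kept, suspect


# Hypothetical export from the Hostname dimension.
reported_hostnames = [
    "yourwebsite.com",
    "Google.com",          # not legitimate unless you set up tracking there
    "spam-host.example",
    "(not set)",
]

# Hostnames where you know your tracking ID is installed (assumed for this example).
valid_hostnames = ["yourwebsite.com", "youtube.com", "shopify.com"]

kept, suspect = split_hostnames(reported_hostnames, valid_hostnames)
print(kept)     # ['yourwebsite.com']
print(suspect)  # ['Google.com', 'spam-host.example', '(not set)']
```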
2. Create Your Filter Expression
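Now that you’ve identified the valid hostnames, it’s time to create your expression. I promise it’s much simpler than it sounds. All you have to do is separate each valid hostname with a bar (|), but do not start or end with one. The expression is a regular expression, so don’t put spaces around the bars, because a space would be treated as part of the hostname. Here’s an example: “yourwebsite.com|youtube.com|shopify.com”. For a stricter match, you can escape each dot with a backslash, as in “yourwebsite\.com”.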
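If you have more than a handful of hostnames, the expression from this step can be generated rather than typed. This is a minimal sketch in Python, not anything GA provides: build_hostname_pattern is a hypothetical helper and the hostnames are the placeholder ones from the example. It escapes the dots so they match literally and joins the hostnames with bars and no spaces.

```python
import re


def build_hostname_pattern(hostnames):
    """Join valid hostnames into one regex: escaped, bar-separated, no spaces."""
    cleaned = [host.strip().lower() for host in hostnames if host.strip()]
    return "|".join(re.escape(host) for host in cleaned)


pattern = build_hostname_pattern(["yourwebsite.com", "youtube.com", "shopify.com"])
print(pattern)  # yourwebsite\.com|youtube\.com|shopify\.com

# Quick local sanity check. re.search matches anywhere in the string, which is
# how I understand GA filter patterns to behave; treat that as an assumption.
print(bool(re.search(pattern, "www.yourwebsite.com")))  # True
print(bool(re.search(pattern, "google.com")))           # False
```

Because the match is partial, subdomains such as www.yourwebsite.com pass without being listed separately. The trade-off is that any hostname merely containing one of yours would pass too.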
3. Add The Filter Expression
Navigate to: Admin > View > Filters and click the “+ ADD FILTER” button. You should see the screen below. Fill out the form with a Filter Name, choose Custom as the Filter Type, select Include, set the Filter Field to Hostname, and paste in the filter expression that you created in step 2 above. After that, all you have to do is click “Verify this filter” at the bottom to ensure that it is working as intended and then press save.
Congratulations, you did it! Now you just have to wait a few days for some new traffic and reporting data to generate. You’ll be able to tell if it’s working properly by comparing it to your historical view and seeing if all current ghost spam is being blocked in your new copied view. For the final trick, let’s exclude historical ghost spam as well by setting up a segment in Google Analytics, so you can easily compare the cleaned data against your historical data.
How to Filter Referral Spam With Segments
Click Reporting in the top navigation. Next, click “+ Add Segment” next to “All Users.” Then name the segment (I used “Ghost Spam Nuked”), click Advanced > Conditions, and set the filter to Sessions and Include (see below). Choose Hostname as the dimension and “matches regex” as the match type, then input your filter expression from step 2 above and click save.
Finally, we need to add a segment that includes all sessions for easy comparison on the Audience Overview screen. Click “+ Add Segment” again, leave the “All Users” checkbox selected, and click apply. Now you can navigate back to the Audience Overview. You should now see different data in your “Ghost Spam Nuked” segment than in your historical All Users data in the same graph.
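To put a number on the gap between the two segments, you can run the same comparison on exported data. The sketch below assumes a list of (hostname, sessions) pairs with invented figures and reuses the placeholder expression from step 2. It shows the arithmetic behind the comparison and does not call any GA API.

```python
import re


def sessions_by_segment(rows, pattern):
    """Return (all sessions, sessions whose hostname matches the include pattern)."""
    regex = re.compile(pattern)
    total = sum(sessions for _, sessions in rows)
    kept = sum(sessions for host, sessions in rows if regex.search(host))
    return total, kept


# Hypothetical export: (hostname, sessions).
rows = [
    ("yourwebsite.com", 1200),
    ("spam-host.example", 260),
    ("(not set)", 40),
]

total, kept = sessions_by_segment(rows, r"yourwebsite\.com|youtube\.com|shopify\.com")
spam_share = (total - kept) / total

print(total)                       # 1500
print(kept)                        # 1200
print(round(spam_share * 100, 1))  # 20.0
```

In this made-up data set, one in five sessions is ghost traffic, which is enough to noticeably distort bounce rate and conversion rate for a small site.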
Final thoughts
Google Analytics is trusted as the largest website metrics platform. You want your analytics data to tell a story, but when your data is skewed by fake traffic, it’s hard to read and hard to make accurate decisions based on industry KPIs. For now, ghost spam is here to stay and it’s going to continue to be a problem for web analysts and marketers.
Until the problem becomes unmanageable, I don’t expect a widespread fix from Google. In the meantime, it’s up to us to learn how to filter referral spam to improve our metrics.