Using the Hostname Filter for Accurate Google Analytics Data

Recently, I came across an issue with my Google Analytics. Traffic numbers seemed unrealistically high, and there was one major red flag as to why.

What was the red flag? Google Analytics hadn’t even been installed on my site yet.

This isn’t an uncommon problem, even if the situation might not be exactly the same. It could be higher-than-usual traffic numbers, non-existent landing pages, etc.

So how can you make sure you’re getting an accurate view of your data in Google Analytics? This post walks through what I was seeing and explains how to use the hostname filter so that your reports reflect only your own site’s traffic.

The Issue

I already knew one thing for sure: any traffic in Google Analytics wasn’t coming from my site. And if it was, my problem would have been a super-friendly development team with telepathic superpowers.

Since I don’t have an ever-so-convenient, telepathic developer at my disposal, I knew this wasn’t the case. I did know, however, that Google wasn’t lying to me; somehow my Google Analytics tracking code was registering visits, even without my tracking code installed anywhere on my website.

To reiterate, here’s what we knew:

  1. Google Analytics was registering visits and gathering data.
  2. Google Analytics was not installed on my website.

So how could Google Analytics register visits to my site without the code being installed? Because it’s not my site traffic it’s gathering. The Google Analytics tracking code, whether you’re copying the snippet or using the analytics.js file in your code, doesn’t stop working when it’s placed on another site; it simply reports the traffic arriving at whatever page it runs on. You can set a default site (or hostname), but Google Analytics is still capable of collecting data outside of this host.

This applies to any site. Regardless of your red flag (uninstalled tracking code, a landing page you know doesn’t exist on your domain, etc.), your tracking code is capable of returning data from another domain and only registering the relative URI as the landing page.
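
To make the landing-page point concrete, here is a minimal Python sketch. It is not Google Analytics code: the URLs and the `page_path` helper are hypothetical, and it only mimics how a report that shows the path alone hides which host a hit came from.

```python
from urllib.parse import urlsplit


def page_path(url: str) -> str:
    """Return only the path portion of a URL (hypothetical helper)."""
    return urlsplit(url).path or "/"


# Hypothetical hits sent with the same tracking ID from two different hosts.
hits = [
    "https://www.fourfront.us/blog/",
    "https://some-other-site.example/blog/",
]

paths = [page_path(u) for u in hits]
hosts = [urlsplit(u).hostname for u in hits]

print(paths)  # the page path is identical for both hits
print(hosts)  # only the hostname tells them apart
```

Looking at pages alone, the two hits are indistinguishable, which is why the next step is to look at the hostname instead.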

Checking for Traffic from Other Hosts

It’s quite simple to see whether your data is a result of another host’s traffic, if you know where to look.

Once you’re logged into Google Analytics, you’ll want to view all site traffic. To do this, navigate to Site Content – All Pages found in the Behavior section of your dashboard. Google Analytics will provide you with the top pages (by pageviews) by default, but this is something we’ll need to change. We’re not interested in the individual pages – we need to learn more about the domain host.

To sort our data by the host, we’ll need to set the primary dimension to hostname. Hostname, in a nutshell, is the domain your data is coming from. Hostname can be found within the Behavior category of the Other dropdown.

And voilà! Here are all of the different domains (or websites) providing data to your Google Analytics property.

If your hunch was right, you should be seeing a list of sites that you don’t recognize. These are the different domains skewing your actual data.

Important note: there are likely going to be some valid domains in this list. Google domains – such as webcache or translate – are valid if searchers are looking at the translated or cached versions of your site. You might also see other web properties you might own – subdomains or another domain – that are completely valid if you want an aggregate look at your analytics for all web properties.
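
If you export that hostname report, the same check can be sketched in a few lines of Python. Everything here is an assumption for illustration: the rows, the pageview counts, and the `known_hosts` allow-list are made up, and you would substitute the hostnames you actually recognize.

```python
from collections import Counter

# Hypothetical exported rows: (hostname, pageviews).
rows = [
    ("www.fourfront.us", 120),
    ("fourfront.us", 30),
    ("translate.googleusercontent.com", 4),
    ("spam-site.example", 75),
]

# Hostnames treated as valid; this allow-list is an assumption.
known_hosts = {
    "www.fourfront.us",
    "fourfront.us",
    "translate.googleusercontent.com",
}


def pageviews_by_host(rows):
    """Sum pageviews per hostname."""
    totals = Counter()
    for host, views in rows:
        totals[host] += views
    return totals


def unknown_hosts(totals, known):
    """Return only the hostnames that are not on the allow-list."""
    return {host: views for host, views in totals.items() if host not in known}


totals = pageviews_by_host(rows)
suspects = unknown_hosts(totals, known_hosts)
print(suspects)
```

Anything left in `suspects` is a hostname worth investigating before you trust the totals.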

How to Take Back Your Data using Hostname

Fortunately, there are a few solutions to make sure you’re getting a complete, accurate view of your website data.

Even more fortunate, neither of the options below relies on contacting any host webmasters to remove your tracking property from their sites.

Create an Audience Segment based on Hostname

To filter out any data not coming from our website, we’ll need to create an audience segment that looks at the same condition as our identifying issue: the hostname.

If you aren’t familiar with creating a segment in analytics, you’ll start by clicking the + Add Segment button on your dashboard, followed by the + New Segment button in the upper left of your newly opened popup.

From here, Google provides several basic, quick-and-easy options for segmenting your incoming traffic. Our soon-to-be created segment requires a bit more hands-on work, so we’ll need to create the segment based on an Advanced Condition.



Again, we’ll only want to see data associated with our own website, so we’ll segment our audience based on the hostname dimension, which can be found listed in Behavior.

We have a few options from here: we can do an exact match of our hostname, but I think it’s more effective to write a regular expression to ensure we’ll still collect all the data we need.

For example, for our website, I’d write the following expression:

(www\.)?fourfront\.us

This expression would capture any traffic using www.fourfront.us OR fourfront.us as a hostname. And as a shameless plug, our own Tom Boland wrote an excellent Beginner’s guide to RegEx in SEO to help anyone unfamiliar with regular expressions, their uses, and/or how to write them.
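
A quick way to sanity-check an expression like this before saving it is to run it against a few sample hostnames. The sketch below uses Python’s re module, whose syntax is close to, but not guaranteed identical to, the flavor Google Analytics uses, and the sample hostnames are made up. It also tries an anchored variant with ^ and $, on the assumption that an unanchored pattern is allowed to match anywhere inside the hostname.

```python
import re

# The expression from the article, plus an anchored variant (an assumption).
pattern = re.compile(r"(www\.)?fourfront\.us")
anchored = re.compile(r"^(www\.)?fourfront\.us$")

# Made-up hostnames to test against.
samples = [
    "fourfront.us",
    "www.fourfront.us",
    "fourfront.us.spam.example",
    "example.com",
]

for host in samples:
    partial = pattern.search(host) is not None   # match anywhere in the string
    exact = anchored.search(host) is not None    # match the whole string only
    print(f"{host:28} partial={partial} exact={exact}")
```

In this sketch the unanchored pattern also accepts the look-alike host, while the anchored one accepts only the www and non-www versions. Whether you need the anchors depends on how the tool applies the match, so test it against your own hostname list.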

Assuming we’ve written a correct regular expression, we should be getting an accurate read on our data based on our newly-created Audience Segment.

Learn more about Creating a New Segment in Google Analytics

Create a New Hostname View to Filter Future Incoming Traffic

Now that we’ve segmented out our current traffic in our current view, we can create a new view to filter future traffic.

Creating a new view for future incoming traffic starts in the Admin dropdown of your Analytics property. Once on the New Reporting View page, you’ll name your view and set your Reporting Time Zone.

Congratulations! We’ve just created a new view. But we’re still not done – we need to filter incoming traffic within the view. Within the View section of the Admin tab of your property, you’ll find Filters. You can set up multiple filters for your traffic based on a variety of different factors.

Set a name for your filter, and once again, we’ll filter traffic based on our hostname. Hostname is a predefined condition, so you can set your filter with the following:

Include only | traffic to the hostname | that contain | (non-www. domain.com)

Important note: creating a new view will not apply retroactively. Future traffic will be filtered to only include traffic coming from your hostname, but your view will not have any effect on current data.
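
As a rough mental model of the include-only rule described above, here is a small Python sketch. It is not how Google Analytics is implemented: the `include_only_hostname` function and the hit fields are hypothetical, and it only illustrates "keep the hit if the hostname contains this text" applied to hits as they arrive.

```python
def include_only_hostname(hits, needle):
    """Keep hits whose hostname contains `needle`; drop everything else."""
    return [hit for hit in hits if needle in hit["hostname"]]


# Hypothetical incoming hits; the field names are made up for this sketch.
incoming = [
    {"hostname": "www.domain.com", "page": "/"},
    {"hostname": "domain.com", "page": "/contact/"},
    {"hostname": "spam-site.example", "page": "/"},
]

kept = include_only_hostname(incoming, "domain.com")
print([hit["hostname"] for hit in kept])
```

Using the non-www form as the text to match is what lets both the www and non-www versions through in this sketch.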

Learn more about Adding a New View in Google Analytics

Conclusion

Once you’re able to determine whether your analytics has been taken over (or, at the very least, whether outside traffic is skewing your data), it’s a fairly simple fix. The most difficult part of the process is noticing that your data doesn’t look quite right. It takes a keen, diligent eye to spot potential anomalies, so make sure you stay on top of your web analytics.

Have you had a similar experience with your Google Analytics? Know of another solution? We’d love to hear from you!

9 Comments:

Doug

Hi there just curious if something was missing here

For example, for our website, I’d write the following expression:

(www./)?fourfront./us

Jared Groff

Hi Doug - I don’t believe there’s anything missing in this RegEx. If there’s something I am missing, please let me know!

Ellie

Thanks Jared & team, this article has been really helpful. What is a good work around for needing an include only hostname filter, but your regex contains more than 255 characters?
Example:  I have this issue where many more hostnames than are mine are in my account and I need to create a filter to only allow my hostnames into my google analytics account. There are vastly more domains to exclude than include in my situation. However my include hostnames regex is larger than 255 characters which is the max allowed. Would two include only hostname filters applied to the same views work together (like filter a includes fish, cats, and dogs. and filter b includes tigers, bears, and lions; would a view using both filters contain hostnames fish, cats, dogs, tigers, bears, and lions.com)?

Merwede vd Merwe

Good day

Would you please be so kind to help me My hostnames expression below doesn’t want to work and I had run out of time with my yearly report My incorrect expression is as follow:
regalsecurity\.co\.za|regalexpress\.co\.za
Thank you very much
Kind Regards
Merwede

Philip

Merwede you need to use a pipe | for the OR command, as in http://www.regalsecurity.co OR www.regalexpress.co.za

regalsecurity\.co | za|regalexpress\.co\.za

Christopher

Curious on the need to include both WWW and non-WWW within the filter, (www\.)?fourfront\.us… is this necessary when only a single version resolves?

josefin

How would this work with cross domain tracking? Which hostname should I use in the filter?

sergey

thanks, this was quite helpful, although a bit outdated…

Vicky Tosh-Morelli

Very helpful post. Just ran into this on a client’s site and wanted to verify that I was setting up the hostname filter correctly. Trying not to harbor ill will on the developer who put the wrong GA tag on their clients site and are wondering why there is no data…
