Elephants and Analytics

"Elephant in the corner" is an English idiom for an obvious truth that is being ignored or goes unaddressed.

The Google sampling effect

telleston | March 5, 2012

We’re currently working with a client that uses Google Analytics as their measurement platform and, as part of our overall strategic plan, we’ve done some supporting analysis for them, reviewing the last year with respect to a specific segment or two.

They get a surprisingly large amount of traffic for a WA business, which is especially significant when it comes to using Google Analytics.  On average, they get about 25 million page views per month – so, over the course of the year, that’s somewhere around 300 million page views.

During our supporting analysis, Google announced a new feature – controlling the sampling rate.  According to their blog post:

Control your report calculation
One way we speed up the serving of data is through what we call “fast-access mode”, which applies to reports generated from large data sets. In the coming weeks, we will be peeling back the curtain on how “fast-access mode” works and letting you control the number of visits used to calculate reports.

Out with the old: fast-access mode
If a report requires calculation on more than 250,000 visits, we select a statistically random sample of 250,000 visits and estimate the report results based on that data. This makes reports faster to load, and our testing indicates that the data returned is highly accurate.

In with the new: control your report calculation
Now you will have the ability to control the number of visits used to calculate your reports, and we inform you of exactly how many visits are used in report calculation.

Out of interest in this new feature, I decided to play around with the settings and compare the reports and conversions at different sampling rates, given that the whole point of sampling is speed. I was quite surprised by the results.

Using their slider, I changed the sampling effect from the default, which is in the middle, moving it towards the right, or Higher Precision.  Each time, I switched over to the conversion reports to see what the results were.

On the default setting, the Paid Non Branded search conversion rate was 10.32%, but at the extreme right, it was 4.95%.  Interestingly, while Paid Search conversion deteriorated, Organic Non Branded improved.

Likewise, the other conversion rates all “tightened up” as I progressed from the default to the higher precision.

Notice the size of the sample set though – from its default it’s using 2.6% of visits; at higher precision it’s still only using 5.15% of visits (500,000 out of 9.7 million) – and it was this that caused me some concern.
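To make the arithmetic behind those percentages explicit, here is a minimal sketch. The 9.7 million total and the 500,000-visit sample come from the report above; treating the default setting as the 250,000-visit sample mentioned in Google's post is my assumption, and the function name is just for illustration.

```python
def sampling_fraction(sample_visits: int, total_visits: int) -> float:
    """Share of all visits that a report is actually calculated from."""
    return sample_visits / total_visits


TOTAL_VISITS = 9_700_000  # total visits in the reporting period (from the post)

# 250,000 is assumed to be the default sample size; 500,000 is the
# higher-precision sample size quoted in the post.
for sample in (250_000, 500_000):
    fraction = sampling_fraction(sample, TOTAL_VISITS)
    print(f"{sample:,} of {TOTAL_VISITS:,} visits = {fraction:.2%}")
```

Either way, roughly 95% or more of the visits never make it into the calculation.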

When I pushed it all the way to the left, Faster Processing, the results were useless.  It was based on <0.01% of traffic; there were no conversions for Organic Non Branded, or Paid (both Org and NB).  So, while super quick, super inaccurate.
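A rough sketch of why a sample that small can show no conversions at all for a segment. The 9.7 million visits and the 0.01% upper bound are from the report; the segment share and conversion rate are hypothetical numbers I picked for illustration, and the zero-conversion chance uses a Poisson approximation rather than anything Google documents.

```python
import math


def expected_sampled_conversions(total_visits: int,
                                 sampling_fraction: float,
                                 segment_share: float,
                                 conversion_rate: float) -> float:
    """Expected number of conversions for one segment inside the sample."""
    sampled_visits = total_visits * sampling_fraction
    segment_visits = sampled_visits * segment_share
    return segment_visits * conversion_rate


expected = expected_sampled_conversions(
    total_visits=9_700_000,    # from the post
    sampling_fraction=0.0001,  # 0.01%, the upper bound quoted in the post
    segment_share=0.02,        # hypothetical: segment is 2% of all visits
    conversion_rate=0.05,      # hypothetical: 5% true conversion rate
)

# Poisson approximation: chance of seeing zero conversions in the sample.
chance_of_zero = math.exp(-expected)

print(f"Expected conversions in sample: {expected:.2f}")
print(f"Approximate chance of seeing none: {chance_of_zero:.0%}")
```

With fewer than one expected conversion in the sample, a report showing zero is not surprising, even if the segment converts perfectly well in reality.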

What are the real rates?

If we’re seeing such a variance in conversion rates when the sample only moves from 2.6% to 5.15% of overall traffic, then I wonder what the actual conversion rates are without sampling applied.  Unfortunately, as far as I can tell, we have no way of knowing.

While sampling certainly helps Google with returning results quickly, I think they should allow the opportunity to see what “actual” is, so that at the very least you can determine your margin of error across any report.  There’s absolutely no point in having really quick reports, when the results are misleading.
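In the absence of "actual" figures, you can at least estimate a margin of error yourself. The sketch below uses the standard normal approximation for a proportion and assumes the sample is a simple random sample of visits, which may not match how Google actually samples. The 10.32% rate is from the report above; the segment visit counts are hypothetical, since the report doesn't tell us how many sampled visits fall into each segment.

```python
import math


def margin_of_error(rate: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion (z=1.96 is ~95% confidence)."""
    return z * math.sqrt(rate * (1 - rate) / sample_size)


observed_rate = 0.1032  # Paid Non Branded conversion rate at the default setting

# Hypothetical numbers of sampled visits that belong to the segment.
for segment_visits in (500, 5_000, 50_000):
    moe = margin_of_error(observed_rate, segment_visits)
    print(f"{segment_visits:>6,} sampled visits: "
          f"{observed_rate:.2%} +/- {moe:.2%}")
```

The point to notice is that the margin depends on how many sampled visits land in the segment you are looking at, not on the overall sample size, so a small segment can be far noisier than the headline sample suggests.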

In the above conversion report, Paid Non Branded converts at about 10% using the default setting, whereas at the higher precision setting it’s less than half of that rate.

I would certainly recommend that if you are using Google Analytics, you look at the impact being applied to your reports through this sampling effect, especially if you are optimising conversions.

Tim Elleston is Director at Digital Balance. You can follow him on Twitter at http://twitter.com/timelleston. Please feel free to use the comment facility below.

5 responses

Ben Gaines | March 5, 2012

Great post, Tim! Thanks for sharing these insights. My GA account doesn’t get enough traffic — or have enough thought put into the implementation — to yield anything interesting like this.

Ultik | March 29, 2012

This is why you can get Enterprise tools like Webtrends where there will never be any data sampling in the UI no matter how much data you look at.

This was a great post, thanks a lot!!

Ulrik | March 30, 2012

Another key point here is that Google will not guarantee analysis of more than 10 million hits per month. Do you have any experience of what effect going above that has on the numbers?

Rich McPharlin | April 18, 2012

We are in a similar predicament, but the client is serving 180M page views per month. Currently we are able to beat the Google sampling by creating custom reports with different sets of filters applied. However, there is no way to create an un-sampled day-part conversion rate report, nor a value/click day-part report, which is severely hampering our further optimisation efforts of the paid search channel. Does anyone have experience with Analytics Premium and, if one were to pay for it, would the un-sampled reports apply to historical data?

Ulrik | April 24, 2012

Rich,

I might be biased towards Enterprise tools, using them myself in daily analytics, but you might consider this:

When you reach a threshold of traffic where sampling starts interfering with your data, so that you start making bad decisions on the insights you get, it might be time to consider a switch to an enterprise-friendly tool. It is not free, but you have to realise that your current tool is not free either: you are handing over all of your data as the cost. And that is to an advertising agency, who has all of your competitors as customers. When it is free, you are the product.

So in your shoes, this is what I would do:

Make a demand specification. Invite the major players: Omniture, Webtrends, Core Metrics (if they are in your country), and Google Premium. See how well they can serve your needs and pick the one you feel most comfortable with.

Google Premium allows you to order unsampled reports, which you then download. It costs 150k$ per year.
Webtrends has every single report in unsampled format.
Omniture, I am sure someone from this site can tell you more about.

My point is this. If you have 150M page views on your website a year, you get a hell of a lot of Enterprise tool for the same price Google Premium charges. You’ll be able to get Facebook analytics, mobile app analytics, YouTube analytics, and maybe even a state-of-the-art A/B MVT tool with behavioral targeting on top of that.

But I don’t mean to bad-mouth GA Premium. I am just saying that if you are considering a non-free tool, do a demand spec and pick the one that fits your needs best.
