Elephants and Analytics

"Elephant in the corner" is an English idiom for an obvious truth that is being ignored or goes unaddressed.

So you dare to compare…

Tim Elleston | August 31, 2009

If you’re thinking of trying to compare web analytics results by vendor, or even by the original log file, you’re in for a very tough time.  Just thinking about it is a mistake.

Genie + bottle + uncork = Long time, not good time

The problem is that while there are basically two different methods of data collection (via server logfiles or via JavaScript tags), the variables associated with both, in a real world environment, make it almost impossible to compare results.

And therein lies your genie out of the bottle, with you scrabbling around trying to justify the results.  You’re best off not even trying.

Server Logfiles

Every web server stores the requests made of it in a log file (provided logging is turned on).  Programs like AWStats, and many others out there, read through all of the entries and sort the requests into general reports, such as page views, visitors, visits etc.  They can also generally weed out spiders from search engines, so you get a little closer to unique “human” counts.

But there still remains a big problem with this method and that is “caching”.  If a person re-visits a page, quite often the second and subsequent request will actually be served from the browser memory, or cache, and no request will go back to the web server for logging in the file.  Secondly, with the advent of network caching servers, the same thing occurs.  So, you end up with under-counts.

JavaScript tagging

Enter JS tagging.  Small bits of JavaScript are added to the page, typically within the body section.  After the page loads, the JS executes a request to the vendor logging server, passing information to it, such as page name, time and date, information about the visitor etc.  Nowadays, a cookie is also generally set, to spot repeat visitors.
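
As a rough illustration of the mechanics, a tag boils down to assembling a request URL that carries the page details and then asking the browser to fetch it.  The sketch below is not any vendor's real code: the endpoint, the parameter names and the buildBeaconUrl function are all made up for the example, and the random "cb" value is just one common way of keeping caches from answering the request.

```javascript
// Hypothetical sketch of what a page tag does; not any vendor's real code.
// The endpoint and parameter names below are invented for illustration.
function buildBeaconUrl(endpoint, hit) {
  const params = new URLSearchParams({
    page: hit.pageName,
    ts: String(hit.timestamp),
    visitor: hit.visitorId, // would normally be read from a cookie
    // A random value makes each request URL effectively unique, so a
    // cached copy of an earlier request is unlikely to be reused.
    cb: String(Math.floor(Math.random() * 1e9)),
  });
  return endpoint + "?" + params.toString();
}

const url = buildBeaconUrl("https://stats.example.com/hit", {
  pageName: "courses/law",
  timestamp: Date.now(),
  visitorId: "abc123",
});

// In a browser, requesting a tiny image is one common way to send the hit:
//   new Image().src = url;
console.log(url);
```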

The benefit of this method is that it typically defeats caching, as each time the code is executed a new request is made to the vendor.  The other benefit is that other information can also be set and sent, such as custom events, campaign codes etc, which is very important for enhancing the overall reporting and customization capability.

The downside is that tags generally require the user to have JavaScript enabled (which most do), and the accuracy depends on where the code sits on the page and whether it executes in a timely manner, before the user has clicked through to another page.

Genie, bottle, out

Imagine trying to compare fuel consumption on two similar vehicles over an identical distance.  While the vehicles might be identical, the way the vehicle is driven, the amount of traffic, the number of times you stop at lights, the temperature and humidity etc, all affect fuel efficiency.

You get a similar challenge when comparing vendor results, though the variables in this case are the ways users interact with your site.  Unfortunately, the disparity in results often leads to dissatisfaction with the vendor’s solution, rather than an understanding of how the differences can occur.

Let’s say you use Google Analytics, Omniture and web server log files.  If you browse to a page and let it fully load, you’ll no doubt have the same counts across all three: 1 visit, 1 page view.  The log file will also show all of the other associated requests, such as CSS, JS, images and the like.  But this is not a real world comparison.

The problems start to creep in when users start to browse your pages.

Some will click links before the page has completely loaded (JS will not record activity).  Some may have JS turned off (JS will not record activity).  Some may be with an ISP that utilizes caching servers (log files won’t record activity).  If the Google JS is at the top of the page and the Omniture JS is at the bottom, Google may record a page view, but if the user doesn’t let the page fully load, Omniture possibly won’t record the page view.  First party and third party cookies are based on user settings and may not be set, affecting visitor counts.  And the differences continue.
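
To make the drift concrete, here is a small sketch that counts page views three ways over a handful of page requests.  Everything in it is an assumption for illustration: the four requests are invented, and the rules (a top-of-page tag always fires when JS is on, a bottom-of-page tag only fires once the page has finished loading) are a simplification of the scenarios above, not how any particular vendor behaves.

```javascript
// Hypothetical page requests; every field is made up for illustration.
const pageRequests = [
  { servedFromCache: false, jsEnabled: true,  leftBeforeLoad: false },
  { servedFromCache: true,  jsEnabled: true,  leftBeforeLoad: false },
  { servedFromCache: false, jsEnabled: false, leftBeforeLoad: false },
  { servedFromCache: false, jsEnabled: true,  leftBeforeLoad: true  },
];

function countPageViews(requests) {
  const counts = { logFile: 0, topOfPageTag: 0, bottomOfPageTag: 0 };
  for (const r of requests) {
    // The server only logs requests that actually reach it.
    if (!r.servedFromCache) counts.logFile += 1;
    // Assumption: a tag near the top fires even if the user leaves early.
    if (r.jsEnabled) counts.topOfPageTag += 1;
    // Assumption: a tag at the bottom needs the page to finish loading.
    if (r.jsEnabled && !r.leftBeforeLoad) counts.bottomOfPageTag += 1;
  }
  return counts;
}

const counts = countPageViews(pageRequests);
console.log(counts);
```

With these four made-up requests the three methods report 3, 3 and 2 page views, and the two methods that agree on 3 are not even counting the same requests.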

But over the years, vendors have strived to get as close to the truth as possible, using very sophisticated JavaScript – and they have to; it’s a multi-million, if not billion, dollar industry, which continues to grow as site owners demand more flexibility and insight into user behaviour.

If you are looking for a solution where you can customize your insight, then the JS tagging option is your best bet, as it provides more flexibility.  You can track clicks that wouldn’t be recorded by a server log file; you can track Flash interactions; you can track campaign activity; you can track shopping cart volumes, revenue and products.  And, with certain providers such as Omniture, you can target content based on the user’s previous activity across your site, using products like Omniture Test and Target (saving that for another post).

So, the thing to do is to try not to let that genie out of the bottle, otherwise you’ll spend the rest of your time trying to put it back in.

At best, you should expect (and explain to your stakeholders) that differences will occur and provide a rationale for the differences.  As long as you’ve implemented your tagging correctly, you should be pretty close to the truth.

And remember, web analytics is not about the absolute numbers…


People who liked this…part 2

Tim Elleston | August 15, 2009

In my previous post on People who liked this, also liked…, I put forward an idea for generating “related” products of interest from what users were looking at, based on Omniture data, which could then be automated and re-published back to a site.

We implemented this in our s_code, largely as a test, to see what the results would be like, and to further enable proof of concept testing when we exported the data.

[Image: course_activity]

Well, we’ve had an interesting observation as a result of this (we haven’t yet fully implemented the re-publishing part) and I thought it was worthy of a quick posting.

Our expectation of user behaviour was that they would look at multiple courses during their first visit – in essence, they would browse courses looking at what was on offer, reading about the various differences between them.

It turns out that is not the case.

If we look at Course views over a certain period we can see which courses are the most popular (and which are not so popular).  What I have shown here is a list of the top 15 courses viewed over a specific time frame.  So we can see from this that Vet Science was the most viewed course over time, but it doesn’t show whether this was the “first choice” or the “second choice”.  It doesn’t show whether this was viewed during the initial visit or a follow up visit.

So, this doesn’t help in our People who liked this, also liked…scenario.

[Image: course_stack_omniture]

Having put our Course Stacking code in place, we are now able to see the order in which courses are typically viewed.

Bear in mind that we have implemented this as a solution that we can export and then re-integrate into our site, so we have purposely used Course ID, not the name of the course.

The result within SiteCatalyst, is that the report isn’t easy to read…but I’ll attempt to explain it.

The Course View Stacking column contains the Course IDs in the order that they were viewed.  So for example, the first entry “39>146” tells us that there were 133 instances where a user started with viewing course 39, then went on to view course 146.

Utilizing Course Stacking, there is an enormous number of combinations that will ultimately show up in this report.  Just in the short time this has been going, it has already generated 6,275 combinations.

Of course, this could be other products, product categories etc.

To put these into perspective, rather than using just IDs: Course 39 is Law (Four-Year Degree) (LLB), and Course 146 is Law (Three-Year Degree) Juris Doctor (LLB).

So, what we’re seeing is that there have been 133 instances of visitors starting with the four year Law degree, then going on to look at the three year Law degree.

A bit further down the report in position 9, you’ll also see “39>146>39”.  This is a different stream to the above.  In this stream, there have been 47 instances where visitors have gone from the 4yr degree, to the 3yr degree and back to the 4yr degree – which is different to the above, where they went from the 4yr to the 3yr, and have not looked at anything else since.

This is the reason why we can get so many combinations.
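
A minimal sketch of the stacking idea, to show why the combinations multiply.  Only the “>” delimiter comes from the report; the function names, the sample data and the choice to skip consecutive repeats of the same course are assumptions for the example, and this is not the code from our s_code file.

```javascript
// Sketch only: the ">" delimiter matches the report, the rest is assumed.
function stackCourse(stack, courseId) {
  const id = String(courseId);
  if (stack === "") return id;
  const viewed = stack.split(">");
  // Assumption: skip consecutive repeats, e.g. several pages of one course.
  if (viewed[viewed.length - 1] === id) return stack;
  return stack + ">" + id;
}

// Build one stack per visitor and count how often each stack occurs.
function countStacks(visitors) {
  const totals = new Map();
  for (const courseViews of visitors) {
    const stack = courseViews.reduce(stackCourse, "");
    if (stack !== "") totals.set(stack, (totals.get(stack) || 0) + 1);
  }
  return totals;
}

// Made-up visitors, each an ordered list of course IDs viewed.
const totals = countStacks([
  [39, 146],
  [39, 39, 146],
  [39, 146, 39],
  [146],
]);
console.log(totals);
```

Here four visitors already produce three distinct stacks, with “39>146” and “39>146>39” counted separately, just as in the report.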

Ok, so overall, that seems to make sense – users would look at other courses.

Notice the percentage (of overall) is very low.  This activity only occurred 0.5% of the time.  This is due to the short amount of time we’ve had this running.

However, remember our original assumption – users will look for a course, then browse to other courses, during their first visit.

Out of curiosity I exported the data from Omniture, segmented by New vs Repeat visitors, to see whether the behaviour changed.  The question I wanted to answer was “do users typically look at more than one course [product] during their first visit?”

I was surprised by the result.

[Image: course_stacking]

I actually used a different product to analyze the results here.  The above chart shows the number of different courses viewed (x-axis) and the number of times that occurred.  New visitors are blue, repeat visitors are orange.

Viewing just 1 course is clearly the largest category, and it is split roughly 50/50 between new and repeat visitors.  This implies that the majority of visitors come to the site and engage with one course, which is also good news, as it suggests they are satisfied with the content of the course.

Now look at the weightings for those who view more than one course.  There are clearly more repeat visitors who view 3 or more courses, and that really spikes at the end (10 or more courses are grouped together).
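
The grouping behind a chart like this can be sketched as follows.  The rows are invented and the field names are assumptions about what an export might look like; the only detail taken from the chart is that 10 or more courses are grouped together.

```javascript
// Hypothetical export rows: one per visitor, with the course IDs they viewed.
const visitors = [
  { type: "new",    courses: [39] },
  { type: "new",    courses: [39, 146] },
  { type: "repeat", courses: [39] },
  { type: "repeat", courses: [39, 146, 12] },
  { type: "repeat", courses: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] },
];

// Visitors with 10 or more distinct courses share one bucket.
function bucketLabel(distinctCount) {
  return distinctCount >= 10 ? "10+" : String(distinctCount);
}

// Count new and repeat visitors per number of distinct courses viewed.
function distribution(rows) {
  const result = {};
  for (const row of rows) {
    const label = bucketLabel(new Set(row.courses).size);
    if (!result[label]) result[label] = { new: 0, repeat: 0 };
    result[label][row.type] += 1;
  }
  return result;
}

const dist = distribution(visitors);
console.log(dist);
```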

This suggests that our original assumption was wrong.  First time users, on balance, view only one course.  But when they come back, they view either one course, or more than one course – suggesting that they begin to browse courses only on their repeat visit.

This is very interesting and insightful.  We can use this to our advantage.  We can target content to repeat visitors who have viewed a course previously – either prompting them back into their original course, or presenting them with “related” courses.

We can also try to better cross-promote courses on their first visit.

This was an unexpected insight that came out of this analysis, but to us, very valuable information that can be used.


Automate your tag clouds with Omniture

Tim Elleston | August 9, 2009

One of the nice things about Omniture is the ability to export information out to other systems.  We use this feature to generate tag clouds on our site, based on the most popular courses viewed over the last 30 days, segmented for different audiences.

In order to do this, there are a few things that need to be done first.

Firstly, we report course views as products, passing a shortened name of the course from our content management system and database, to the s.products variable, such as:

s.products = ";Marketing-and-the-Media";
s.events = "prodView,event5";

We have set up event5 as a success event, signifying a course view.

As we have multiple pages associated to a course, we make sure that we only pass the s.products and s.events values once per course view, irrespective of the page within the course a user is looking at.  This is done by using some custom code within our s_code file.
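
The custom code itself isn't shown here, but the general idea can be sketched as below.  This is a hypothetical illustration, not the actual s_code logic: the setCourseView function and the state object are invented, and it assumes that remembering only the last course reported (for example in a session cookie) is enough.

```javascript
// Hypothetical sketch; the actual s_code logic is not shown in the post.
// `s` stands in for the SiteCatalyst tracking object, `state` for whatever
// per-visit storage (for example a session cookie) remembers the last course.
function setCourseView(s, state, courseName) {
  if (state.lastCourse === courseName) {
    // Another page of the same course: do not count a second course view.
    s.products = "";
    s.events = "";
    return false;
  }
  state.lastCourse = courseName;
  s.products = ";" + courseName;
  s.events = "prodView,event5";
  return true;
}

const s = {};
const state = { lastCourse: null };
setCourseView(s, state, "Marketing-and-the-Media"); // first page: variables set
setCourseView(s, state, "Marketing-and-the-Media"); // second page: cleared
console.log(s);
```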

In SiteCatalyst, we then use SAINT classifications to generate Course-based reports, associated to schools, faculties, type of course (undergrad or postgrad) etc.  This allows us to get in-depth information on our course activity, along with conversions etc.

Audience segmentation

[Image: segment_builder]

A common reporting segmentation for us is to compare Australian traffic to International traffic, so we have created two segments, using the segment builder.

The Australian segment includes any visit where the GeoCountry was Australia.  The International segment includes any visit where the GeoCountry was not Australia.

Exporting the data

[Image: dw_report]

We then use DataWarehouse to create two reports, based on the last 30 days of activity.  Each report uses one of the segments defined above, with the Course name and the number of Product views (as we use the product variable to set course views).

These two reports are scheduled on a daily basis to export the data to our FTP servers as a CSV file.

Once we have the files, we import the data into a database along with the date of the file, so we can use that later.

Now we have the last 30 days of activity, by each course, by traffic source as a dataset that we can use on our site.  It’s then a fairly straightforward process to match the course name with the URL of the actual course, so it can be used as the link on the tag cloud.

The end result

Each day we query the database and, using standard tag cloud calculations, we are then able to publish the data back out onto our site.  We currently feed the data back out as an XML file which is read by our Course Browser Flash tool – showing both a Domestic and an International view of the most popular courses.
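
For anyone unfamiliar with tag cloud calculations, one common approach is to scale each item's font size linearly between a minimum and a maximum, according to where its view count sits between the lowest and highest counts.  The sketch below assumes that approach; the function name, font sizes, URLs and view counts are all invented for the example.

```javascript
// Sketch of one common tag cloud calculation (linear scaling).
// Course URLs, view counts and font sizes are made up for illustration.
function tagCloud(courses, minSize = 10, maxSize = 30) {
  const views = courses.map((c) => c.views);
  const lo = Math.min(...views);
  const hi = Math.max(...views);
  return courses.map((c) => {
    // If every course has the same views, fall back to the smallest size.
    const weight = hi === lo ? 0 : (c.views - lo) / (hi - lo);
    const fontSize = Math.round(minSize + weight * (maxSize - minSize));
    return { name: c.name, url: c.url, fontSize };
  });
}

const cloud = tagCloud([
  { name: "Vet Science", url: "/courses/vet-science", views: 500 },
  { name: "Law", url: "/courses/law", views: 300 },
  { name: "Marketing-and-the-Media", url: "/courses/marketing-media", views: 100 },
]);
console.log(cloud);
```

Some sites scale on the logarithm of the count instead, which stops one very popular item from dwarfing everything else.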

[Image: tagclouds]

We’re also working on something similar for internal search terms, which will be used to populate a “search as you type” functionality on our search forms, but it will be segmented by audience type – Staff, Student or Anonymous (being general traffic).  That one is a little tougher, because we have to associate the most common destination clicked on, with the searched-for term.  But more on that in a later posting, once we have it working.

So, using a combination of Omniture SiteCatalyst, DataWarehouse and segmentation, we’re able to easily offer our users quick navigation methods to various pieces of content, thereby enhancing their user journey.

