Analyze This! Lesson Two: Issues Affecting Accurate Reporting

In my last post, I talked about the different data collection methods for web metrics and the advantages and disadvantages of each. In this week’s post, we are going to discuss issues that affect the accuracy of data collection.

The first involves cookies. Everyone’s heard of cookies (yes, the non-edible kind, sorry, Mrs. Fields). Cookies are small pieces of text that a web server sends to a web browser so that the site can keep track of a visitor’s activity (typically anonymously). There are two types of cookies:

First party: Cookies set by the website the visitor is actually viewing. Examples are sites with logins, like MySpace.

Third party: Cookies set by a site other than the one the visitor is viewing. Common sources include pop-ups, banner ads and other web advertising, and images hosted on other servers.
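
To make the distinction concrete, here is a minimal Python sketch that labels a cookie as first or third party by comparing the domain it is scoped to with the host of the page being viewed. The function name, cookie names, and hosts are all made up for illustration, and real browsers apply more detailed rules than this simple suffix check.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlparse


def cookie_party(page_url, set_cookie_header, response_host):
    """Label each cookie in a Set-Cookie header as first or third party.

    Assumption for this sketch: a cookie is first party when the domain it
    is scoped to equals the host of the page being viewed, or is a parent
    domain of that host. Anything else is treated as third party.
    """
    page_host = (urlparse(page_url).hostname or "").lower()
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)

    labels = {}
    for name, morsel in cookie.items():
        # With no Domain attribute, the cookie belongs to the responding host.
        domain = (morsel["domain"] or response_host).lstrip(".").lower()
        same_site = page_host == domain or page_host.endswith("." + domain)
        labels[name] = "first party" if same_site else "third party"
    return labels


# A cookie set by the site being viewed (hypothetical names throughout).
print(cookie_party(
    "http://www.example.com/page",
    "visitor_id=abc123; Domain=.example.com; Path=/",
    "www.example.com",
))

# A cookie set by an ad server whose banner is embedded in the same page.
print(cookie_party(
    "http://www.example.com/page",
    "ad_id=xyz; Path=/",
    "ads.adnetwork.test",
))
```

The first call reports the cookie as first party and the second as third party, even though both arrive while the visitor is looking at the same page.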

Google Analytics only uses first-party cookies. This is good for two main reasons:

1. Google has strict privacy guidelines for Google Analytics, which help keep visitor data private and secure.

2. Because Google Analytics only uses first-party cookies, it passes through most spyware applications and personal firewalls as a trusted source, allowing for more accurate tracking.

However, there are several issues affecting accurate data collection when cookies are used. These issues are the reasons behind the less-than-100% accuracy you may see when working with Google Analytics.

1. Cookie deletion: An increasing number of Internet users are becoming “cookie deletion savvy,” according to a comScore survey. Why is this important? When cookies are deleted, session information, user data, and other details are lost, so that user can no longer be recognized as a returning visitor.

2. Latency: When I talk about “latency,” I mean the time it takes for a web visitor to complete a goal or action on your site. If the visitor deletes the cookie, or comes back after the cookie has expired, they are counted as a new visitor and the conversion is not attributed to the channel that originally brought them in.

3. Multiple users: Most homes today have several computers. Several people could be using one computer, or one person could be using many computers at home or at work. Because a cookie identifies a browser and not a person, any of these combinations can result in inaccurate reporting.

4. Offline conversions: Tracking offline conversions after a visitor has browsed your online catalog is still one of the missing links in web metrics, and the gap can lead to inaccurate reporting.
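
The effect of cookie deletion and multiple computers on visitor counts can be sketched with a few lines of Python. This is only a toy illustration with invented visitors and cookie IDs, not a model of how Google Analytics processes data, but it shows why a cookie-based count tends to run higher than the number of real people.

```python
def count_unique_visitors(visits):
    """Count visitors the way a cookie-based tool would: one per cookie ID."""
    return len({cookie_id for _, cookie_id in visits})


# Each visit is (real person, cookie ID seen by the analytics tool).
# All names and IDs are made up for illustration.
visits = [
    ("alice", "c1"),
    ("alice", "c1"),  # returning visit with the same cookie: recognized
    ("bob", "c2"),
    ("bob", "c3"),    # bob deleted his cookies, so he looks like a new visitor
    ("carol", "c4"),
    ("carol", "c5"),  # carol switched from her home computer to her work one
]

real_people = len({person for person, _ in visits})
reported = count_unique_visitors(visits)
overcount = (reported - real_people) / real_people

print(f"real people: {real_people}, reported unique visitors: {reported}")
print(f"overcount: {overcount:.0%}")
# real people: 3, reported unique visitors: 5
# overcount: 67%
```

The reverse also happens: if several people share one computer and one browser, they share a cookie and are counted as a single visitor.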

If the cookie problems weren’t trouble enough, here are some other major issues affecting accuracy.

1. Page tags not installed correctly: Unfortunately, it is extremely common to find sites using Google Analytics (or any other page tag system) that have not tagged every page of their site, causing reporting inaccuracies.

2. JavaScript loading: Visitors who leave before a page has finished loading may never run the JavaScript that calls Google’s ga.js file, so their pageview goes unrecorded.

3. Downloadable files: It is common to find that PDF files, Word documents, and other downloadable material have not been set up with virtual pageviews, in which case the download does not show up in your reports at all. Even with virtual pageviews in place, only the click on the link is tracked; you cannot tell whether the download actually started, finished, or was canceled.

4. Fakers and spoofers: Less common are page hijackers. People who copy your design code can copy your page tags along with it, and the traffic to their copies would then show up in your analytics reports.

5. Product returns: Returns are very difficult to feed back into your analytics reports. You may have goals and goal values that show the number of sales and incoming revenue, but in-store returns never reach those reports, so the revenue they show may be overstated.
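
For the first issue in this list, missing page tags, a rough audit is easy to script. The Python sketch below assumes you already have the HTML of each page in hand (the paths and markup here are invented) and simply searches for the ga.js file name or a `_trackPageview` call. Finding the text does not prove the tag fires correctly, so treat the result as a starting point for a manual check.

```python
def find_untagged_pages(pages, markers=("ga.js", "_trackPageview")):
    """Return the paths whose HTML contains none of the tracking markers.

    `pages` maps a URL path to that page's HTML source. The default markers
    are plain substrings to look for; adjust them to match your own tag.
    """
    return sorted(
        path
        for path, html in pages.items()
        if not any(marker in html for marker in markers)
    )


# Hypothetical pages for illustration only.
pages = {
    "/": '<html><script src="http://www.google-analytics.com/ga.js"></script></html>',
    "/products": "<html><script>pageTracker._trackPageview();</script></html>",
    "/thank-you": "<html><body>Thanks for your order!</body></html>",
}

print(find_untagged_pages(pages))  # ['/thank-you']
```

In this made-up site the untagged page is the order confirmation page, which is exactly the kind of gap that makes goal counts come up short.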

After reading this blog post, it is difficult not to get discouraged, but it is vital that you keep these issues in mind when evaluating your analytics reports. Understanding how they affect your data will be a key to success when you are called to the office of your CMO or CFO, who is wondering why the numbers are not matching up.

What issues have YOU found that have led to inaccurate web metric readings?

About the Author

Joe Whyte has been developing, managing and implementing successful, innovative, bleeding edge digital marketing strategies for Fortune 500 companies for over 7 years.

7 Comments

  1. Mia

    Thanks again for the great information! I'm glad you wrote about these issues as some of them I had no idea about!

  2. Thank you, The explanation was excellent. Thank you so much.

  3. Matt

    Thanks for the info Joe -- great stuff.

  4. Dave Graham

    With DNSSEC becoming more and more available this should cut down on the amount of fakers and spoofers. With accurate accountability from the source location with "finger printing" as with SSHFP. HTTPFP and HTTPSFP should help. IPv6 should also help with collecting data from users as well. Though it's not widely used (but should be) it is becoming more and more available as well. With operating systems such as Vista which use link-local it does make it a bit more difficult to get accurate reading as well. Also with the help of applications that track people (like WebTrends) coming in from the Internet should also help analyze the frequent visits from a location (NAT included which hinders some). I've stated many options, in not major detail, but the future of analysis could make it easier to track these types of things.

  5. Great post Joe... I think the biggest discrepancy we face is due to cookie deletion. You have given us a lot to think about here. Keep the great posts coming! Michael B.

  6. Thanks Joe, This is definitely an interesting read. While I am not passionate about the technical details of analytics, I am passionate about trying to make good decisions based on this data and you have definitely given me something to think about. Andrew

  7. We’re a fairly IT led company when it comes to reporting, and experiencing difficulty with management buy-in of GA goal data (we’ve been running GA for 2 years now). When someone completes a goal, e.g. website registration, what % tolerance would you say is acceptable comparing to our CRM database? We're finding monthly differences sometimes over 20%. How accurate should GA be? Is GA any more/less accurate than other tools? I'd be really interested to learn about other people's experiences here! Justin