One of the questions I am asked most often is, "Why do the stats from provider A not match the stats from provider B?" It usually comes up when someone compares Woopra to another analytics provider such as Google.
I have touched on this topic in the past a couple of times, but we're going to expand on it further today:
- Woopra vs. The Competition (Technically Speaking)
- Q&A: Why Does XYZCo’s LIVE Visitor Number Differ From Woopra?
Different Tracking Mechanisms
- User names, custom data (customer purchases, comments, etc.), outgoing click data, and more
- Information appended by a service provider (like Woopra) such as business name associated with visitor IP, etc.
- Log files are susceptible to inaccuracy due to caching and proxy servers, and also because they record visits from bots (like Google, Yahoo or Spam-bots) as if they were human.
- Processing of log files can take many hours, and puts strain on the server doing the analysis. (Not good if it's your Web server.)
- Log files can eat up gigabytes of space on server hard drives, and the constant collection of data uses more power and consumes your machine's resources.
Frankly, my recommendation is to turn off your server's log files to save resources, because 99% of people will never, ever use them anyway.
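To make the bot problem concrete: a log analyzer can only guess which hits came from crawlers, usually by inspecting the user-agent string. Here is a minimal sketch in Python, assuming the log uses the common Combined Log Format, where the user agent is the last quoted field. The marker list, the function names, and the sample lines are all made up for illustration, and a real bot list would be much longer.

```python
import re

# Hypothetical, deliberately incomplete list of user-agent fragments
# that suggest a crawler rather than a person.
BOT_MARKERS = ("googlebot", "slurp", "spider", "crawler", "bot")

# In Combined Log Format the user agent is the last quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def is_probable_bot(log_line):
    """Guess whether a log line came from a bot, based on its user agent."""
    match = UA_PATTERN.search(log_line)
    if match is None:
        return False  # no user agent found; treat the hit as human
    user_agent = match.group(1).lower()
    return any(marker in user_agent for marker in BOT_MARKERS)


def count_hits(log_lines):
    """Split raw hits into probable humans and probable bots."""
    counts = {"human": 0, "bot": 0}
    for line in log_lines:
        counts["bot" if is_probable_bot(line) else "human"] += 1
    return counts


# Made-up sample lines using documentation-only IP addresses.
SAMPLE = [
    '192.0.2.10 - - [10/Oct/2009:13:55:36 -0700] "GET / HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (Windows NT 6.1) Firefox/3.5"',
    '203.0.113.7 - - [10/Oct/2009:13:55:40 -0700] "GET / HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

print(count_hits(SAMPLE))
```

A bot that sends a browser-like user agent slips straight through a filter like this, which is one reason raw log counts tend to run high.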
- As with log file analysis, collecting stats on your own server will put tremendous load on it. There will be a noticeable decrease in your server's capacity, especially under heavy load. If you have to upgrade your machine to maintain performance, it's not really free.
- A traffic spike from a high-traffic site such as Digg, Slashdot or Lifehacker will almost certainly crash a server that is trying to serve pages and track every visitor at the same time.
- There are long-term storage and archiving issues associated with stats data. You must be prepared to deal with them for years if you wish to maintain historical data. This is an unenviable task.
- These systems are far less sophisticated and collect less information than a service provider can.
The first Web Analytics provider architectures relied on simple image downloads to increment a counter. It worked quite well in the early days, though there are significant drawbacks now:
- Images can be cached, which means repeat visits are missed, and sometimes even new visitors are missed when they come from an ISP that has cached the image to save bandwidth.
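For illustration, one common mitigation for the caching problem (a general technique, not something any particular provider is confirmed to use) is to make every image request unique and to mark the response as uncacheable. The sketch below assumes a hypothetical pixel endpoint; the URL, the parameter names, and the function name are invented for the example.

```python
import secrets
from urllib.parse import urlencode

# Response headers a tracking endpoint could send so that browsers and
# intermediate proxies do not store the image.
NO_CACHE_HEADERS = {
    "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
    "Pragma": "no-cache",
    "Expires": "0",
}


def pixel_url(base_url, page):
    """Build an image URL that differs on every call.

    The random nonce changes the URL each time, so a cache that keys on
    the full URL never has a stored copy to serve.
    """
    params = {"page": page, "nonce": secrets.token_hex(8)}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
first = pixel_url("https://stats.example.com/pixel.gif", "/pricing")
second = pixel_url("https://stats.example.com/pixel.gif", "/pricing")
print(first)
print(second)
```

Even with both measures in place, a proxy that ignores the headers or a visitor who blocks images still goes uncounted.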
To Sum Things Up
No system is perfectly accurate. At Woopra, however, our dedicated servers in Tier 1 data centers with premium backbone connectivity do all of the heavy lifting for our clients. We keep the servers loaded well within their limits, so they respond rapidly when collecting visitor data (meaning fewer dropped visits and faster page loading), and we keep the load off your server.
Finally, unlike other service methodologies, Woopra continually receives a ping from visitors on your site (because Woopra is a LIVE and real-time service provider), enabling us to be 100% certain that visitors are actually still on your site. We believe that all of this adds up to the most accurate user reporting in the business.
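To show why a continual ping helps, here is a minimal sketch of how a ping-based "who is still here" check could work. This is an assumption about the general technique, not a description of Woopra's actual implementation; the 30-second timeout, the function name, and the visitor IDs are hypothetical.

```python
def live_visitors(last_ping, now, timeout=30.0):
    """Return the visitors whose most recent ping is recent enough.

    last_ping maps a visitor ID to the time (in seconds) of that
    visitor's latest ping. Anyone silent for longer than `timeout`
    is assumed to have left the site.
    """
    return {visitor for visitor, seen in last_ping.items()
            if now - seen <= timeout}


# Made-up example: alice pinged 10 seconds ago, bob 50 seconds ago.
pings = {"alice": 100.0, "bob": 60.0}
print(live_visitors(pings, now=110.0))
```

A system without pings only sees page requests, so it has to guess how long each visitor stays after the last page load.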