
comScore Behavioral Study Shows Cookies Overcount Web Visitors

When I worked at Media Metrix in the late nineties, and later at comScore following its acquisition of Media Metrix, we ran numerous campaigns to educate clients and the broader Internet advertising industry on why panel-measurement data and server-side data differed so much. Without going into gory details, there were a host of reasons, such as: tracking unique users versus unique browsers; the impact of ISP or browser caching; and a focus on specific market coverage versus any visit from anywhere. Of course, a major factor was the deletion of cookies, those small pieces of data that Web sites store in your browser when you visit them, or which ad networks set when serving you ads.

Today, comScore shared an important analysis, based on real behavioral data rather than self-reported data, examining the validity of using cookie-based data to measure the number of unique visitors to individual Web sites and to gauge the number of unique users that were served an ad by an ad server. The study, based on an analysis of 400,000 home PCs included in comScore’s U.S. sample during December 2006, examined both first-party and third-party cookies. Key findings:

comScore observed that 31 percent of U.S. Internet users cleared their first-party cookies during the month. Within this user segment, the study found an average of 4.7 different cookies for the site examined. Among the 7 percent of computers with at least 4 cookie resets, comScore counted an average of 12.5 distinct first-party cookies per computer, accounting for 35 percent of all cookies observed in the analysis.

Using the total comScore sample as a basis, an average of 2.5 distinct first-party cookies was observed per computer for the site being examined. This indicates that Web site server logs that count unique cookies to measure unique visitors are likely to be overstating the size of the site’s audience by a factor as high as 2.5, that is, by as much as 150 percent.
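
The arithmetic behind these figures is easy to check. The short Python sketch below is only an illustration: the three constants are the numbers quoted above, while the function names and the one-million-cookie server log are hypothetical and do not come from the comScore study. It also assumes, as a simplification, that the 2.5 average can be applied as a flat correction factor.

```python
# Figures quoted from the comScore study (first-party cookies, December 2006).
AVG_COOKIES_PER_COMPUTER = 2.5     # average across the total sample
HEAVY_RESETTER_SHARE = 0.07        # computers with at least 4 cookie resets
HEAVY_RESETTER_AVG_COOKIES = 12.5  # average cookies on those computers


def overstatement_pct(cookies_per_computer: float) -> float:
    """Percent by which a unique-cookie count exceeds the computer count."""
    return (cookies_per_computer - 1.0) * 100.0


def share_of_all_cookies(segment_share: float, segment_avg: float,
                         overall_avg: float) -> float:
    """Fraction of all observed cookies that one segment of computers accounts for."""
    return segment_share * segment_avg / overall_avg


def estimated_computers(unique_cookies: int, cookies_per_computer: float) -> float:
    """Rough computer count implied by a unique-cookie count (flat-factor assumption)."""
    return unique_cookies / cookies_per_computer


if __name__ == "__main__":
    pct = overstatement_pct(AVG_COOKIES_PER_COMPUTER)
    share = share_of_all_cookies(HEAVY_RESETTER_SHARE,
                                 HEAVY_RESETTER_AVG_COOKIES,
                                 AVG_COOKIES_PER_COMPUTER)
    # Hypothetical server log reporting 1,000,000 unique cookies.
    computers = estimated_computers(1_000_000, AVG_COOKIES_PER_COMPUTER)
    print(f"Overstatement: {pct:.0f} percent")
    print(f"Heavy resetters' share of cookies: {share:.0%}")
    print(f"Computers behind 1,000,000 cookies: {computers:,.0f}")
```

Note that the 7 percent of computers with frequent resets, at 12.5 cookies each, contribute 0.875 cookies per computer to the overall average of 2.5, which is where the 35 percent share comes from. The correction is per computer, not per person, and the 2.5 figure applies to the one site comScore examined, so other sites may see a different factor.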

The news release is here. As Fred Wilson correctly noted: “It’s true that panel data is generally a lot lower than your own server logs. But that doesn’t mean your server logs are right.”