What is the definition of “your data”? The answer may determine the future of the Internet – and, more broadly, of communications media, the users that derive value from them, and the marketers that depend on them.
The combination of the word “data” or “information” with a personal possessive pronoun lies at the heart of the current debate over interactive advertising and privacy. In the Monday New York Times story “Web Privacy on the Radar in Congress,” reporter Stephanie Clifford wrote that a subject of her piece knows that companies “are collecting his data.” The Center for Democracy and Technology, the prominent Washington-based proponent of a Federally mandated “do not track list” against interactive advertising, told the Los Angeles Times recently that Americans are “uncomfortable” with “the collection of their data.” The Federal Trade Commission, in proposing principles to control “behavioral advertising,” recommends that “consumers can choose whether or not to have their information collected for such purpose.” Democratic Congressman Edward J. Markey of Massachusetts said yesterday that he expects to introduce legislation during the coming year that “includes a set of legal guarantees that consumers have with respect to their information.”
All well and good, you might say: My identity must be protected from thieves and exploiters. But guess what? The plans that these activists and their enablers are promoting have nothing to do with identity protection. To the contrary, they are agitating – some, perhaps, unwittingly – for a new property right, unique in U.S. law, that would give consumers personal ownership of all information that derives from their activities, no matter how anonymous, non-identifying, aggregated, or otherwise impersonal it may be. They are further proposing that the Government, as the codifier and protector of such rights, use this definition of “behavioral data” to assert Federal control over most Internet operations. The effect could be to cripple the architecture of the World Wide Web.
Oh, Behave
Although
this effort to socialize the Web is taking place in plain sight, it
involves no small degree of artifice. Rarely if ever, for example, are
such phrases as “his data” or “their information” explicated – leaving
readers to believe that sensitive personal records are being
compromised. But on occasion, the activists slip in telling ways. Jeff Chester, the proprietor of the extremist Center for Digital Democracy
and a frequent witness in regulatory hearings in Washington, has made
clear his belief that no distinction exists between identifying data
and impersonal data: If it can be used in any way for marketing
purposes, it belongs to the individual, and Government should restrict
its application. As he wrote on his blog last April, interactive
publishers
…
know that in today’s digital marketing era, the very tiny bits of
personal behavior they have identified are parts of individual human
identity. Our ‘virtual’ identities may be composed of discrete and
disassembled bits of information about ourselves: what we like to read,
watch, buy; our problems and concerns (such as health or our children’s
education) or our political interests, but they are very much living
aspects of ourselves. The goal of interactive marketing is to collect,
analyze, and use such information to serve the interests of those
paying for the targeting. The technique uses one, two or multiple
individual data points in a variety of ways (search ads, broadband
videos, virtual worlds) to get individual consumers to behave or act in
ways that favor or reflect the marketer’s goals.
State legislatures – the stalking horses for the Washington lobbyists and legislators looking to constrain marketing and media – have followed the lead of Mr. Chester and his “regulate-the-Web” comrades. A New York State bill
that was written to restrict what it termed “online preference
marketing” actually promises explicitly to extend Government control
over virtually all consumer research that has a Web component. The bill
(sponsored, mystifyingly, by an Assemblyman from Westchester County,
home to such consumer marketing and media giants as PepsiCo, the
Reader’s Digest Association, IBM, and Starwood Hotels & Resorts),
defines “online preference marketing” as “a process used by entities
whereby data is typically collected over time and across Web pages to
determine or predict consumer characteristics or preference for use in
ad delivery, including the use of non-personally identifiable information.”
These
definitions and metaphysical disquisitions help us understand how
breathtakingly and unprecedentedly broad the supposedly protective
proposals to restrict “behavioral targeting” actually are. They
unambiguously define “behavior” as any and all consumption activity, no
matter how distanced it is from one’s personal identity. Equally
plainly, they say that any “data” or “information” that derives from
such behavior would fall under their proposed regulatory scheme, even if it cannot compromise an individual’s identity, let alone cause him or her any harm.
Calling All Clients
Let’s be clear what’s at risk here: the Internet, and any communications activity that depends upon it. Why? Because all Internet activity throws off such non-identifying “behavioral” data all the time.
Indeed, behavioral data is the center of the client/server call process
that’s the essence of the Internet’s architecture, which delivers
content based on information generated by user activity. As IAB Vice
President for Industry Services Jeremy Fain, one of the interactive
media industry’s top operations experts, puts it: “A client calling a
server asking for content, and the server sending it back, is the
fundamental underpinning of the Internet.”
Put
this vital piece of the Web’s infrastructure under Government control,
as the activists suggest, and the ad-supported innovation that has
driven this communications revolution would be impaired. As Fain
explains, “Within the client request there are many pieces of
information, including a cookie, an IP address, and a user agent
string. Cookies could be stripped out of that process, but the Web
experience would change drastically. Cookie IDs are essential to user
experience; as Wikipedia nicely observes, cookies give a state, a ‘memory of previous events,’ to otherwise stateless HTTP transactions.”
Without cookies, each new page view would be an isolated event. The Web's relevancy engine would
disappear. A news site wouldn’t be able to give you recommendations for
articles you might want to read based on earlier things you’d read.
Click analysis would be impossible, so retailers and brands would not
be able to understand how their customers are using their sites. There
could be no logged-in state beyond a single session, making it
necessary for a user to log in to every site each and every time he or
she visited. For retrieving email this wouldn’t be a big change, but
for any news or entertainment sites that require a registration, any
blog, any social media site, this would change the experience
dramatically.
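The "memory" that a cookie gives to otherwise stateless HTTP can be illustrated with a short sketch. This is a minimal illustration in Python, not any real site's implementation: the `handle_request` function, the `visitor_id` cookie name, and the in-memory `history` store are all hypothetical, and the ID it assigns is a random string with no name or address attached.

```python
from http.cookies import SimpleCookie
import uuid

# Hypothetical in-memory store: anonymous cookie ID -> pages already viewed.
history = {}

def handle_request(path, cookie_header=""):
    """Handle one page view.

    Returns (set_cookie, seen): the cookie string to send back if the visitor
    is new (otherwise None), and the pages this visitor viewed earlier.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "visitor_id" in cookie:
        # Returning visitor: the cookie links this request to earlier ones.
        visitor_id = cookie["visitor_id"].value
        set_cookie = None
    else:
        # No cookie: the server cannot tell this request from a first visit.
        visitor_id = uuid.uuid4().hex
        fresh = SimpleCookie()
        fresh["visitor_id"] = visitor_id
        set_cookie = fresh["visitor_id"].OutputString()
    seen = list(history.get(visitor_id, []))
    history.setdefault(visitor_id, []).append(path)
    return set_cookie, seen
```

Calling `handle_request("/news/a")` and then passing the returned cookie string along with `handle_request("/news/b", ...)` lets the second call see the first page in `seen`. Calling it again with no cookie returns an empty list: the page view becomes an isolated event, and a recommendation feature would have nothing to work from.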
“IP
addresses can't be stripped out,” Fain continues. “They are fundamental
to the delivery system – both client and server must know where to
send the information. User agents – which are basically the
identification of the browser type – should not be stripped out,
either. Besides being fundamental to conducting any business online –
they are the best way to distinguish human activity from
machine-generated activity, and accurately count how many times content
was delivered to real people – user agents are essential to delivering
better experiences. At first glance, user agents may seem tangential to
the consumer data and information discussion, but decisions are made by
Web sites based on a consumer’s user agent strings all the time. A
person using the Safari browser will regularly see something different
from someone using Internet Explorer.” (An excellent example of the
importance of user agents is the mobile Web: Page views will be
optimized for the smaller screen based simply on the server’s ability
to know that the consumer is using a mobile-device browser.)
“This
is behavioral information,” Fain says, “but if companies cannot collect
or store it, they cannot make business decisions on how to optimize
their sites for their viewers.”
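To make the pieces of a client request concrete, here is a rough sketch in Python of how a server might read the cookie and user agent out of a request and pick a page layout. The function names, the keyword lists, and the three layout labels are assumptions made for illustration; real sites use far more thorough user-agent detection. The IP address does not appear here because it comes from the network connection itself rather than from a request header.

```python
def parse_request_headers(raw_request):
    """Return the headers of a raw HTTP request as a dict with lowercase names."""
    headers = {}
    # The first line is the request line ("GET /path HTTP/1.1"); skip it.
    for line in raw_request.split("\r\n")[1:]:
        if not line:
            break  # a blank line marks the end of the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def choose_layout(user_agent):
    """Crude, illustrative classification based only on keywords in the string."""
    ua = user_agent.lower()
    if any(token in ua for token in ("bot", "crawler", "spider")):
        return "machine"   # likely automated traffic, not a person
    if any(token in ua for token in ("mobile", "iphone", "android")):
        return "mobile"    # serve the small-screen version of the page
    return "desktop"
```

Given a request carrying `User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)`, `choose_layout` returns `"desktop"`; a string containing `iPhone` or `Mobile` returns `"mobile"`. Nothing in either string names the person at the keyboard.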
In
other words, the Internet runs on behavioral data. When a user launches
her browser, behavioral data is generated that gets her to her
designated home page. When a user clicks on a del.icio.us bookmark,
behavioral data is generated that whisks him to the site. When users
click on an article’s “go to next page” button, behavioral data is
generated that positions them on the next page – in pretty much the
same way a click on a “skip this ad” button will assure they don’t get
the advertisement they don’t want to see. Under no normal circumstances
is this behavioral data connected to an individual’s name, address,
Social Security number, or other information we would conventionally
associate with personal identity.
Government Control
Sadly,
this doesn’t seem to matter to the activists, because under the rules
they are pushing in Congress, this impersonal string of otherwise
meaningless symbols would still be classified as “their data,” and
subject to Government regulation. And with that change, control of
media and commerce would pass from the private sector to the Feds.
Think
I’m being overly inflammatory? In addition to the obvious damage to
interactive content customization and relevance, consider what else
would be placed at risk if this de facto, all-encompassing definition of “behavioral data” were to become de jure:
Bar-code scanners used at checkout counters.
With ownership of impersonal consumption data legally enshrined as
consumer property, this crucial component in retail supply-chain
management could become unusable – at least if the data they collect is transmitted over the Web. Internet-based supply-chain management systems employing RFID tags could similarly be compromised.
Lists of “most e-mailed stories” in newspapers and magazines. These popular features
– and vital editorial management tools – could become illegal under the
proposals floating around Washington and the states, for they depend on
aggregated behavioral data.
Search-engine competition.
Kiss goodbye to any efforts by competitors to challenge Google.
Whether small fry like Cuil or giants like Microsoft, their ability to
take data to optimize their own processes or experiment with new
algorithms would be gone. So would the search-engine optimization and
search-engine marketing industries.
Social science research.
Academics interested in observing, say, the effect of health
communications on Americans’ behavior would be restricted from
utilizing the anonymous data generated by the billions of interactions
daily between Web users and content. Even the American Psychological Association,
which recommends “informed consent” as a standard in most research,
recognizes that some forms of research, including anonymous
questionnaires and “naturalistic observations” in cyberspace, don’t
necessarily require it. Some legislators and activists, though, want
their judgment to supersede the scientists’.
Journalism and commentary.
If people own “their data,” publishing observations of their activities
online – how many times a video was watched, how many members of a
social network enjoy Cream of Wheat for breakfast, what people are
saying about that new Cameron Diaz movie – would fall into a legally
murky area. Remember, California has for 20 years accorded people “personality rights”
that prevent the unsanctioned use of anyone’s “name, voice, signature,
photograph or likeness on or in products, merchandise or goods.”
Extending this right to “their data” is basically what the
anti-Internet proposals envision. In a profound, First
Amendment-grounded critique of the FTC's proposals against behavioral
marketing, the Newspaper Association of America
wrote, "The fully protected rights of news publishers are at stake. A
limitation on behavioral targeting would directly affect the selection
of content that is presented to readers."
Branded-media and small-publisher growth.
Many major media companies are hooking their futures on the opportunity
to gain more reach by constructing large networks of affiliated sites,
whose content and demographic affinities would be abetted by
network-based ad delivery. Ban the use of anonymous behavioral data,
and these enterprises come tumbling down. So does network-based
advertising support for small publishers, which underpins the economics
of tens of thousands of sites.
Real Crimes
It’s ironic that I’m writing this only a week after the U.S. Government broke up what The New York Times described as a global, criminal “cyber-ring”
that “plundered the credit card numbers of millions of Americans.” Such
threats – to family information, financial records, health data – are
real. In fact, exposure of such crimes pretty much requires the
retrieval and storage of user agent strings and other behavioral data
by e-commerce providers and other sites. But instead of zeroing in on
real crimes and real harm, aggressive legislators, regulators, and
their champions seem hell-bent on grouping under the same regulatory
regime sensitive identifying data and the kind of impersonal behavioral
data necessary to run the Web.
I – and my colleagues at the IAB – have been sounding these alarms for more than a year. I’ve testified before the Federal Trade Commission
and the House Small Business Committee. Yet the call for regulation
grows bewilderingly louder, from elected officials who have specified
no harm and conducted little research. Even the militant Center for
Democracy and Technology, which has declared that “concerns about behavioral advertising practices are widespread,” recorded
zero consumer complaints filed with states’ attorneys general in
2006-2007 over privacy violations involving behavioral targeting. Zero!
Indeed, in its just-released 37-page report Online Consumers at Risk and the Role of State Attorneys General,
which documents thousands of cases of Internet-related sales fraud,
spyware, phishing, data security breaches, and child solicitation, the
word privacy comes up only once – in a Texas case filed against two Web
sites that allegedly failed to protect “the privacy and safety of
minors.”
Such violations are already covered by existing law (in this case, the federal Children’s Online Privacy Protection Act,
or COPPA), but the CDT, asserting that “behavioral advertising poses a
growing risk to consumer privacy,” wants “a new general privacy law
backed up by regulatory enforcement.”
It’s
time for CMOs, media company CEOs, technology entrepreneurs,
free-press advocates, independent Web publishers, retailers, e-tailers
and others who depend on robust Internet communications and a thriving
free media to stand up and let the world know where these
recommendations are explicitly heading: Toward a Government takeover of
the Internet, and a silencing of the diverse voices that make up the
Web.