The Volokh Conspiracy
Mostly law professors | Sometimes contrarian | Often libertarian | Always independent
In our second post laying out the arguments in our new paper, The Second Digital Disruption: Data, Algorithms & Authorship in the 21st Century, we focus on the rise of Netflix — one of the streaming platforms that has employed user data to shape the new content it produces. As we shall see, Netflix has been incredibly successful at the new enterprise of "data-driven authorship." But there is a video streaming platform, unknown to most people, that arguably is even more successful than Netflix in this new game. More on that in our next post. For now, let's focus on Netflix.
Netflix, once a small firm that sent out DVDs in red envelopes, is now one of the largest studios in Hollywood. And it is the poster child for the rise of streaming video content.
Netflix has grown enormously thanks to a revolutionary (for mainstream media) marriage of streaming and data. While the exact extent of Netflix's reliance on data versus old-fashioned intuition is a matter of intense speculation, it is clear that data is the secret sauce that has allowed Netflix to grow incredibly powerful in a matter of just a few years.
Streaming television and film has evolved in about a decade from a fringe activity to the mainstream. According to a recent Deloitte survey, 55% of American households now subscribe to at least one streaming service, an increase of 450% over 2009. The average subscriber pays for three services; the average millennial, four.
Among streaming services, Netflix is the largest and the one that spends the most on original content, with some $12 billion projected for 2018. But Amazon ($4.6 billion), Hulu ($2.5 billion), and Apple ($1 billion and rising fast) are players as well. Apple and Amazon in particular have the means to outspend all their competitors should they choose to.
In the years before it began creating content, Netflix struggled in the grip of Hollywood studios and networks. They demanded high licensing fees for their content, constraining Netflix's profitability. Beginning in 2011, as Netflix moved from physical distribution to streaming, it set out to solve that problem by producing its own content.
At first, the prospects that Netflix would emerge as a major producer seemed dim: Netflix was a technology firm with no track record in producing programming. But just seven years later, Netflix has emerged as a programming giant.
How did Netflix succeed so quickly in the programming business? A key part of its formula has been intensive collection and analysis of its users' viewing data. The details of Netflix's data mining began to be widely reported following the runaway success of the Netflix-produced House of Cards, which was the most streamed piece of content in the U.S. in 2013.
After extensive analysis of viewing patterns, Netflix concluded that fans of the original U.K. series on which House of Cards was based would also be deeply interested in content that either starred Kevin Spacey or was directed by David Fincher. With the data drawing out three overlapping circles of viewer interest, Netflix determined that the intersection of the Venn diagram (a House of Cards reboot starring Kevin Spacey and directed by David Fincher) would be a good bet.
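The Venn-diagram reasoning can be pictured with ordinary set operations. What follows is only a toy sketch, not Netflix's actual method: the viewer IDs, the three audience sets, and the helper name audience_overlap are all hypothetical and chosen purely for illustration.

```python
# Hypothetical viewer IDs; none of this data comes from Netflix.
uk_series_fans = {"u1", "u2", "u3", "u4", "u5"}   # watched the original U.K. series
spacey_fans = {"u2", "u3", "u4", "u6"}            # watch films starring Kevin Spacey
fincher_fans = {"u3", "u4", "u5", "u7"}           # watch films directed by David Fincher


def audience_overlap(*groups):
    """Return the viewers who belong to every one of the given groups."""
    return set.intersection(*groups)


# The center of the Venn diagram: viewers in all three circles.
core_audience = audience_overlap(uk_series_fans, spacey_fans, fincher_fans)
print(sorted(core_audience))
```

In this made-up sample, the viewers in all three circles are the ones a project combining the three elements would most plausibly attract. A real analysis would presumably weigh the size of each overlap as well as its membership.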
The company was so confident in the series' prospects that it committed $100 million for two 13-episode seasons, a decision considered both rash and profligate by many insiders at the time.
In a 2013 NY Times interview, Jonathan Friedland, then Netflix's Chief Communications Officer, detailed the extent of Netflix's data strategy. Friedland said that the service examines more than 30 million plays per day (a number that has surely gone up dramatically) and uses data from these user interactions both to gauge consumer interest in new programming and to determine how best to categorize and present existing programming.
At about the same time, Friedland told Wired that "[w]e're able with a high degree of confidence to understand how big a likely audience is for a given show based on people's viewing habits." Indeed, Netflix has boasted that about 75% of user viewing is attributable to its recommendation algorithm, which is itself continually reconstructed with what the company learns from its analysis of user interaction data.
Netflix builds out millions of user profiles with metadata based on what it calls "user actions." Some parameters that Netflix tracks include pause/rewind/fast-forward behavior; day of the week; date of viewing; time of viewing; zip code; preferred devices; completion rate; user ratings; user search behavior; and browsing and scrolling behavior. A 2013 article describes the process in more detail:
"I hit the pause button roughly one-third of the way through the first episode of "House of Cards," the political drama premiering on Netflix Feb. 1. By doing so, I created what is known in the world of Big Data as an "event"—a discrete action that could be logged, recorded and analyzed. Every single day, Netflix … registers hundreds of millions of such events. As a consequence, the company knows more about our viewing habits than many of us realize. Netflix doesn't know merely what we're watching, but when, where and with what kind of device we're watching. It keeps a record of every time we pause the action—or rewind, or fast-forward—and how many of us abandon a show entirely after watching for a few minutes."
Netflix also uses an extensive categorization system. In 2014, The Atlantic scraped every tag in Netflix's system, and found 76,897 descriptors for movies, ranging from "Romantic Indian Crime Dramas" to "Post-Apocalyptic Comedies About Friendship."
In line with this, a major part of Netflix's content strategy is its use of categories to organize consumer preferences. Netflix sorts its huge subscriber base into "taste clusters." There are over 2,000 clusters, and unlike traditional advertising demographics (e.g., "single white women 18-24"), they are organized not by ascriptive characteristics but by viewing habits.
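To see how grouping by viewing habits differs from grouping by demographics, consider a deliberately simple sketch. It is not Netflix's algorithm, which has not been published in detail; it assumes a basic overlap measure (Jaccard similarity) and a greedy assignment rule, and the function names, threshold, and viewing histories are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of watched titles, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def greedy_clusters(histories, threshold=0.5):
    """Put each user in the first cluster whose seed user watched similar titles."""
    clusters = []  # list of (seed_titles, [user_ids])
    for user, titles in histories.items():
        for seed_titles, members in clusters:
            if jaccard(titles, seed_titles) >= threshold:
                members.append(user)
                break
        else:
            # No existing cluster is similar enough, so this user seeds a new one.
            clusters.append((titles, [user]))
    return [members for _, members in clusters]


# Made-up viewing histories; note that no age, sex, or location appears anywhere.
histories = {
    "u1": {"House of Cards", "The West Wing", "Veep"},
    "u2": {"House of Cards", "The West Wing"},
    "u3": {"Planet Earth", "Blue Planet"},
    "u4": {"Planet Earth", "Blue Planet", "Cosmos"},
}
clusters = greedy_clusters(histories)
print(clusters)
```

The point of the sketch is that cluster membership depends only on what people watch, so two subscribers with nothing in common demographically can land in the same group.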
Taste clusters shape creative choices: Netflix designs its programming to appeal to particular clusters. But they also shape an important element of what a subscriber sees on her screen, namely the small square tiles that suggest what she might like, which Netflix calls "row art." And they determine not only which options are presented but also how they are presented, i.e., which of the several row art options for each offering comes up on her screen.
Netflix's use of data in these ways is not unique: similar things can be said about other streaming services. In our next post, we will turn to the king of streaming data, a firm most of us have never heard of: Mindgeek.