It’s no secret that Netflix home pages are tailored according to past viewing patterns. What you see depends on what you’ve watched, which is why your interface might look a lot different from the one your spouse or neighbor sees when they log on. Perhaps not as widely known: Some of the categories and content rows that might seem objective are actually just as personalized. “Popular on Netflix” and “Trending,” for example, sure sound like Nielsen-style rankings of what’s hot in the U.S. or the rest of the world. Turns out they’re not.
Mariam Braimah, lead product designer for the Netflix TV app, says such categories are “actually personalized content that also happens to be popular.” In other words, Netflix figures out the shows you’re most likely to enjoy, and then tells you which of those titles are currently getting a bunch of streams. If you’ve watched a lot of true-crime shows, then there’s a good chance The Vanishing at the Cecil Hotel was “trending” for you when it came out earlier this year. But if you’re super into comedies, Schitt’s Creek and New Girl are going to be popping up in that row a lot. The streamer isn’t alone in using a fuzzy definition of popularity: Twitter has long customized its trending topics in part based on who users follow on the service.
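For readers who think in code, here is a minimal sketch of the idea Braimah describes: filter down to titles a member is predicted to like, then order what is left by raw popularity. Everything in it is an assumption made for illustration, including the function names, the affinity scores, the 0.5 cutoff, and the stream counts. Netflix has not published how its rows are actually built. A non-personalized ranking is included for contrast.

```python
from typing import Dict, List


def personalized_popular(
    predicted_affinity: Dict[str, float],
    stream_counts: Dict[str, int],
    affinity_threshold: float = 0.5,  # hypothetical cutoff, not a Netflix value
    row_size: int = 3,
) -> List[str]:
    """Keep titles the member is predicted to like, then sort by streams."""
    candidates = [
        title
        for title, score in predicted_affinity.items()
        if score >= affinity_threshold
    ]
    candidates.sort(key=lambda title: stream_counts.get(title, 0), reverse=True)
    return candidates[:row_size]


def objective_popular(stream_counts: Dict[str, int], row_size: int = 3) -> List[str]:
    """Same ranking for every member: most-streamed titles first."""
    return sorted(stream_counts, key=stream_counts.get, reverse=True)[:row_size]


# Invented numbers, for illustration only.
stream_counts = {
    "The Queen's Gambit": 1000,
    "Cobra Kai": 900,
    "The Vanishing at the Cecil Hotel": 700,
    "Schitt's Creek": 650,
    "New Girl": 600,
}
true_crime_fan = {
    "The Vanishing at the Cecil Hotel": 0.9,
    "The Queen's Gambit": 0.6,
    "Cobra Kai": 0.4,
    "New Girl": 0.3,
    "Schitt's Creek": 0.2,
}
comedy_fan = {
    "Schitt's Creek": 0.9,
    "New Girl": 0.8,
    "The Queen's Gambit": 0.6,
    "Cobra Kai": 0.5,
    "The Vanishing at the Cecil Hotel": 0.1,
}

true_crime_row = personalized_popular(true_crime_fan, stream_counts)
comedy_row = personalized_popular(comedy_fan, stream_counts)
everyone_row = objective_popular(stream_counts)
```

With these made-up inputs, the two members get different “popular” rows even though both rows are sorted by the same stream counts, while `everyone_row` is identical for all members.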
But lately, as part of the same push to improve discovery that led to the just-launched Play Something shuffle feature, Netflix has been exploring whether giving members truly objective data about what’s popular on the service might be another effective way to drive increased viewing. Take those lists of the top-ten movies and series that began showing up in Netflix’s Browse section in early 2020: They generated a ton of publicity and buzz, since they marked the first time the streamer had publicly and regularly revealed how its titles were performing. But the top tens weren’t about Netflix suddenly being more transparent, or even just trying to drum up some extra PR.
Instead, the lists were the result of the company realizing that the joiners among us are sometimes more inclined to check out a show or movie if they know a lot of other folks are also into it. “Some people, what they really want to see is, what is everyone watching?” explains Todd Yellin, Netflix’s vice-president of product. “They want to be in the conversation: ‘Everyone’s talking about Queen’s Gambit. I see it’s No. 1 on Netflix. Damn, I want in on that action.’ So now, to complement the personalization [of the algorithm], we also highlight popularity.”
Like almost all big changes to the Netflix user interface, the decision to supplement the company’s prized recommendation engine did not happen overnight. The streamer began quietly testing the value of objective, non-algorithmic discovery tools several years ago. Back in 2018, for example, it ran an experiment to see what would happen if it un-personalized the aforementioned “Popular on Netflix.” Rather than seeing titles tailored to their tastes, users in a handful of markets were quietly fed rows of shows that actually were the most streamed titles in a given market. “Everybody would be seeing the exact same order,” Braimah says. Netflix also changed the title of the row so that it included a member’s country name — “Popular in South Korea” or “Popular in the United States.” It was a very subtle tweak, and at first, the response from subscribers was similarly muted. “We didn’t hit out of the park,” she says.
While it wasn’t a home run, Braimah tells me this early test sparked just enough “excitement” among members to warrant staying on the path toward non-personalized data. She and the product team regrouped, deciding the design for presenting objective data needed to be a bit bolder. In 2019, they ran a test in the United Kingdom, as well as one other smaller market, in which a ranked list of the most popular content replaced what had been a more ambiguous listing. This was basically the top-ten lists Netflix has now, except that in this early iteration, the rankings were tabulated only once a week. This turned out to be a problem. “We noticed when we were refreshing it weekly, you wouldn’t see titles moving from day-to-day,” Braimah says. As a result, users didn’t have a reason to engage with the feature that often. So Netflix modified the experiment yet again, this time updating the top ten every 24 hours and reworking the graphic design. The results were better: Users were discovering a wider variety of shows and streaming longer. Testing continued for a few more months, and by February 2020, Netflix had seen enough. Top-ten lists rolled out around the world.
How real is it?
Given how famously opaque streamers — including Netflix — have been about releasing performance data, the introduction of the top-ten lists has not been greeted with universal acclaim. For one thing, Netflix admits its rankings aren’t based on the average audience size of a title, the way Nielsen measures linear TV consumption. Instead, the streamer uses what it calls a “chose to watch” standard, which tallies how many people sample at least two minutes of a given title. Netflix opted for this metric because it evens out disparities in program length and episode count. “We had to level the playing field,” Yellin explains.
There’s logic in that: Nielsen’s streaming ratings are flawed in part because they measure millions of minutes consumed, giving an hour-long show with hundreds of episodes (Grey’s Anatomy) a massive advantage over a newer title with just a fraction of the creative output (such as Special, whose first season was barely two hours long in total). Still, there are other, more detailed metrics Netflix has that would offer a more accurate picture of actual content consumption and popularity. In 2018, execs told me they cared deeply about how many people finish a full season of a show within 28 days of release, for example. But the company doesn’t make that measurement available because, well, there’s really no upside to being that specific.
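To make the difference between the two yardsticks concrete, here is a small sketch that scores the same viewing log both ways: total minutes consumed, and a “chose to watch” count of accounts that sampled at least two minutes. The log format, the function names, and every number are invented for illustration. The two-minute threshold comes from Netflix’s description, but treating it as a single session of 120 seconds or more, counted once per account, is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

CHOSE_TO_WATCH_SECONDS = 120  # the two-minute threshold Netflix describes

# Each record is (account_id, title, seconds_watched_in_one_session).
View = Tuple[str, str, int]


def minutes_watched(views: List[View]) -> Dict[str, float]:
    """Minutes-consumed style: sum all viewing time per title."""
    totals: Dict[str, float] = defaultdict(float)
    for _account, title, seconds in views:
        totals[title] += seconds / 60
    return dict(totals)


def chose_to_watch(views: List[View]) -> Dict[str, int]:
    """Count distinct accounts with at least one session over the threshold."""
    accounts = defaultdict(set)
    for account, title, seconds in views:
        if seconds >= CHOSE_TO_WATCH_SECONDS:
            accounts[title].add(account)
    return {title: len(ids) for title, ids in accounts.items()}


# Invented log: one account binges three long episodes, while three
# accounts sample a short show (one of them for under two minutes).
views = [
    ("a1", "Grey's Anatomy", 2520),
    ("a1", "Grey's Anatomy", 2520),
    ("a1", "Grey's Anatomy", 2520),
    ("a2", "Special", 900),
    ("a3", "Special", 900),
    ("a4", "Special", 60),
]

by_minutes = minutes_watched(views)
by_choice = chose_to_watch(views)
```

In this toy log the long-running show wins on minutes (126 to 31) while the short one wins on accounts that chose to watch (2 to 1), which is the disparity Yellin says the metric is meant to even out.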
The downside of such fuzziness, however, is that it raises questions about whether the top-ten lists are really a measure of popularity or just a marketing tool. Rightly or wrongly, a subsection of Netflix subscribers loathes the company’s reliance on an algorithm-focused user interface. Some are convinced that the company’s sophisticated computer programs too often hide content they’d actually enjoy. Producers have also been known to blame the failure of their Netflix shows on titles being lost in the algorithm. Could Netflix’s top ten just be a gimmick designed to get more eyeballs on the streamer’s most expensive, high-profile originals?
Yellin politely but firmly pushes back against such concerns. For one, he believes the Netflix algorithm is too often misunderstood and oversimplified. “It’s not like these machines have suddenly walked into our office and started figuring out what people are going to watch,” he tells me. “We have people looking at all our titles and fueling the selection by deep tagging — understanding what they are and what they stand for and the different attributes of it.” And Cameron Johnson, who oversees product innovation for the Netflix TV interface, says it wouldn’t make sense for the company to rig its algorithm to favor certain kinds of shows over others. “What we have learned is, continuing to show somebody a title that they’re not interested in doesn’t help anybody really,” he says.
As for the notion that the top-ten lists are somehow manipulated to help build buzz for important titles, “My whole job, and my team’s job, is to make a better service so people keep on subscribing, not because the world loves lists and top tens,” Yellin says. “If that was the only reason, we wouldn’t have done it. We did it because it makes it a little easier for a good number of people to find something great to watch.” Johnson is even more clear: “It’s a hundred percent objective,” he says of the top tens. “We just add up what shows got watched the most — and we do it by country — every 24 hours.”
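Johnson’s description amounts to a group-and-count: bucket qualifying views by country and day, tally each title, and keep the top of each list. The sketch below shows one way that could look. The record layout, the function name, the sample date, and the alphabetical tie-break are all assumptions, and the sample views are invented. It is an illustration of the arithmetic he describes, not Netflix’s pipeline.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple

# Each record is (country, day, title, account_id) for a qualifying view.
Record = Tuple[str, date, str, str]


def daily_top_titles(
    qualifying_views: Iterable[Record], n: int = 10
) -> Dict[Tuple[str, date], List[str]]:
    """Rank titles per (country, day) by number of distinct accounts."""
    viewers = defaultdict(set)
    for country, day, title, account in qualifying_views:
        viewers[(country, day, title)].add(account)

    by_market = defaultdict(list)
    for (country, day, title), accounts in viewers.items():
        by_market[(country, day)].append((title, len(accounts)))

    # Most viewers first; ties broken alphabetically (an arbitrary choice).
    return {
        market: [
            title
            for title, _count in sorted(rows, key=lambda r: (-r[1], r[0]))[:n]
        ]
        for market, rows in by_market.items()
    }


day = date(2020, 2, 24)  # hypothetical date
sample = [
    ("US", day, "Cobra Kai", "a1"),
    ("US", day, "Cobra Kai", "a2"),
    ("US", day, "The Queen's Gambit", "a1"),
    ("KR", day, "The Queen's Gambit", "a3"),
    ("KR", day, "The Queen's Gambit", "a3"),  # same account, counted once
    ("KR", day, "The Queen's Gambit", "a5"),
    ("KR", day, "Cobra Kai", "a4"),
]

charts = daily_top_titles(sample)
```

Rerunning the function on each new day’s records would give the 24-hour refresh; every member in a given country would see the same list.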
That’s also the take from the content side of Netflix. Bela Bajaria, the streamer’s global TV head, tells me title rankings have helped funnel more viewers to important titles. “What we’ve seen is the top ten can make a show more talkable,” she says. “It gives viewers a reason to discuss it, and sometimes that conversation does lead to more sampling and viewing.” And Hayden Schlossberg, one of the executive producers of Netflix’s Cobra Kai, says the top-ten lists are a “nice resource” for viewers to have when navigating Netflix — even if they also bring a touch of anxiety for creators. “You try not to worry about it, but you can’t help every morning to click on it and see,” says Schlossberg, whose show got a huge surge in awareness when it switched from YouTube to Netflix last year, and immediately landed at No. 1 on the Netflix top ten. “It kept going up and down and back up, so you’re on this roller coaster that is reminiscent of having a movie at the box office. But at the same time, I think from a consumer perspective, it does make sense for people to see what is popular right now.”
Of course, the buzz generated by the top-ten lists can cut both ways. While showrunners and directors who find comfort in Nielsen rankings and box-office reports now have a public-facing metric to validate their endeavors, those numbers can just as easily cause distress if something doesn’t land high in the charts, or falls off too quickly. “The top ten giveth and the top ten taketh away,” Bajaria quips. “I think because it’s one of the most prominent signals of success we’ve had, there’s sometimes a tendency to overly focus on it. But when we talk with creators, we explain that it’s only one factor among many that ladder up into the performance of their show.”
Up next: sound effects?
Netflix isn’t done experimenting with its use of objective data on the platform. During a conversation on Zoom, Braimah walked me through a slew of new iterations of top-ten lists she and her team have spitballed in recent months. For example, instead of simply showing you rankings based on where you live, Netflix might one day give users the option of scrolling through what’s hot in different regions of the world, or even the ability to get hyperlocal (say, the top ten in Queens versus Brooklyn). Braimah has even thrown out the idea of tying the lists to current events: During the World Cup, Netflix could show you what was popular in all of the countries participating in the tournament.
Johnson, meanwhile, tells me Netflix has already run tests to see what would happen if it expanded the three current types of top-ten lists — shows, movies, and all titles — to include smaller categories such as kids, stand-up specials, or documentaries. They’ve also explored giving some folks a top 20 or top 50, or rankings of the most thumbed-up titles. “We’re going to be doing a bunch of experimentation,” he says.
Netflix also wants to make this new, more objective data stand out more. Late last year, it replaced the “Latest” tab on the TV browser with something called “New and Popular,” where users can find listings of recently released titles and what’s coming soon on the service, as well as all of the various top-ten lists. Right now, the section doesn’t feel much different from other parts of Netflix, but the product team is working on a project to change that by blowing up the platform’s standard page-design template. Testing on such a revamped section could begin within the next month or two. “We want to make this feel like you’re in a different space on Netflix,” Braimah says.
She shows me early mockups of various ideas, including one which mirrors a vertical Billboard Hot 100 chart and another which looks like an old-fashioned day planner. The most radical features a cable-news-like ticker, with the popularity rankings of various Netflix titles scrolling across the screen like stock prices. New and Popular might also get its own sound cue, so that when you clicked into the tab, you’d hear a few seconds of music — something similar to ESPN’s SportsCenter riff or the NBC chimes. There might even be an opening animation to welcome you into the section, similar to the stylized “N” that plays at the start of all Netflix originals. “We’re trying to push those boundaries,” Braimah says. The Netflix interface “doesn’t need to be always static.”
She means that quite literally. A number of the concepts Braimah is working on are, in Netflix parlance, “video forward,” which means they’re loaded with clips and sizzle reels which begin playing as you scroll through a section. Netflix is treading carefully here: The company’s own research shows that while about half of its members enjoy a TV-like browsing experience, the other half would rather have their eyes gouged out than be subjected to any sort of autoplaying content. (That’s one reason Play Something is an opt-in experience.)
Figuring out an approach that avoids pissing off either camp has been one of the central challenges of this project, and it’s why some of these ideas won’t even make it to the testing phase — let alone your living room. Any revision of the Netflix interface has to clear a high bar, because, as multiple staffers keep reminding me, nobody wants to muck up a user experience that generally works so well. “That’s why we do the testing,” Johnson says. Netflix wants “to understand, okay, is this something that people are using a lot [and] getting a lot of value out of? It sort of earns its place in the product.”
And how exactly does Netflix know if users are digging a new feature such as the top tens or Play Something? Mostly, the same way it judges if its shows are working: hours spent streaming and user retention. “We see an increase in streaming, that means more people have watched more hours because of this feature,” Braimah says. “If we’re seeing more people stay members longer, that’s a proxy to understand if people are satisfied with Netflix.”
In the case of product testing, Netflix also drills down further. It looks to see whether a change makes it more likely users will connect to new content rather than, as Braimah puts it, “just scrolling through the home page.” No detail is too small for testing. Braimah shows me two early takes on the top ten with just one difference: how bold the numbers next to each title were. Spoiler alert: Bolder wasn’t better.