Well, Facebook beat me to it. For a year I’ve been collecting thoughts and notes under the title “A New Page Rank for the Era of Fake News.” What finally got me off the schneid was last week’s post by Mark Zuckerberg on changes to the News Feed algorithm.
But let’s back up a bit. My first note on this topic was a link to Horace Dediu talking about the PageRank algorithm on Episode 197 of The Critical Path. Criminally over-simplifying, the genius of PageRank[1] was ranking search results based more on incoming links from other pages than on a page’s content or metadata. To me at least, this idea has always felt like a natural extension of the citation system in scholarly writing, where the most cited papers tend to be seminal or canonical works.
This system has flaws, but it’s amazing that an even half-passable system of credence-lending emerged from the massively decentralized institution of writing, and even more amazing that it extended to the World Wide Web.
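For the curious, the link-counting idea can be sketched in a few lines of Python. This is the textbook power-iteration version, not Google's production algorithm; the `pagerank` function name, the 0.85 damping factor, the iteration count, and the toy link graph are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank sketch.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping each page to its score (scores sum to 1).
    """
    # Collect every page, including ones that only appear as link targets.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", and "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
scores = pagerank(toy_web)
top_page = max(scores, key=scores.get)
```

In the toy graph, `top_page` ends up being `"c"` because it collects the most incoming links, regardless of what the page itself says.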
The problem with this system now, as Horace goes on to articulate, is that it’s predicated on the wisdom of crowds. When the crowd is made up of people willing to read obscure[2] scholarly papers, the crowd ends up being fairly wise. But now everyone uses the internet, and as my friend and I like to joke: People are stupid.[3] For boring topics, like the birthplace and ancestral home of Winston Churchill, a naïve PageRank still gives accurate results, but fake news is an example of how it can catastrophically fail when the topic is both popular and politically divisive.
My first thoughts on how to combat the problem came after listening to an episode of StarTalk, where Neil deGrasse Tyson longed for a way to rank the integrity of information in a search, but my first actual idea came after reading this Seattle Times article on a UW professor’s research into filter bubbles. It was this part in particular:
Starbird argues in a new paper, set to be presented at a computational social-science conference in May, that these “strange clusters” of wild conspiracy talk, when mapped, point to an emerging alternative media ecosystem on the web of surprising power and reach.
Some topics toe the line between being popular enough to start a hashtag but obscure enough that the first search result isn’t Wikipedia. Fake news operates in that gray area where an organized group can overwhelm the rest of the internet and look, at least to a first-gen PageRank, like scholarly consensus.
But actual news (information with integrity, to paraphrase Neil) should span bubbles. Real news isn’t just popular within a cluster; it spans the gaps between clusters. Our new algorithm, one designed for an era of fake news, needs to exploit this trait, and it sounds like this is exactly what Facebook is trying:
Here’s how this will work. As part of our ongoing quality surveys, we will now ask people whether they’re familiar with a news source and, if so, whether they trust that source. The idea is that some news organizations are only trusted by their readers or watchers, and others are broadly trusted across society even by those who don’t follow them directly. (We eliminate from the sample those who aren’t familiar with a source, so the output is a ratio of those who trust the source to those who are familiar with it.)
The important part was clarified by the Head of News Feed right away:
The headline is misleading. We ask people what they trust, but don’t simply value more the publications that get positive replies. We specifically look for publications that are trusted by people with a wide range of reading habits, so trusted by many different types of people. (Adam Mosseri, @mosseri, January 20, 2018)
Facebook is looking for integrity to come from sources whose reach, influence, and reputation consistently span divisions. It’s a good idea, and I wish Facebook the best of luck in popping bubbles.
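The difference between "trusted by its own readers" and "trusted across divisions" can be sketched roughly in Python. This is not Facebook's implementation, which the quotes above don't describe in detail; the `group` labels, the field names, and the choice to score a source by its weakest group are all hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict


def trust_ratio(responses):
    """Share of respondents familiar with a source who also trust it.

    Respondents who are not familiar with the source are dropped,
    as described in the quoted post. Returns None if nobody is familiar.
    """
    familiar = [r for r in responses if r["familiar"]]
    if not familiar:
        return None
    return sum(1 for r in familiar if r["trusts"]) / len(familiar)


def cross_group_trust(responses):
    """Hypothetical breadth score: the lowest trust ratio across reader groups.

    Assumes each response carries a "group" label (for example, a cluster of
    reading habits). Using the minimum is an illustrative choice, not a
    documented one: a source scores well only if every group that knows it
    also tends to trust it.
    """
    by_group = defaultdict(list)
    for r in responses:
        by_group[r["group"]].append(r)
    ratios = [trust_ratio(rs) for rs in by_group.values()]
    known = [ratio for ratio in ratios if ratio is not None]
    return min(known) if known else None


def make_responses(group, familiar, trusting, unfamiliar=0):
    """Build made-up survey rows for one group (illustrative data only)."""
    rows = [{"group": group, "familiar": True, "trusts": i < trusting}
            for i in range(familiar)]
    rows += [{"group": group, "familiar": False, "trusts": False}
             for _ in range(unfamiliar)]
    return rows


# A source loved inside one bubble and distrusted outside it.
bubble_source = make_responses("A", 4, 4) + make_responses("B", 4, 0)
# A source moderately trusted by both groups.
broad_source = make_responses("A", 4, 3) + make_responses("B", 4, 3, unfamiliar=2)
```

With this made-up data, the plain ratio treats the bubble source as middling, while the breadth score drops it to the bottom and leaves the broadly trusted source where it was.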
[1] I started this note as a snarky comment wondering how many people think PageRank is named after Google co-founder Larry Page, rather than as a description of the algorithm’s function: a way to rank pages. In a single turns-out, at least according to the one true source of knowledge, it is named after Larry. This is why we research things, kids. Try never to be an asshole, but if you’re going to be, be damn sure you’re right.
[2] Is there another kind?
[3] Yes, including us. As an aphorism it’s a cynical take on Occam’s razor; a response to the rhetorical question: “Why would blank do that?”