How BookVibe used Twitter to predict the Man Booker Prize winner

Prior to 14th October, Richard Flanagan’s The Narrow Road to the Deep North (Chatto & Windus) was an outsider to win the 2014 Man Booker Prize. The bookmakers – who, up until now, have been our best means of predicting who will take one of the biggest prizes in literature – were offering odds on Narrow Road of 3:1, compared to 5:2 on the favourite, Neel Mukherjee’s The Lives of Others (also Chatto & Windus).

This means that if you’d put £50 on Flanagan taking the prize, you’d have got back £200 including your stake, as opposed to £175 if you’d plumped for Mukherjee and he had won. Potentially a tidy profit, but the stakes for authors and their publishers when it comes to winning the Booker are far higher.

As Joshua Farrington reported at The Bookseller, "The Narrow Road… shifted 10,242 print units following the award, a massive 3,141% week-on-week sales hike, good enough for 11th place on the overall chart—and second in Original Fiction—the first time Flanagan has ever hit the Top 50. The £137,430 the book earned last week through the TCM eclipsed Flanagan's combined BookScan sales for the previous 10 years."

The prize has a transformative effect on the careers and finances of literary novelists. Last year’s winner, The Luminaries (Granta), had sold only modestly on its initial publication, but has since gone on to shift 560,000 copies worldwide. Meanwhile, since taking the prize in 2002, Yann Martel’s Life of Pi (Canongate) has sold 3 million copies, its film adaptation taking more than half-a-billion dollars at the box office. The book has changed the fortunes of its publisher along the way. Heady stuff.

Imagine it were possible for publishers and booksellers to tell which books achieve that knife-edge balance between literary accomplishment and commercial success. It could have powerful implications for the publishing industry. And we think it might just be possible to do that by mining the data publicly available on Twitter.

We decided to test this theory

We gathered hundreds of thousands of tweets that mentioned the six books on the Man Booker Prize shortlist. Then we created an algorithm that looked at this data, aggregating what was being said about each book by whom and in what quantity to see if it were possible to use this information to predict the outcome of the prize. And we were surprised at the accuracy of the results.

The first thing our algorithm considered was the raw bulk of tweets that mentioned each shortlisted title from the shortlist announcement onwards, as shown in this figure. 

Here we made some exclusions. We stripped out promotional tweets (e.g. book giveaway competitions) on the basis that these risk skewing the results. Our algorithm also gave less weight to tweets from corporate accounts, which tend to be driven by marketing messages, and to retweets (RTs), which require less engagement from the tweeter than a unique tweet.
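To make the idea concrete, here is a minimal sketch in Python of how such exclusions and weightings might be applied. The article does not publish the actual weights or how tweets were classified, so the `Tweet` fields and the two weight constants below are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable

# Illustrative assumptions: the real weights are not given in the article.
RETWEET_WEIGHT = 0.5
CORPORATE_WEIGHT = 0.5


@dataclass
class Tweet:
    text: str
    is_retweet: bool = False
    is_corporate: bool = False
    is_promotional: bool = False


def tweet_weight(tweet: Tweet) -> float:
    """Return 0 for promotional tweets and a reduced weight for RTs and corporate accounts."""
    if tweet.is_promotional:
        return 0.0  # stripped out entirely
    weight = 1.0
    if tweet.is_retweet:
        weight *= RETWEET_WEIGHT
    if tweet.is_corporate:
        weight *= CORPORATE_WEIGHT
    return weight


def weighted_volume(tweets: Iterable[Tweet]) -> float:
    """Sum the weights of all tweets mentioning one title."""
    return sum(tweet_weight(t) for t in tweets)


# Made-up sample tweets, not real data.
sample = [
    Tweet("Just finished it, stunning"),
    Tweet("RT: stunning", is_retweet=True),
    Tweet("Win a signed copy!", is_promotional=True),
    Tweet("Out now in paperback", is_corporate=True),
]
print(weighted_volume(sample))
```

With these assumed weights, four raw tweets count as two "effective" tweets: the giveaway is dropped and the retweet and corporate tweet count for half each.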

In tweet volume alone, To Rise Again at a Decent Hour (Penguin) by Joshua Ferris led the other five books on the shortlist by a significant margin. It was followed by We Are All Completely Beside Ourselves (Serpent's Tail) by Karen Joy Fowler, which was the shortlist’s star commercial performer, selling 20,000 copies during the shortlisted period.

Flanagan came in at third place in our tweet volume rankings, but enjoyed two small but significant upswings at crucial moments that convinced us this was a title made for greater things.

  • The first upswing in tweet volume for Flanagan took place between 14th and 16th September, when he vied for first place with Ferris.
  • The second happened on the eve of the announcement itself on 6th October. In that 24-hour period The Narrow Road showed the greatest acceleration in tweet volume of any title on the shortlist. This suggested strongly to us that the momentum was with Flanagan.
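One simple way to put a number on "acceleration" is the second difference of daily tweet counts, that is, how much the day-on-day change is itself changing. The sketch below assumes that definition, which the article does not spell out, and uses made-up counts for two hypothetical titles.

```python
def acceleration(daily_counts):
    """Second difference of a daily series: the change in day-on-day growth."""
    if len(daily_counts) < 3:
        raise ValueError("need at least three days of counts")
    return [
        daily_counts[i] - 2 * daily_counts[i - 1] + daily_counts[i - 2]
        for i in range(2, len(daily_counts))
    ]


# Made-up daily tweet counts for two hypothetical titles.
counts = {
    "Book A": [100, 110, 120, 130],  # higher volume, steady growth
    "Book B": [40, 45, 60, 110],     # lower volume, growth speeding up
}

# Compare the most recent acceleration value for each title.
latest = {title: acceleration(series)[-1] for title, series in counts.items()}
fastest = max(latest, key=latest.get)
print(latest, fastest)
```

In this toy data Book A has more tweets on every day, yet Book B shows the greater acceleration, which is the kind of signal described above: a title in third place by volume can still have the momentum.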

The second factor our algorithm looked at was sentiment. This is a difficult thing to measure on social media, especially with a book like The Narrow Road, which deals with horrifying subjects that don’t naturally lend themselves to positive language. Consequently we had to tweak our algorithm so it didn’t wrongly discount tweets that sounded negative but were in fact praise.

With Narrow Road, however, we found that tweets about the book were more likely than those about the other shortlisted titles to contain superlative language like “Totally mesmerising”, “I’m floored by the writing”, “Just broken my heart into pieces”. Quite simply, those people who had read the book enjoyed it more, and enough to tweet unqualified praise about it.
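As a very crude stand-in for this kind of measurement, one could count the share of tweets that contain a phrase from a small list of superlatives. The sketch below does only that; the phrase list is an assumption built from the examples quoted above, and a real sentiment model would be far more sophisticated than keyword matching.

```python
# Assumed phrase list, seeded from the example tweets quoted in the article.
SUPERLATIVE_PHRASES = ("mesmerising", "floored", "broken my heart", "masterpiece")


def is_superlative(text):
    """True if the tweet contains any phrase from the assumed list."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in SUPERLATIVE_PHRASES)


def superlative_share(tweets):
    """Fraction of tweets that contain superlative language."""
    tweets = list(tweets)
    if not tweets:
        return 0.0
    return sum(is_superlative(t) for t in tweets) / len(tweets)


sample_tweets = [
    "Totally mesmerising",
    "I'm floored by the writing",
    "Just broken my heart into pieces",
    "Started reading it on the train today",  # made-up neutral tweet
]
print(superlative_share(sample_tweets))
```

Note that "Just broken my heart into pieces" would look negative to a naive positive/negative word count, which is why matching on phrases of praise rather than on mood words is used here.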

The third and perhaps most crucial factor for Flanagan was the reach and influence of people tweeting on his behalf as fans of the book. Flanagan has been a critically acclaimed writer for many years – “the best Australian novelist of his generation,” according to The Economist – but until this month praise has not been matched with sales. What Flanagan did have on his side during the shortlisted period was the advocacy of those influential people who already enjoyed his writing.

At the level of individual tweets, we found that the people who tweeted to praise Narrow Road were more likely to have an above-average number of followers. They were also accounts that interacted with other people with large numbers of followers, and thus could be counted as ‘influencers’. Flanagan also enjoyed the support of a number of celebrity tweeters. For example, one recommendation for Narrow Road from US Sen. John McCain generated a raft of activity for the book.

When we aggregated this activity (see the figure) we saw that while To Rise Again at a Decent Hour had inspired more unique tweets, tweets about The Narrow Road reached more people. Flanagan’s influential cheerleaders meant that positive tweets about his book reached 4,000,000 users on Twitter, compared to around 3,000,000 for To Rise Again at a Decent Hour.
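The difference between tweet count and reach can be shown with a small sketch. It assumes reach is approximated by summing the follower counts of each distinct account that tweeted, which the article does not confirm, and it ignores the fact that two accounts may share followers. The account names and numbers are made up.

```python
def reach(tweets):
    """Approximate reach: sum of follower counts, counting each account once.

    `tweets` is an iterable of (account, follower_count) pairs. Overlap
    between different accounts' followers is ignored (a simplification).
    """
    followers_by_account = {}
    for account, followers in tweets:
        followers_by_account[account] = followers
    return sum(followers_by_account.values())


# Made-up data: Book A has more tweets, Book B has better-followed fans.
book_a = [("@reader1", 200), ("@reader2", 350), ("@reader3", 150), ("@reader1", 200)]
book_b = [("@critic", 50_000), ("@novelist", 120_000)]

print(len(book_a), reach(book_a))
print(len(book_b), reach(book_b))
```

Here Book A has twice as many tweets but a tiny fraction of the reach, the same pattern as a title that leads on unique tweets while a rival reaches more people.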

All this analysis led us to predict on our blog on 7th October that Flanagan would take the prize. His was the book that seemed to have the most momentum behind it; it inspired the most fulsome praise among readers; and it enjoyed the advocacy of the kind of celebrities, writers and critics who in other years would be sitting in the judges’ chairs themselves.

So when the result was announced on 14th October, we thought we might be on to something.

We’re currently applying the same algorithm we developed for the Man Booker to the National Book Awards in the US. It’s early days, but we’ll let you know how we get on.

William Pearce is head of growth with BookVibe, a Parakweet company that uses "advanced natural language processing algorithms to accurately extract entities from the massive amounts of data flowing through Twitter and other social streams."

Main image from Man Booker Award by Janie Airey