Just a day before we got the news this week that Authonomy was to be closed by HarperCollins UK, London-based entrepreneur Andrew Rhomberg had posted at Digital Book World his Fear of Data column. In that piece, Jellybooks' Rhomberg writes:
The availability of reading data is probably causing more angst than any other because it strikes at the heart of publishing— everything from acquisition to editorial to marketing to author care.
Rhomberg is talking about the kind of data that can be gathered now, and is being gathered now, on readers' patterns of behaviour. Reader analytics in the matrix of publishing.
Reading data is becoming available in aggregated form from conventional pay-per-ebook retailers, like Kobo, but also from all-you-can-read subscription services, like Scribd and 24Symbols—two companies that presented some of their data at the recent IDPF/BEA conference in New York.
Having programmed that conference for IDPF at BEA in New York in May, I can tell you that we were, yes, interested in just some of these considerations. We were glad to have Rhomberg join 24Symbols' Justo Hidalgo, whose mobile ebook subscribers include some of the newest to the Web thanks to his work with Mark Zuckerberg's Internet.Org programme, on a panel with Trajectory's James Bryant and Scott Beatty, whose Natural Language Processing algorithms map reader response on myriad vectors to aspects of literature for recommendation-generation.
Andrew Weinstein of Scribd — which has recently used reader data to decide to limit the availability of romance in its offerings — spoke with two more companies that work with reader data, Peter Hudson's BitLit Media (which uses the "Shelfie" to analyze what's on a reader's bookshelf) and Kevin Franco's Enthrill Media (which puts ebook purchasing into physical stores).
And another company spokesman, Nathan Hull of Denmark's Mofibo, is outspoken on what sort of reader-patterning data an ebook subscription service like his can provide to publishers. Here he is in April telling us about that, here at The FutureBook, in Ebook subscription data is 'incredibly rich':
Our data is gathered from a continuously evolving reading environment where habits are formed. Not just purchase habits — but also frequency of reading, locations for reading, devices on which people read and much more. There’s layer upon layer of rich, contextual data that reveals an incredible amount about readers' behaviour. Just take the 1.2 million pages of books read every day on Mofibo, throw in the 600,000 minutes of audiobooks listened to daily and imagine the possibilities of that combined scale.
"The possibilities of that combined scale" are being provided to Mofibo publishers, Hull says, forming what he sees as one of his company's greatest value points for those publishers.
And what do you think about all this collection and sharing of reading data? Is Rhomberg right that we have a certain fear of it that can prevent some in the industry from taking full advantage of its potential? If so, is that fear well-founded? Or more hysteria on the part of a change-wracked, strung-out industry?
That's what we're talking about in #FutureChat today, and we'd like your input.
This story was written as the walkup to our #FutureChat of 21 August. Join us each Friday live on Twitter at 4:00 p.m. London (BST), 3:00 p.m. GMT, 5:00 p.m. Rome (CEST), 11:00 a.m. New York (ET), 10:00 a.m. Chicago (CT), 9:00 a.m. Denver (MT), 8:00 a.m. Los Angeles (PT), 5:00 a.m. Honolulu (HAST).
Indeed, in Rüdiger Wischenbart's fine Publishers' Forum in Berlin from Klopotek (the 2016 dates, he tells me, are 28 and 29 April), Hull was with us on a panel to talk about the search for the most passionate readers. In 'Publishing Goes Pop,' Part 2: Is A Fan A 'Quantified Reader'? at Thought Catalog, I wrote about Hull's comments on how he sees the search for the "fandom" of publishing being aided by the kind of data that his Mofibo team can offer:
This may make some people cringe. And I’m not saying that it can create the perfect crime novel. But…some of the data we supply shows the reading speed at which people are moving through a book. In the parts of the book in which you’re establishing the characters and building up the plot, people tend to move very quickly. Moments of high drama and high suspense, they’re eating it up, they’re going right through. But when it gets a little bit dry and uneventful, we can really see that people are a lot slower. We also can see that they tend to skip sections. That’s just one book. But think of that as a genre. We can put layer upon layer upon layer of such data into this. A lot to play on emotions and tone. That’s what we’re working on next.
The interesting place to which Rhomberg is taking this issue has to do with manuscript acquisition and the actual bases for publishers' lists. In his DBW piece, he writes:
One fear is that reading data will influence what gets published. This is a somewhat strange notion, as self-publishing has already removed almost any barrier to market, and more books than ever before are getting published. What the doubters really mean is that data could influence what the big publishing houses will publish in the future, such as more celebrity biographies, tales from YouTube stars and vampire novels. The fear is that, in the future, worthy books of high literary quality will be shunned.
Rhomberg, whose latest work has a lot to do with reader analytics, is unsurprisingly comfortable with these issues. He sees the data gathered from readers' patterns of behaviour as a positive for literary work, providing — as I think Hull would suggest — a way to find the right readers for it:
Titles are getting acquired by editors even when sales data point toward the reality that they are never going to deliver a positive return. What those who have reservations about data fail to see, however, is that data will make it easier to find the audience that appreciates these books. Rather than support an expensive marketing campaign across mass retail, publishers can tailor their campaigns to the relevant audience by virtue of an improved understanding of who likes to read a certain kind of book. And just as important, publishers will also discover the optimal approach to reach that audience.
And regular #FutureChat participants might remember being joined in June by Jim Hinks, who was launching his MacGuffin operation at Comma Press, with a component that would not only tell a writer where a reader stopped reading but would also tell everyone else that same thing. He calls this "open analytics." Others might call it way too revealing:
Perhaps most controversially, MacGuffin has open-analytics. Anyone can see the "drop-out points" — where (anonymised) readers quit a story or poem before the end — plotted onto a graph. While I expect that many writers will find this unnerving, it undoubtedly has some potential as a self-editing tool – you can identify that weak scene in your story where you're losing readers, then republish it.
Gaming The System(s)
The soon-to-be-defunct Authonomy was, when you think about it, based on reading data. In that programme, member-authors could vote on various submitted manuscripts and the outcomes of those votes were to determine which manuscripts might get a beneficial look from HarperCollins editors. When my Bookseller associates Lisa Campbell and Sarah Shaffi looked for reactions to news of the closing, they were told by Scott Pack, formerly publisher of HarperCollins’ The Friday Project:
The big problem, though, was that users learned how to 'game' the site so that, after a year or two, the manuscripts rising to the top, and therefore being passed on to editors, were rarely the best and were often pretty poor. It is hard to keep editors engaged when that happens.
And there, some might say, is another issue with reading data. It might be manipulated, right?
Rhomberg seems to get close to this point in his own write:
The greatest crime, admittedly, is the abuse of data. As the saying goes, if you torture the data long enough it will confess. The selective use of data to confirm a decision already made can be extremely dangerous. When we follow this path, not only are we deluding ourselves, but we are misleading others, too. Sadly, it happens all too often.
But there's little doubt that he's right: We're headed only more deeply into a world in which data is going to be available and ever more quickly used to justify one thing and another. Rhomberg says how well that all happens is up to us:
Data only informs; it doesn’t decide. Data merely helps us humans make better decisions. Even in situations where machines make decisions, it is rules developed and coded by humans that teach the machines how to decide. Most of the time, we don’t have data to help us make good decisions, so when it is available, we should welcome it. The need to weight different inputs and make choices will never go away.
Is the Rhombergian rumination right?
- Do you welcome our data-totin' overlords?
- Or are you more worried that things could go very wrong with data constantly being tossed from one set of hands to the next?
- Have you seen some element of reading data used ineptly or purposely to the wrong end already?
- Is sales data — not that we'll ever see any about ebooks from the biggest retailers — not a form of reading data? Aren't readers "voting with their feet" when it comes to sales? And aren't sales figures used all the time to justify one direction or another in any of myriad elements of publishing activities daily?
Your turn to talk. And we're tracking everything you say and do, too, so watch yourself. (Just kidding. I think.)
See you in #FutureChat.
And hey. Where's your manifesto? Well?
Please remember that we're interested in having your manifesto for The Future of the Book Business. Your statement, preferably no more than 500 words, should be sent to Porter.Anderson@theBookseller.com by 31st August.
And mark your calendar for The FutureBook Conference, 4th December, The Mermaid, London.
More details are coming soon.
Main image - iStockphoto: agsandrew