Data and disruption in publishing: knowing your nodes

On December 1st I'll be hosting a panel at the Futurebook conference in London on the subject of using data to 'capture and keep readers.'

I'll be joined by Albert Hogan (group marketing, audience & digital development at Penguin Random House), Thad Woodman (co-founder and chief product officer at Inkshares), Julia Porter (director of consumer revenues at The Guardian) and Annie Stone (international account manager at Bookbub). All great people - it promises to be an interesting talk.

As host, I get to choose the line of enquiry, which is good news for me because the question of how we use data has become more important over the last few years in the work I do with all my publishing clients.

The narrative arc has gone like this:

Stage #1 - 2010-13: Big Data: Lots of technology people get excited about the exponential impact of Big Data. We mortals nod and smile and then get back to work in the real world - smug in the knowledge that, yes, fiddling around with Google Analytics on a Friday afternoon does pass for 'data science' in some circles.

Stage #2 - 2013-16: Data Business & Digital Transformation: We begin to put two and two together when we think more deeply about the success of companies like Amazon - and the extent to which they are fundamentally 'data businesses' above and beyond any other services they may appear to be (retailer, logistics network, media giant, cloud computing monolith). In turn, the publishing industry has a proper freak-out, en masse, about the customer-persuasive powers of Amazon and the sincere need for root-and-branch 'Digital Transformation' of the publishing process - as a means to avoid being eclipsed.

Stage #3 - 2017: The Robots Are Coming: While many publishers have successfully branched out into new digital formats and customer engagement strategies, no techno-wonk worth his or her salt can help talking about concepts like 'automation,' 'algorithms,' and 'machine learning' (with bonus points for dropping in a 'neural network' or two). At which point, for publishers, 2 + 2 begins to look a lot like 7,151,238,587, thanks to visions of self-aware robots who can write better blockbusters than JK Rowling, with no help whatsoever from a human editor.

In amongst the hype cycles, something important is happening to every business. Aside from the doomsday headlines that suggest robots will replace us all, it’s stage #2 - the data-fication of everything - that’s having the most impact, setting us up quite nicely for the positive AI-driven stuff of tomorrow.

The rise of the robot data engineers

Outside of publishing, other sectors have moved fast to embrace the data ethos. Financial services and banking companies have established 'Data Engineering' teams whose sole job is to create and maintain a data set drawn from multiple sources - product data, customer data, market data, supplier data, etc - and figure out ways to put it to use in the form of new products or services. The same teams now exist in retail, as an extension of the people who first ran their customer loyalty schemes. Ditto in travel. And leisure.

In fact, wherever there is data and lots of customer transactions to be found, these teams are sprouting fast - assuming a mysterious, guru-like status as they go, thanks to their knowledge of the dark arts of data manipulation and a dash of AI. But in creative-driven industries like publishing, we’ve been less ready to embrace these people and their talents. This scene from Mad Men (after the arrival of their first IBM mainframe) sums up our distrust of them quite nicely.

In progressive environments, however, these people are becoming architects who - if they’re good at their job - have the wherewithal to design the product and service offerings of the future, because the data that they farm has the potential to drive amazing new (automated) things: stuff like chatbots, mass personalisation strategies, product recommendation engines, customer service interventions, and more. All of which ought to be incredibly interesting to publishers.

Over the last couple of years I've had a number of existential conversations with publishers about this movement. As mentioned, in stage #2, they tend to revolve around the gorilla in the room, Amazon. In a digital-first world, where sales are massively skewed to online, Amazon owns all the customer data - whilst publishers own none of it (or very little, when they are not selling direct).

So, how can publishers innovate in similar, data-driven ways to retailers/etailers, banks, insurance companies and travel firms?

Know your nodes

One way of answering this question is to look a bit further afield. An assessment of Facebook is a great way to think differently about where data comes from and the value that it provides.

Facebook is, of course, a giant publisher (an aggregator of your best content, alongside many videos of kittens), which means it's in the advertising business. But strip away the newsfeed, the nice little media formats, the apps and the accompanying services like Instagram, and what have you got?

Data. A lot of data.

Facebook is one of the world's biggest data companies, based upon one of the world's simplest ever data schemas. It’s a platform that knows you and what you like, in a very, very precise way - which is a very, very valuable commodity for advertisers of all stripes.

The same source data can also be a very valuable backbone for a retail channel, a payment gateway, a broadcast media network (film or TV), or - dare I say it - a publisher. Since Facebook knows what we like, it knows what we want to watch, how we want to buy stuff, what products we want the most and what we love to read. Facebook is an incredibly fertile environment for the disruption of a host of established sectors, thanks to its core data asset.

Facebook's main trading 'node' is its users. The company uses some very cool technology under the bonnet to do innovative things with it - ranging from graph databases to advanced machine learning - but at the end of the day, that's it. Facebook is successful because it’s very good at figuring out new ways to deliver services that exploit the value of its users - you and me - and the things we all ‘Like.’ (Just as Amazon does with its understanding of our buying patterns.)
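To make the 'node' idea a bit more concrete, here is a minimal Python sketch of a user-and-likes graph. It is purely illustrative and not Facebook's actual schema or technology: the names (`likes`, `liked_by_user`, `suggest`) and the sample data are all made up for this example.

```python
from collections import defaultdict

# Hypothetical sample data: each pair is (user, thing they 'Like').
likes = [
    ("ana", "crime novels"),
    ("ana", "true crime podcasts"),
    ("ben", "crime novels"),
    ("ben", "nordic noir"),
    ("cat", "cookbooks"),
]

# Build the graph: each user node points to the set of things they like.
liked_by_user = defaultdict(set)
for user, thing in likes:
    liked_by_user[user].add(thing)


def suggest(user):
    """Suggest things liked by users who share at least one like with `user`."""
    mine = liked_by_user.get(user, set())
    suggestions = set()
    for other, theirs in liked_by_user.items():
        # Only learn from other users whose tastes overlap with this one.
        if other != user and mine & theirs:
            suggestions |= theirs - mine
    return sorted(suggestions)


print(suggest("ana"))  # ['nordic noir']
```

The point of the sketch is how little structure is needed: one kind of node (the user), one kind of link (the 'Like'), and a useful recommendation already falls out of it. A publisher could swap in its own node (a reader, a title, an author) and ask the same sort of question.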

Hence, the question for publishers that want to pursue a data-driven strategy is: 'what's your node?' What data-based asset will your future products and services be trading on?

Will it be your customer data? (OK, that's a tough one.) Will it be market or product data? (Since you know what's selling where.) Will it be content-based data? (Since you alone own all the stuff that fills the pages.) Will it be author-related data? (Since you own the relationship with these incredibly important and popular people.) ...Or will it be something else or a mixture of the above?

And once you've got that bit down, the next question is: how are you going to farm it?

Do you have the teams and skills to handle data in a valuable way? Do you have the tools and resources to do the job? Is your business integrated enough to use new data effectively in different processes? Are your people committed to using data in the first place, and leaving some of the creative, gut-level guesswork at home? And, are you ready to make product development decisions with the data you create?

Get involved in the discussion

I reckon the publishers that will thrive will be those that tackle these questions the best. Those that don't will struggle, as 'node-savvy' start-ups and competitors (and Facebook!?) exploit new data opportunities to deliver publisher content (books, articles, video, imagery, etc) in compelling new ways.

Please do come along to the conference on December 1st, and stop by to pitch in at the panel discussion. Hopefully we'll be able to get some good answers to the above - with your help.