Data-Dancing: How big is self-publishing?

'Shock and awe.'

I'm here with the same question asked by my colleague, Bookseller editor Philip Jones, in his companion piece to this one: How big is the market for self-published titles?

"The question is a simple one for which there is no simple answer," Jones writes. "In fact, there are lots of complicated answers."

We want your answers, complications and all. To that end, we've launched a short survey for your input to, as Jones puts it, "find out what the hive-mind thinks." Results are to come on Tuesday (16th June), so don't dally: let us hear from you. Jones:

There are two things we will learn from this. The first is whether we can — between us — come up with a consensus both for the UK market and the much bigger one in the States. The second is whether we can come up with a methodology: do we look at value and volume? Do we include print? Do we assume Amazon is the dominant retailer, and if so do we assume that its hourly updated bestselling charts are an accurate reflection of sales? Or are they algorithmically challenged — as many suspect?

My sense is that the numbers will have wide variances, as will the jottings behind them. But I also think the size of the marketplace will shock and awe. That is why I am doing this.

Also see How big is self-publishing now? An agent's view from Toby Mundy.

And if you'll tango with us Friday (12th June) for our weekly #FutureChat, you'll find another chance to join the "hive-mind" of our digital publishing community here at The Bookseller's The FutureBook and weigh in on the perceived size and value of the self-publishing market.

This story was written as the walkup to our #FutureChat for Friday 12 June. Join us each Friday at 4:00 p.m. London (BST), 3:00 p.m. GMT, 5:00 p.m. Rome (CEST), 11:00 a.m. New York (ET), 10:00 a.m. Chicago (CT), 9:00 a.m. Denver (MT), 8:00 a.m. Los Angeles (PT), 5:00 a.m. Honolulu (HAST).

Not that other parts of this business are easy to measure. In his recent New Insights on eBook Unit Sales from Nielsen, Michael Cader at Publishers Lunch had a couple of explanatory paragraphs about how the traditional publishing market action is reported. You may want to hold onto something in case of vertigo while reading this:

PubTrack Digital reports actual unit ebook sales, in the same way that Bookscan reports actual unit print book sales. While Bookscan reports weekly — getting sales data direct from retailers — PubTrack Digital reports monthly, and gets sales on a delayed basis from participating publishers, since retailers have been unwilling to share their ebook sales. So PTD runs on a three-month delay, as publishers reconcile their ebook sales with accounts.

Bookscan is incomplete for every title's sales in that it covers the reporting sales outlets only, but it includes every title sold somewhere at retail. PTD is complete for each title — the publishers report all their ebook sales at all of their retailer accounts  — but incomplete for the business as a whole, since it only represents reporting publishers. 25 publishers participate in PTD right now, but that group includes all of the largest trade publishers. PTD does include some publishers who do not currently report sales to the AAP (F+W, Lonely Planet, Sourcebooks and Time, Inc. -- plus religious publishers B&H and Baker). Primarily, PTD is missing most of the distributed clients reported to the AAP by Brookings, the ECPA, Ingram, Perseus, Random House, S&S, and University of Chicago Press.

I trust you've committed that to memory.

Suffice it to say, easy and clear metrics are not the book-publishing world's bragging point. And Jones' article sets out the key elements that make it impossible to get a verified quantification of the self-publishing side of the business. In essence, as he writes:

The reason self-published titles are not tracked is that they are predominantly sold digitally, and no e-book retailer that matters releases its sales data to Nielsen (or another provider). The fact that Amazon uses its own Standard Identification Number (ASIN)—which like the old ISBN is a 10-digit number—is immaterial to the larger issue of having no data. Could Nielsen accommodate the ASIN within its system if Amazon chose to supply its data that way: of course it could.

He adds, helpfully, that the perennial and worthy debate about "why Amazon does not help indie writers by making them visible in places outside of Amazon" is not what we're on about this time. 

No, what we're setting out to do here is to get our heads around the question, if not a definitive answer, of "how big is this market now?"

And to that end, as Jones says, it's my brief to collect some input on the US market, the largest and most mature of the self-publishing realms to date. 

What we learn very quickly in this exercise is that, as in high-grade tango, the responses you get contain some push and some pull, always. Every response comes with some basic viewpoint, perspective, stance. And that's fine. You just want to keep it in mind. To mix my similes (most elephants don't tango well), we're all going to get drunk in a hurry if we take a shot every time somebody mentions those blind men.

Or, as a friend has said recently, many people in the business will tell you that your concept of the self-publishing market's size is wrong, wrong, wrong — but they have no better numbers to offer, themselves.

I'm grateful to everyone who has been in touch with me in response to my inquiries. And rather than do them the disservice of trying to present a tangoing pachyderm cobbled together from all of them (cubism, it's been done), I'm going to offer them separately to you. This gives each of my US-market responders a chance to make his or her point, and have it stand on its own.

Let's start with one of the best-known numbers-dancing teams in the US.

On the dance floor now: Hugh Howey and Data Guy

The author Hugh Howey (pictured) and his still-unnamed "Data Guy" (um, not pictured), who produce the quarterly AuthorEarnings reports, may not have numbers that everyone agrees with, but they certainly do have numbers. This, in itself, is more than many other pundits can claim. In terms of that perspective/viewpoint element, theirs is tied to a mission of demonstrating for authors how viable self-publishing may be, financially, when compared to traditional publishing. This is an intent that Howey and Monsieur Guy are completely open about. Their project analyses sales data that is "scraped" electronically from the sales pages of Amazon ebook bestsellers (200,000 titles for the most recent report in May).

And both of them were kind enough to jump in when I asked what they might have in terms of our question this week. 

Question: What is your estimate for how big the US market for self-published books is (value and volume)?

Data Guy: US Indie sales in 2014:

  • 185 million ebooks
  • 9 million print books
  • audiobooks TBD (tackling this soon)

In other words:

  • 14% of the books of any format sold in the US in 2014 were indie (ebooks).

For adult fiction, the indie share (127 million ebooks) is proportionately larger:

  • 24% of all adult fiction books of any format sold in the US in 2014 were indie (ebooks).

In dollar terms, US Indie sales in 2014 were:

  • $459 million in consumer spending on indie ebooks.
  • $294 million in indie author earnings (with online retailers earning $165 million on those indie sales).
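For readers who like to check the sums, Data Guy's dollar figures can be cross-checked in a few lines of Python. This is only a sketch: the variable names are mine, and the author share and average spend per ebook are derived here for illustration, not figures Data Guy supplied.

```python
# Figures as quoted by Data Guy for US indie ebooks in 2014.
consumer_spend = 459_000_000   # dollars spent by consumers on indie ebooks
author_earnings = 294_000_000  # dollars earned by indie authors
retailer_take = 165_000_000    # dollars kept by online retailers
ebook_units = 185_000_000      # indie ebooks sold

# The author and retailer portions should add up to total consumer spending.
assert author_earnings + retailer_take == consumer_spend

# Derived values (illustrative only, not stated in the source).
author_share = author_earnings / consumer_spend
avg_spend_per_ebook = consumer_spend / ebook_units

print(f"Author share of consumer spending: {author_share:.0%}")
print(f"Average spend per indie ebook: ${avg_spend_per_ebook:.2f}")
```

On these numbers, authors keep roughly 64 percent of what consumers spend, and the average indie ebook sells for a little under $2.50.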

Question: Can you detail (in brief) how you came to these numbers?

Data Guy: The answers come from merging the data from AAP, PubTrack, and AuthorEarnings. It's the classic parable of the three blind men and the elephant [drink]: each can only describe the part they are touching. But by pooling their knowledge, they can "see" the whole beast.
The AuthorEarnings [AE] methodology gives us highly detailed visibility into the relative ebook market shares (in both units and dollars) for each sector of publishing (Big 5, Small/Medium Publisher, Indie Published, and Amazon Pub Imprints).
The AAP and Nielsen PubTrack, on the other hand, provide aggregate absolute numbers... but for only one and a half of those sectors (Big 5 + a subset of Small/Medium Publisher).
Thus by tagging AAP & PubTrack-participating publishers in our AE data, I was able to recalibrate our AE rank-to-sales curve to match the AAP’s monthly dollar sales and PubTrack's 2014 total units. That yielded equally accurate absolute numbers for the other, non-AAP-measured ebook categories: non-participating Small/Medium Publishers, Indie Published books, and those from Amazon Pub Imprints.
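The recalibration Data Guy describes can be pictured with a toy example. The sketch below is not AuthorEarnings code: the power-law curve, its exponent, the sample ranks and the "known" total are all invented for illustration. It shows only the general idea of scaling a rank-to-sales curve so that the tagged (reporting) publishers match an external total, and then reading off estimates for the untagged sectors.

```python
def raw_sales(rank, exponent=0.8):
    """Hypothetical rank-to-sales curve: unscaled sales fall as rank rises."""
    return rank ** -exponent

# Invented sample: (bestseller rank, sector, reports to AAP/PubTrack?)
titles = [
    (1, "Big 5", True),
    (2, "Indie", False),
    (5, "Small/Medium", True),
    (10, "Indie", False),
    (50, "Amazon Pub", False),
]

# Made-up external total for the reporting publishers.
known_reported_units = 1_000_000

# Scale factor that makes the reporting titles sum to the known total.
reported_raw = sum(raw_sales(rank) for rank, _, reports in titles if reports)
scale = known_reported_units / reported_raw

# Apply the same scale to every title, then total the estimates by sector.
units_by_sector = {}
for rank, sector, _ in titles:
    estimate = scale * raw_sales(rank)
    units_by_sector[sector] = units_by_sector.get(sector, 0.0) + estimate

for sector, units in units_by_sector.items():
    print(f"{sector}: {units:,.0f} estimated units")
```

The design point is that a single scale factor, fixed by the sectors with known totals, is assumed to hold for the sectors nobody reports on. How well that assumption holds is exactly what critics of the method dispute.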

Question: What in your view is the main hindrance to working out the market size, and how would you suggest resolving that?

Data Guy:  As someone who has already worked out the true market size and how it divides up to a reasonably high level of accuracy using publicly available data, I'm not sure you're asking the right question here. 
To my thinking, the better question is: Cui bono? ["For whose benefit?"]
For Amazon, Barnes & Noble, Apple, and Google, releasing their ebook sales numbers would make poor business sense indeed. But Porter, any large industry player with sufficient tech savvy and data expertise -- or the ability to hire the same -- could do what I did. There's a reason they haven't.

Hugh Howey: Data Guy's last sentence here is the nail in the coffin, really. Our data is made freely available. And it's data that anyone could grab if they wanted to. A major publisher spends a lot more money promoting a single title than they would need to spend to find out relative market share among the Big 5, for the lost indie market, and for small presses. I think the people who would make this decision know what they would find, and that's why they don't do it. If I suspect I have terminal cancer, I'm not going to go get an MRI. I already know. And I'd rather not know, know.

Question: Does anything you guys do suggest a ballpark number on how many indie titles might be introduced into the US market on an annual basis?

Hugh Howey: Back in February 2014, when we started AuthorEarnings, the number of Amazon Kindle titles was roughly 2.85 million.
Today, that number is 3,675,096. So the number of titles has increased by 825,000 over 16 months. That’s around 600,000 titles a year. It’s a safe bet that at least half of those are indie, and probably more like three quarters of them.
So I’d say between 300,000 and 450,000 new indie ebook titles a year.
In June 2012, I see from a Quora post that there were 1,335,660 titles in the Kindle store back then. That gives us a three-year average of around 800,000 titles a year, so if anything, the release rate of new titles is slowing down… most likely because everyone was bringing their entire backlist out on Kindle in 2012 and 2013.
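Howey's back-of-envelope arithmetic is easy to reproduce. The sketch below uses only the title counts he quotes; the variable names are mine, and the indie range applies his assumed one-half to three-quarters share to his rounded figure of 600,000 titles a year.

```python
# Kindle store title counts as quoted by Howey.
titles_feb_2014 = 2_850_000   # "roughly 2.85 million"
titles_jun_2015 = 3_675_096   # "today"
titles_jun_2012 = 1_335_660   # from the Quora post he cites

# Growth over the 16 months since AuthorEarnings started, annualised.
growth_16m = titles_jun_2015 - titles_feb_2014
per_year_recent = growth_16m / 16 * 12

# Howey rounds the annual rate to 600,000 and assumes that between
# one half and three quarters of new titles are indie.
rounded_rate = 600_000
indie_low = 0.50 * rounded_rate
indie_high = 0.75 * rounded_rate

# Three-year average going back to June 2012.
per_year_3y = (titles_jun_2015 - titles_jun_2012) / 3

print(f"Growth over 16 months: {growth_16m:,}")
print(f"Annualised recent rate: {per_year_recent:,.0f}")
print(f"Indie titles per year: {indie_low:,.0f} to {indie_high:,.0f}")
print(f"Three-year average per year: {per_year_3y:,.0f}")
```

The unrounded results come out at about 619,000 titles a year for the recent period and about 780,000 a year for the three-year average, consistent with Howey's "around 600,000" and "around 800,000."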

Quick turns from Jane Friedman and Beat Barblan

We have some more extensive and involved thoughts from several good people to come, and I'll offer those to you in a separate write-up. They're from folks "workin' the platforms" that serve as the main producing enablers of self-published work. 

To finish up this installment of our US-market-size input, however, I thought we'd pause for a quick cha-cha with two of our favorite folks, author-industry specialist Jane Friedman and Bowker Identifier Services director, Beat Barblan.

No fool she, Friedman, who with Harry Bingham mounted a fine author survey led by The Bookseller earlier this year, goes right to those guys at the zoo — "It's the old parable about blind men trying to describe an elephant" (drink) — and she then points, as must we all, to that other large animal in the room:

A great majority of self-pub activity is on Amazon, which doesn't release official figures and doesn't require ISBNs. There's our big challenge. Until we know what's happening there, it's hard to know the overall volume of self-published work on the market. It's safe to say it's in the millions of titles at this point.

And Barblan, always gracious on the hot seat, has a similarly quick and experienced response (somehow without mentioning the big gray animal): 

I can give you quick, but not comprehensive answers. All I have to go by is the number of ISBNs issued for all those titles we would consider as self-published as per our annual report. Based on that, I would venture a guess (and that’s exactly what it is, since I don’t have exact numbers) of around 600,000 to 700,000 [self-published] titles for 2015. This takes into account the last numbers of ISBNs we have issued (mostly to organizations that work with self-publishers, e.g. SmashWords), and the fact that while we don’t capture every title, for obvious reasons, there are also self-published titles that have more than one ISBN due to multiple formats. 
I wouldn’t venture a guess as to the value of the market. There are others much better informed and who track sales, which we don’t. 
The biggest hindrance to working out the market size comes down to the unavailability of comprehensive sales data.

Glide over to FutureChat

While Nielsen's great people Stateside tell me that they're "working on a methodology," you'll find that Jones, in his write-up on the UK vantage point, finds useful material, of course, in Nielsen's work there. There may be a tendency by Nielsen in some cases to combine Amazon self-publishing estimates with Amazon Publishing numbers. Amazon Publishing is in no way self-publishing, of course, as I'm sure Nielsen's folks know. APub is Seattle's traditional publishing house with some 14 imprints, a standout being the translation leader AmazonCrossing.

Jones is careful to point out, "I am sticking to self-published/KDP (and others) titles." Quite right. Jones concludes that £70 million (about $108 million) "feels about right" for a valuation of UK self-publishing. 

How does that sound to you? Don't forget to weigh in on our survey. And see you in #FutureChat.

Main image - Shutterstock: John Wollwerth

Elephant images - Pixabay: e-smile, Kyzlikova, HBieser