BitLit has data you thought you'd lost -- on its 'shelfies'

"They really are the Big Five. They're not just saying that. They really are." 

Peter Hudson cracks up as he says this. He's the co-founder with Marius Muja of the still-young Vancouver-based start-up BitLit. As we report today at The Bookseller, the company has just announced a major new partnership with Elsevier for some 5,000 science and technology books. As in all of the 200 or so publisher partnerships BitLit has established so far, the consumer creates proof that he or she owns the print edition of a book by writing his or her name on the copyright page and sending a shot of that to Vancouver. Once verified, BitLit then can provide an ebook copy of the book -- free in some cases, at a discount in others. It's the publisher's choice.

In many cases, Elsevier being the latest, publishers are choosing to make their ebooks available from BitLit DRM-free, too. "Not even watermarking," as Hudson notes about Elsevier. But if the publisher requests it, BitLit will also supply any grade or type of DRM to that publisher's ebooks.

What he's telling me in this conversation is a good bit more revelatory to publishers than the fact that the Big Five "really are the Big Five." But the point is demonstrated when he and his team analyze the spines of bookshelves -- from tens of thousands of "shelfies." Nothing to do with shellfish, a "shelfie" is a "selfie" for your books. You get out your smartphone and snap a shot of 25 book spines or so as they exist on your bookshelf. Then you send it to BitLit.

Remember, tens of thousands of those images of 25 books or so at a snap. Guess what that is: data.

Think about this from the consumer angle, and it sounds like a great way to find out whether BitLit has a deal with a publisher of any of your shelfie's books. Vancouver gets back to you with an analysis of your shelf, nice to have -- if you're lucky, something on it might be on offer as a discounted ebook from a partner-publisher.

But think about this from the publisher's viewpoint, and it dawns on you that Hudson and company are able to peer into homes this way and tell just what books are out there. It's good news for Hudson because he can then contact publishers he'd like to have partner with BitLit and tell them that he's spotting multiple copies of one of their books, and maybe there's a market among those shelfie-snapping readers for discounted ebook copies of those print books.

Long after receipts and point-of-sale data were lost on a print-book sale, some 20 to 25 books are standing in each shelfie. Should a publisher want to woo that consumer with an ebook offer, sales data awaits -- a publisher can actually begin to recapture lost data on who owns what book and where.

"The shelfie also allows us at BitLit," Hudson says, "to push you a notification when we sign a new publisher." 

So let's say my shelfie shows BitLit that I have a print copy of Chuck Wendig's Mockingbird (2012, Angry Robot). At the point BitLit sets up its partnership with Angry Robot as one of its ebook-bundling partner-publishers (and Angry Robot is among the 200), an email arrives in my inbox from Vancouver announcing that I now have a chance to get a discounted or free ebook edition of Mockingbird. Angry Robot gets the data on me, one of its consumers holding a print copy of the book. I get an ebook edition of the Wendig novel. It's win-win for publisher and reader.

And no, the human staff isn't sitting there studying your image. 

Me being me, of course, I accuse Hudson during our conversation of really having all his poor staffers hunkered over these shelfies coming in, trying to make lists of the book spines they see on a shelf. 

"No, no," he says, and thank God he has a sense of humor. "It is technology that does this.

"The way it works is the divide-and-conquer approach. The first thing we do is separate the books. The first pass we make over it is segmenting. We try to find the individual book spines. Once we find the individual book spines, we do some pretty advanced optical character recognition" on the spines to try to let the machine parse out letters of the alphabet. "Then we mince up the characters into words. And then we match those against the Bowker Books in Print database."
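For the technically curious, the last step Hudson describes, matching imperfect machine-read spine text against a catalog, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BitLit's code: the tiny `CATALOG` list stands in for a real database such as Books in Print, the function names and the 0.6 similarity threshold are hypothetical, and the segmentation and character-recognition steps are assumed to have already produced one string per spine.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a real catalog such as Books in Print.
CATALOG = [
    {"title": "Mockingbird", "author": "Chuck Wendig", "publisher": "Angry Robot"},
    {"title": "Dataclysm", "author": "Christian Rudder", "publisher": "Crown"},
]


def normalize(text):
    """Lowercase the text and replace anything that is not a letter or digit with a space."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def match_spine(ocr_text, catalog=CATALOG, threshold=0.6):
    """Return the catalog record most similar to one spine's OCR text, or None.

    The threshold is an assumed cut-off: below it, too few letters were
    recovered to trust the match.
    """
    query = normalize(ocr_text)
    best, best_score = None, 0.0
    for record in catalog:
        candidate = normalize(record["title"] + " " + record["author"])
        score = SequenceMatcher(None, query, candidate).ratio()
        if score > best_score:
            best, best_score = record, score
    return best if best_score >= threshold else None


def read_shelf(spine_texts):
    """Split a shelf's spines into matched records and texts left unmatched."""
    matched, unreadable = [], []
    for text in spine_texts:
        record = match_spine(text)
        if record is not None:
            matched.append(record)
        else:
            unreadable.append(text)
    return matched, unreadable
```

Fed a spine that the machine read as "M0CKINGBIRD CHUCK WENDlG", with a zero for an "o" and a lowercase "l" for an "I", the sketch still lands on the Wendig novel, while a spine reduced to a few stray letters falls into the unreadable pile.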

This elastic optical-recognition search at BitLit, in fact, not only has caught the attention of Bowker, Hudson says, but also of an MIT professor in machine intelligence.

Sometimes, the system doesn't work. And not just because a spine is old and worn. Hudson scans his own shelf as we talk and finds Christian Rudder's recent Dataclysm: Who We Are (When We Think No One Is Looking) from Crown -- interesting shelf old Hudson is keeping, huh? -- and he points out that the title fonts have a kind of graphic overlay that could make it difficult for "machine reading."

In such a case, "when we just can't get enough letters off the book," the spine is outsourced "ironically," he volunteers, "to Amazon's Mechanical Turk." Workers in that system are sent a dozen or so shelfies at a time, including a test case or two to be sure their text recognition work is accurate.
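The test-case idea is a common quality check in crowdsourced work, and a rough Python sketch shows the principle. Everything here is an assumption made for illustration: the item IDs, the dictionary layout, the function names and the rule that a worker must get every test case right are hypothetical, and no actual Mechanical Turk API is being called.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def passes_test_cases(answers, gold, min_accuracy=1.0):
    """Check one worker's batch against test cases with known answers.

    answers: item ID -> the worker's transcription
    gold:    item ID -> the known-correct transcription (the test cases)
    min_accuracy is an assumed policy; 1.0 means every test case must match.
    """
    if not gold:
        return True
    correct = sum(
        1
        for item_id, expected in gold.items()
        if normalize(answers.get(item_id, "")) == normalize(expected)
    )
    return correct / len(gold) >= min_accuracy


def accepted_transcriptions(answers, gold, min_accuracy=1.0):
    """Keep the worker's real transcriptions only if the test cases were passed."""
    if not passes_test_cases(answers, gold, min_accuracy):
        return {}
    return {item_id: text for item_id, text in answers.items() if item_id not in gold}
```

A worker who transcribes the planted spine correctly has the rest of the batch accepted. One who gets it wrong has the whole batch set aside.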

What's more, there's a site at which Mechanical Turk workers review companies that hire them. "The Turkers who review our 'HITs' [human intelligence tasks] say that they really love working on our HITs because most of them are big bookworms. One of them posted that she's actually gone out and bought books because she came across them while doing our HITs" -- working out what the text on a book spine in a shelfie says.

Bundling: Trundling along nicely, thank you

We've been in touch with Hudson several times in the development of his and Muja's company, founded last year. He piloted our own #PorterMeets series with us a little over a year ago. (This was when we learned that the ubiquitous Richard Nash -- just seen, as was Hudson, at our FutureBook Conference 2014 -- was working with BitLit in an advisory capacity.) BitLit, in fact, was on our Best Start-Up shortlist for this year's FutureBook Innovation Awards. And my Bookseller colleague Sarah Shaffi wrote up Michael Serbinis' venture fund investment in the company in May. And, of course, we reported on BitLit's pilot with HarperCollins (US) on a very limited number of trade books in July.

Granted, many of the publishers BitLit has partnered with so far are small and don't arrive with major inventories of ebooks to bundle with print copies. Nevertheless, when preparing for this latest interview with Hudson, I jumped onto the BitLit site and couldn't find the page it once had displayed with participating publishers' logos on it. 

"That's because we have too many publishers to show that way," he tells me. A good problem to have. And the way now to see which publishers are working with BitLit is to go to the Books page and click the dropdown field for "Publisher" on the top nav-bar. You might be surprised. Two hundred publishers make an impressive list.

And getting back to some of the things that Hudson can tell when those shelfies come in from ebook-bundling fans, not only can he confirm that the most books on the aggregate shelf-space involved are from the Big Five but also such interesting points as the fact that F+W Media is number 67 -- out of some 2,000 publishers identified so far on shelfies. 

What does this tell Hudson? "Community. At all these conferences like FutureBook, you talk to publishers and they get all wound up about brands and imprints. But the average reader just thinks about authors and titles. The only time the consumers think about publishers is in a case like F+W where they've done so much about community around what they're doing" in such verticals as Writer's Digest.

"You think about other specific nonfiction houses -- 'That's an O'Reilly book, that's a Lonely Planet book, that's an Elsevier book.' This puts them in an incredible position when Amazon comes calling" for negotiations. If a community-based publisher, Hudson says, doesn't like the terms the big retailer is offering, they can say, "Look, we have community here, we can sell direct."

And the genius of what the shelfies are showing, then, is that a publisher can go to BitLit, find out from the shelfies what print holdings are out there (with consumer privacy protections, of course) and then think about, as Hudson puts it, "Where do we want to build community with bundling -- it enables us to connect with readers we could never find before.

"Bundling has stumbled on this. Lots of people have started with point-of-purchase bundling. But when you separate the bundling transaction from the original print purchase, and allow it to happen years later, as we're doing, then you provide the bundling option to your customer when he needs it, when he's going on that field trip up to the Red Dog Mine and needs his handbook" from Elsevier. "For $89, that's a smoking deal. And Elsevier now knows that engineer's name and email address and that I'm an engineer in mining in Alaska." Data captured.

Bundling may prove most valuable, as it turns out, as a way to recover lost audiences in a peculiar way.

"And what we're seeing," Hudson says, "is that the publishers most interested in that data are the ones most likely to offer bundling ebooks at the best price. The publishers most intent on building community will be the ones offering to let us bundle those ebooks free or at high discounts. That's why we seem to have the forward-thinking in the industry. The industry leaders are signing up on this one."