July 3, 2010
By David Weinberger, Affiliate, Berkman Klein Center, and Researcher, Harvard's metaLAB
Perspective on Knowledge

Waiting for the fluid book format

Books are complex. Let’s hope someday our standards live up to them.

We’d all like to have a way to move the contents of a book into an ebook and from there into a Web page and then into a display suitable for the tiniest of screens and then have it read itself aloud to someone with impaired vision and then have it automatically decompose into daily blog posts and then reassemble itself into a book, all without any loss of data or metadata. Of course, we’d all like that to be done with nothing but open source tools.

Well, not all of us share the dream. Some of the leading manufacturers of dedicated electronic reading devices view themselves as content companies, not as device manufacturers. They want to vertically integrate themselves so that they are the sole source for the books we read on their devices. They also want to make it difficult to move a book from one device to another, for fear we might share it, the way we share physical books. For if there is one thing we want the digital revolution to provide us, it is more restrictions on how we use what we buy than we have in the physical world. At least that’s what we want in Sarcasm World.

Unfortunately, books are so dense with meaning that it’s a hard dream to realize. Even when the content of the book is slight, the structure of books is complex, and the ways in which format reflects and manifests that structure are more complex still. For example, we expect font size to reflect the level of headings, although sometimes we use italics or underlines. And headings are the easy stuff. We interrupt our nice hierarchical outlines with illustrations, captions, footnotes, asides, links, tables of data and just about anything we can think of.

It gets harder. Some books care a lot about exactly how they’re laid out; designers spend many hours getting each piece in exactly the right place. Some books are all graphics. Some are comic books. Some are pop-up books, but we’ll skip those since there’s no conceivable way of replicating them on flat-screen monitors ... so far.

And what do we have? A Kindle that doesn’t even know that it’s OK to break hyphenated words across lines. An iPad that does a better job with the formatting and graphics but still doesn’t take advantage of the structure built into books. And we have an open standard, EPub, that has many strengths, but some weaknesses. EPub takes content in well-formed HTML (XHTML, to be exact). It includes a manifest of book parts, but does not know much about those parts’ relations. It usefully separates the structure, content and formatting, letting a designer specify (in a CSS file) exactly how each element is to be formatted, but it leaves the layout up to the ebook reader’s decisions in a way that a PDF file would not. Sometimes that’s what we want, but sometimes it isn’t. The problem isn’t with EPub so much as with the inconsistent demands we make on our documents.
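To make the point about the manifest concrete, here is a minimal sketch in Python. The package document below is a trimmed, hypothetical example (the file names and ids are invented for illustration, and a real one also carries metadata), and `list_manifest` is a helper name made up for this sketch. It shows that the manifest is a flat inventory: each part has an id, a path and a media type, and nothing says which chapter a figure belongs to.

```python
import xml.etree.ElementTree as ET

# Namespace used by EPub package documents.
OPF_NS = "{http://www.idpf.org/2007/opf}"

# Hypothetical, trimmed package document; names are illustrative only.
SAMPLE_OPF = """<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="style" href="book.css" media-type="text/css"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="fig1" href="figure1.png" media-type="image/png"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>"""


def list_manifest(opf_text):
    """Return (id, href, media-type) for every item in the manifest."""
    root = ET.fromstring(opf_text)
    return [
        (item.get("id"), item.get("href"), item.get("media-type"))
        for item in root.iter(OPF_NS + "item")
    ]


for part in list_manifest(SAMPLE_OPF):
    print(part)
```

The spine gives a reading order, but the stylesheet, the chapter and the figure all sit side by side in the manifest as peers.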

The big weakness in EPub is that it is so darn finicky. If you thought XHTML was fussy—make sure you close every door behind you, and don’t you dare butter your bread from the outside in!—you haven’t met its OCD cousin. Of course, I’m writing this having spent just about a full day trying to wrangle an uncomplicated book through its many hoops, including (and this to me is just too much) making sure that one of the various files is the first in the zip you ultimately produce. These are files with distinctive names and extensions that a compiler couldn’t miss. Aarggh!
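For anyone facing the same hoop, here is a minimal sketch of the file-ordering step in Python, using only the standard library. It assumes the file in question is the container's `mimetype` entry, which is expected to come first in the zip and to be stored uncompressed. The function name `build_epub` and the placeholder file contents are invented for illustration, so the result demonstrates the ordering and is not a valid book.

```python
import io
import zipfile
from typing import Mapping


def build_epub(files: Mapping[str, bytes]) -> bytes:
    """Zip the given files, writing the mimetype entry first."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # First entry: the mimetype file, stored without compression.
        archive.writestr(
            "mimetype",
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else follows, compressed as usual.
        for name, data in sorted(files.items()):
            if name == "mimetype":
                continue  # already written above
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


# Placeholder contents, for illustration only.
book = build_epub({
    "OEBPS/chapter1.xhtml": b"<html/>",
    "META-INF/container.xml": b"<container/>",
})
print(zipfile.ZipFile(io.BytesIO(book)).namelist()[0])  # prints "mimetype"
```

Writing that one entry by hand before looping over the rest is the whole trick; most general-purpose zip tools add files in whatever order they find them, which is why the step is so easy to get wrong.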

Sigh. Thanks, I feel better now.

Even with an easier compiler, the problem of books is not going to be easy to solve. It may take multiple standard formats, just as for Web pages we sometimes want HTML, sometimes PDF, sometimes something more programmatic and data-driven. But we won’t get any of these unless and until the makers of ebook readers decide that the value of those devices is not that they connect us to a proprietary bookstore, but that they do a hell of a job displaying books and an even better job giving us the tools to navigate, understand, share and reuse what our books are about.
