The size of topics

This article appears in the issue June 2005 [Volume 14, Issue 6]

Nicholson Baker has a lovely essay in his book "The Size of Thoughts" about, well, the size of thoughts. He asserts as if it's obvious that most are about three feet tall. Now, that's absurd of course, but Baker's essay works because it lets him talk about the thoughts that are exceptionally small and the ones that are way bigger than any of us.

When it comes to the size of topics, the absurdity drops away entirely. We can measure how many topics there are and how big they are by looking at some objective measures. What we'll see, I believe, is that topics are getting smaller and more local. And since topics are what knowledge is about, knowledge itself must be changing as well.

So, here are some obvious sources of topic sizes as well as topic numbers:

The Encyclopedia Britannica has about 65,000 topics spread across 32 volumes, for a total of 44,000,000 words. So now we know that the average size of a topic is about 677 words. This is deceiving, though, because the Britannica is famous for running articles up to 10 times longer than previous encyclopedias.

The Library of Congress organizes its 17 million books into 285,000 subject headings. The Library is quite generous with its topics: If a book comes in that doesn't fit any of the existing ones, all it takes is a simple proposal and a committee vote to add a new one. It adds about 8,000 a year.

Wikipedia has 500,000 topics in English and about 250,000 user-created tags. Average length (using year-old figures): 294 words ... less than half the size of Britannica articles.

The size and number of Wikipedia topics are set bottom-up. Anyone can contribute or edit an article. So, topics there assume their natural size--natural, that is, to the set of folks comfortable with creating and editing articles online.

The question of who gets to decide what constitutes a topic is crucial. This is nowhere clearer than with the media. What constitutes news is determined by editors who meet daily to decide what to include, and how to rank order what they include. Vietnam veteran spits on Jane Fonda? News! New technology enables Oxford scientists to read the writing on 400,000 ancient documents discovered in Egypt? Maybe, so long as there's room left over from the coverage of the latest celebrity pedophilia trial.

The Britannica's choice of topics is governed by loftier goals. It's limited only by the cost of paper and its own sense of dignity. Put these together and we get topics of a particular size: There has to be enough in them to warrant the bold-faced heading, but not so much as to make them metaphysical. And, of course, the selection of topics is guided by the culture in which the Britannica's embedded: No one would expect the Encyclopedia Iranica, for example, to cover the same topics the same way.

But what happens when you don't have an authority in charge of what constitutes a topic worth the electrons, paper or breath?

Wikipedia shows that we quickly head toward an order of magnitude increase in the number of included topics. They become far more granular and local. With the famous "Heavy Metal Umlaut" page, Wikipedia's idea of a topic heads toward the "Didja ever notice that ... " level of standup comedy. Not that there's anything wrong with that.

But Wikipedia is still a controlled environment. Control has passed to norms that regulate the back-and-forth revision process. That's a lot different than the list of authorities at the back of The Britannica, but it's still a form of control. How big would topics be if no one controlled them?

We'll never know because topics always have some element of control. At least they're always shaped socially. It's not a topic until more than one person is talking about it. Otherwise it's just a rant or a ramble. A topic has to have a little persistence, and that persistence comes through and across social interaction.

Here's why this should matter to people who care about knowledge management. Topics are how we cluster what matters to us. KM is admittedly easier if the topics are fewer, broader and contained. But knowledge grows better when topics are enabled to grow to their most natural size and assume their most natural shape: small, loose-edged and occasionally ridiculous.

David Weinberger edits "The Journal of the Hyperlinked Organization."

