Folksonomy Folktales 2010
During a recent project, I had occasion to review the latest writings about folksonomies in general, and about comparisons of folksonomies and taxonomies in particular. I’ve been a bit leery of a lot of the claims for folksonomies, but wanted to see what, if any, new ideas there might be. What I found was, on one hand, some interesting experiments in combining taxonomies and folksonomies, and on the other hand, a whole lot of very enthusiastic writings about why folksonomies are better than taxonomies. What I didn’t find was a very good discussion of the relative strengths and weaknesses of the two. Instead, there was article after article repeating the same myths, folktales, and misconceptions.
So, what I’d like to do now is take a look at some of these myths, folktales, and misconceptions and try to dig a bit deeper and ultimately see if we can’t come up with a more realistic view of both folksonomies and taxonomies.
Before we get into specific myths and misconceptions, I’d like to first take a look at a fundamental flaw in the vast majority of articles on folksonomies and taxonomies: the almost universal use of the Dewey Decimal System (or Library of Congress Subject Headings) as the example taxonomy. Using the Dewey Decimal System as your example taxonomy when trying to discuss the pros and cons of taxonomy and folksonomy says to me that either you have no understanding of taxonomy creation and use in today’s world, or you are so set on showing the superiority of folksonomies that you have set up a rather silly strawman to then gleefully knock the stuffing out of.
It’s as if the question you are exploring is the relative merits of jet skis versus boats, and you pick the latest model of jet ski and talk about how great it is: how nimble, how quick, how cheap, how they swarm all over the place and are really fun. And now for boats, you pick as your example – the Titanic. It’s really big and cumbersome and made of brittle steel held together with bad rivets and it costs too much and it’s too hard to build and it’s slow and hard to steer and runs into icebergs and kills lots of people. OK, so jet skis are better than boats – or jet skis have all these great characteristics but yeah, if you can afford it the Titanic has more comfortable beds.
But wait, sailboats are boats too – and they are much smaller, cheaper, easier to build, and lots of fun. And you know what – there are lots of taxonomies that are smaller than the Dewey Decimal System, easier to construct and use, less rigid, have built-in revision procedures and user input capabilities, and generally don’t suffer from all those “characteristics” of taxonomies that folksonomy advocates love to list.
So think about this — if the Dewey Decimal system is the only example of a taxonomy you can think of, maybe you shouldn’t be writing about taxonomy and folksonomy. Or at least do some more research into the kind of smaller, more flexible, more responsive taxonomies that are being developed. There is more to taxonomy than the Dewey Decimal System.
Another overall impression I got from a review of new and old articles on folksonomies is that most of them are guilty of rather massive overhype, showing a great deal of enthusiasm but not so much careful thought. For example, let’s take a look at the opening of an often-cited article:
“The Hive Mind: Folksonomies and User-Based Tagging”
by Ellyssa Kroski
“There is a revolution happening on the Internet that is alive and building momentum with each passing tag. With the advent of social software and Web 2.0, we usher in a new era of Internet order. One in which the user has the power to effect their own online experience, and contribute to others’. Today, users are adding metadata and using tags to organize their own digital collections, categorize the content of others and build bottom-up classification systems. The wisdom of crowds, the hive mind, and the collective intelligence are doing what heretofore only expert catalogers, information architects and website authors have done. They are categorizing and organizing the Internet and determining the user experience, and it’s working. No longer do the experts have the monopoly on this domain; in this new age users have been empowered to determine their own cataloging needs. Metadata is now in the realm of the Everyman.”
I have to admit that my first reaction was, “Oh no, not another revolution! Didn’t we just have one last year and a couple the year before?” Maybe I’m getting jaded, but I really think that we should be a bit more careful not to cheapen a very good word. The printing press was a revolution. The Industrial Revolution was a revolution. The Internet is an ongoing revolution. But folksonomies? I think not.
Note: In the following discussion we will distinguish folksonomy from user-generated tags. The focus will be on folksonomy because that is what is mostly being written about and it has what many writers believe is the essential characteristic: social feedback. One reason for the distinction is that user-generated tags are not tied so completely to the idea of some sort of order emerging from people simply seeing how others are tagging.
Aside from the revolutionary fervor, this quote also exemplifies several of the standard general folktales about folksonomies.
Folktale One: Folksonomies are examples of the wisdom of crowds.
Actually, folksonomies are the exact opposite of the wisdom of crowds. If you read James Surowiecki’s book, The Wisdom of Crowds, the key characteristic of situations in which you get a wisdom of crowds effect is that no one can be aware of what anyone else is doing. The reason that a crowd of amateurs can guess the weight of an ox better than an expert is that every guess is completely independent of what the crowd is doing. If you publish the guesses as they are being made, what you get is not the wisdom of crowds, but the madness of crowds – the bandwagon effect. Which means that folksonomies that publish tag clouds of popular tags do not exhibit a wisdom of crowds effect.
Of course, you’re free to use the phrase, Wisdom of Crowds, to mean something else – perhaps that throwing a lot of people at a problem, regardless of how you set it up, will inevitably lead to a good outcome, but in that case, I would suggest that you read some history starting with Tulip Mania and going through to the latest real estate boom and bust, and for good measure, throw in the book, The Cult of the Amateur by Andrew Keen for a counter view.
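To make the independence point a bit more concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the true weight, the spread of individual estimates, and especially the herding rule (each new guess is a blend of the running average of published guesses and the person's own estimate). It is not taken from Surowiecki's book; it only shows, under those assumptions, how publishing guesses as they are made can let the first few guesses dominate the crowd's answer.

```python
import random
import statistics

TRUE_WEIGHT = 1200.0   # hypothetical true value the crowd is guessing
NOISE_SD = 150.0       # assumed spread of each person's private estimate
CROWD_SIZE = 100
HERD_WEIGHT = 0.9      # assumed pull toward the published running average


def private_estimates(rng):
    """Each person's own estimate: unbiased but noisy."""
    return [rng.gauss(TRUE_WEIGHT, NOISE_SD) for _ in range(CROWD_SIZE)]


def independent_crowd(estimates):
    """Nobody sees anyone else's guess; the crowd answer is the plain average."""
    return statistics.fmean(estimates)


def herded_crowd(estimates):
    """Guesses are published as they are made; each new guess leans on the running average."""
    published = []
    for own in estimates:
        if published:
            seen = statistics.fmean(published)
            guess = HERD_WEIGHT * seen + (1 - HERD_WEIGHT) * own
        else:
            guess = own  # the first person has nothing to copy
        published.append(guess)
    return statistics.fmean(published)


def mean_abs_error(crowd_fn, trials=300, seed=2010):
    """Average distance between the crowd's answer and the truth over many simulated crowds."""
    rng = random.Random(seed)
    errors = [abs(crowd_fn(private_estimates(rng)) - TRUE_WEIGHT) for _ in range(trials)]
    return statistics.fmean(errors)


independent_error = mean_abs_error(independent_crowd)
herded_error = mean_abs_error(herded_crowd)
print(f"independent crowd, mean abs error: {independent_error:.1f}")
print(f"herded crowd, mean abs error:      {herded_error:.1f}")
```

Both crowds are built from the same private estimates, so the only difference is whether people can see what has already been guessed. In this toy model the herded crowd's answer ends up tracking whoever happened to guess first, which is the bandwagon effect in miniature.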
Folktale Two: Folksonomies are building bottom-up classification systems.
First of all, folksonomies are not a classification system; they are an unordered, flat set of keywords that are ranked by popularity. Ranking words by their popularity can tell you a great deal about how groups of people are thinking and that information can be extremely useful, but it does not tell you much of anything about the relationships between words or concepts. In other words, there is no “onomy” in folksonomy.
Folktale Three: Folksonomies are working
This one is somewhat in the eye of the beholder, but it seems to me that the claims for success for folksonomies are vastly overstated. Yes, there was a flurry of activity when Delicious and Flickr first came out, and a number of similar sites sprang up on the Internet. However, the growth of new sites seems to have hit a plateau, and a closer examination reveals that the number of people actually adding tags remains pretty small. For more on this question, take a look at the sections on the limits of folksonomies.
Also, there are two parts to “working”. The first is folksonomies as tagging, discussed above. The second is about using folksonomies for actually finding information, and here the answer is clearer – they are not all that powerful a mechanism for finding information. Browsing for like-minded people and using folksonomies for serendipitous browsing is fun and occasionally useful, but it represents a very small percentage of search behavior.
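A small sketch may help show the difference between a popularity ranking and a set of relationships. The tags, the terms, and the tiny "broader term" table below are all hypothetical, invented for this illustration; real folksonomy sites and real taxonomies are far larger, and this is only one simple way to represent either of them.

```python
from collections import Counter

# Hypothetical tags that different users applied to a handful of bookmarks.
tag_events = [
    "python", "programming", "python", "snake",
    "python", "programming", "todo", "python",
]

# A folksonomy view: a flat list of keywords ranked by how often they were used.
popularity = Counter(tag_events).most_common()
print(popularity)

# A (tiny, made-up) taxonomy view: each term points to its broader term.
broader = {
    "python (language)": "programming languages",
    "programming languages": "software development",
    "python (snake)": "reptiles",
    "reptiles": "animals",
}


def broader_terms(term):
    """Walk up the hierarchy and return every broader term, nearest first."""
    chain = []
    while term in broader:
        term = broader[term]
        chain.append(term)
    return chain


print(broader_terms("python (language)"))
print(broader_terms("python (snake)"))
```

The ranked list tells you that "python" is the most used tag, but nothing in it says whether "python" means the language or the snake, or that "programming" is the broader idea. The second structure can answer exactly those questions, because the relationships are stored explicitly.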
Folktale Four: Metadata works best when it is free – in the realm of Everyman
This seems to me to be basically pure ideology and expresses more of a cultural and political philosophy than an actual claim (see the section Why the Fuss later in this article). I haven’t seen any evidence for this claim, and the experience of information architects and librarians asking people to tag documents in an enterprise environment strongly suggests the opposite. When the choice was between metadata tags generated by users and authors without any guidance from a taxonomy and those created by IAs and librarians from a controlled vocabulary or taxonomy, the answer was pretty clear: the latter produced much better results.
On the other hand, it was always a struggle to get people to add metadata at all and if we again restrict our attention to the actual act of tagging, then yes, folksonomies seem to have an advantage in that it is easier to get some people to just think of a tag off the top of their head than to select a value from a complex taxonomy. But there are two caveats. First, as we see from the number of people tagging, folksonomies don’t give you the kind of coverage we were looking for outside of general social bookmarking sites, especially within the firewall of enterprises – that is, getting everyone to tag. So folksonomies work better for getting some people to tag, but not for getting everyone to tag. This is another example of the difference between the Internet where getting anyone to tag is a plus and the enterprise/intranet where having only some documents tagged is not an answer. I’d like to see more research on just how many people are tagging at the different sites and who is doing the tagging.
Also, while I will grant that it is probably easier to think of a tag than select from a complex taxonomy, I’m not so sure that it is easier than selecting from a simple taxonomy. I’d like to see a lot more study on that one, something that goes beyond the visceral dislike of taxonomies by a few well-known authors and the use of our old strawman, the Dewey Decimal System.
Let’s shift from our general folktales to more specific claims and myths about taxonomies and folksonomies. To start, let’s look at another frequently cited article on folksonomies that is a good source for the standard myths about the drawbacks of taxonomies. This same list appears in other articles, so I hope I’m not guilty of erecting a strawman to knock down. Many of these “drawbacks” are also found in one form or another in one of the most often cited articles in this area, Clay Shirky’s Ontology is Overrated: Categories, Links, and Tags.
Folksonomies: power to the people
Drawbacks of hierarchical schemes (taxonomies)
1. Items do not always fit exactly inside one and only one category.
2. Hierarchies are rigid, conservative, and centralized. In a word, inflexible.
3. Hierarchical classifications are influenced by the cataloguer’s view of the world, and, as a consequence, are affected by subjectivity and cultural bias.
4. Rigid hierarchical classification schemes cannot easily keep up with an increasing and evolving corpus of items.
5. Hierarchical classifications are costly, complex systems requiring expert cataloguers to guess the users’ way of thinking and vocabulary (mind reading).
6. Hierarchies require predictions on the future to be stable over time (fortune telling).
7. Hierarchies tend to establish only one consistent authoritative structured vision. This implies a loss of precision, erases difference of expression, and does not take into account the variety of user needs and views.
8. Hierarchies need expert or trained users to be applied consistently.
This is a well-organized and often repeated list of taxonomy drawbacks. There is just one problem: pretty much every single one of these is wrong, misleading, or overstated, or is a known issue that taxonomists have worked out methods to overcome over years of practice.