
The failure to attribute

This article appears in the May 2013 issue (Volume 22, Issue 5).


A friend of mine discovered recently that a Major Corporation has used his words in a few of its slide decks. The words aren't many—about 10—but they're so distinct that it's unlikely the company happened to come up with them independently. My friend wouldn't be bothered by this, but the slide doesn't attribute those words to him. Attribution matters a whole heck of a lot ... although in another sense, it really doesn't matter at all.

In this case, it probably matters enough to the corporation that it would very likely add my friend's name to the slide right away if it found out it was quoting him. It wouldn't cost the company anything, and presumably it is as ill-disposed to plagiarism as the rest of us are.

But that also shows the way in which attribution doesn't matter. I assume that someone in the company read my friend's words, copied them into an e-mail or some such, and as it got passed around, my friend's name got dropped from the quotation. The person who passed the quote around (in this hypothetical case) did so because she found value in the quote, not in my friend's name. The next person who passed it around may well have never heard of my friend, and so to her his name is just a meaningless string of letters. All the value is in the quote and none is in the name, at least to the people passing it around. It'd be different if the quote were from a name that added some heft or resonance to the quote: Abraham Lincoln, Winston Churchill, Homer Simpson.

Sense of fairness

So, perhaps we should just accept that our most notable words are likely to slip free of our name, and we should be OK with that. Or perhaps we should even be happy that we've managed to contribute to our culture.

Well, sure. On the other hand, it is irksome to hear your words coming out of someone else's mouth as if they were not yours. In fact, our sense of fairness about this is so strong that Creative Commons' licenses, which are designed to make it easier to share our work, nevertheless all insist that attribution be given. Even if we recognize that attribution cannot be perfectly maintained, we have a strong moral intuition that it ought to be.

In the Age of the Internet, the situation gets worse and better.

It gets worse because our words (and pictures and sounds, etc.) move more freely, and since they move because people find the content valuable (but not the attribution), they are likely to shed their attribution somewhere along the way.

This is true of facts as well as of quotes. Our system of knowledge works for us because it enables us to accept sources as authentic. So, we don't have to reopen the map every time we say that Massachusetts is on the east coast of the United States, and we certainly don't have to check the sources that the mapmakers drew upon. It's the fact that has value, not which authority we happened to have used. Thus, facts tend to escape from their attribution as well, especially in an environment as fluid as the Internet.

Fundamental asymmetry

But the Internet also improves upon the picture. One reason people don't cite a source that means nothing to them is that it interrupts what they're saying. If I want to use the phrase "Information wants to be free" in an ironic way in the course of a highly amusing article on paywalls, having to stick in "as Stewart Brand once said" may ruin the flow. But, thanks to the Web, I can hyperlink the phrase to the source, or at least to the Wikipedia article that discusses the sources.

More important, of course, the Net makes it incredibly easy to look up a quote or a fact. What would have taken a trip to the library now can be accomplished at the speed of typing. Of course, I have to be motivated to do so, and the truth is that most of the times I have used "Information wants to be free," I haven't bothered attributing it to Stewart Brand because it was in conversation, a tweet or some written context where it didn't seem to matter enough.

I expect that the fundamental asymmetry of citation will continue: We quote phrases and the like because we value them, whereas the name of the author almost always has no value to us. Therefore, informational entropy will tend to make our words unattributed.

Except for one possibility. Imagine that the tools we use to create text come with an attribution checker, just as they come with spell checkers. This service checks our text, and especially anything between quotation marks, and suggests hyperlinked attributions for us. It might also correct our casual misquotes, so that "Information just wants to be free" and "Information wants to free itself" would get green squiggles under them to flag them as inaccurate.
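To make the idea concrete, here is a rough sketch of how such a checker might work. It is purely illustrative: the `KNOWN_QUOTES` table, the `check_attribution` function and the 0.75 similarity threshold are all assumptions invented for this example, and a real service would need a far larger source of quotations and a smarter matching method than the fuzzy string comparison used here.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lookup table for illustration only; a real checker
# would consult a much larger collection of sourced quotations.
KNOWN_QUOTES = {
    "Information wants to be free": "Stewart Brand",
}


def normalize(text):
    """Lowercase the text and drop punctuation so comparisons ignore both."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def check_attribution(text, threshold=0.75):
    """Return suggestions for quoted passages that resemble a known quote.

    Only passages between straight double quotation marks are examined.
    The threshold is an arbitrary value chosen for this sketch.
    """
    findings = []
    for quoted in re.findall(r'"([^"]+)"', text):
        cleaned = normalize(quoted)
        for canonical, author in KNOWN_QUOTES.items():
            target = normalize(canonical)
            score = SequenceMatcher(None, cleaned, target).ratio()
            if score >= threshold:
                findings.append({
                    "quoted": quoted,
                    "canonical": canonical,
                    "author": author,
                    # True when the wording differs from the known quote
                    "misquote": cleaned != target,
                })
    return findings


if __name__ == "__main__":
    sample = 'As they say, "Information just wants to be free" and so on.'
    for finding in check_attribution(sample):
        print(finding)
```

Run on the sample sentence, the sketch suggests Stewart Brand as the source and marks the passage as a misquote, which is roughly the moment at which a word processor would draw its squiggle.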

The only problem with this idea is that it would immediately be taken over by the copyright cartel that would want you to pay a nickel each time you use any phrase they deem within their purview. That would be a disaster, because if there's anything worse than rampant lack of attribution, it would be an algorithmic insistence upon it.
