Libraries and copyfraud

For the last week, I’ve been exchanging emails with curators at the Huntington Library about their use policies for digital images. For the Darwin Day 2009 Main Page effort on Wikipedia, I’ve been putting together a list of portraits of Darwin. Although a number of websites have significant collections of Darwin images, there isn’t any single comprehensive collection. One interesting shot I came across is an 1881 photograph, possibly the last one before Darwin’s death, that was allegedly “rediscovered” in the mid-1990s when a copy was donated to the Huntington. Press releases and exhibition descriptions invite people to contact the Huntington to request images, so I requested the Darwin photo. The response I got was typical of how libraries and archives deal with digital copies of rare public domain material.

The Huntington quoted distribution fees for the digital files (different sizes, different prices), and also asked for specific descriptions of how the image would be used, so that the library could give explicit permission for each use. Had I wanted to use it for more than just publicity (e.g., in a publication) more fees would apply. Apparently the curators were not used to the kind of response they got back from me: I politely but forcefully called them out for abusing the public domain and called their policy of attempting to exert copyright control over a public domain image “unconscionable”.

In the exchange that followed, I tried to explain why the library has neither the moral nor legal right to pretend authority over the image (although, I pointed out, charging fees for distribution is fine, even if their fees are pretty steep). A Curatorial Assistant, and then a Curator, tried to explain to me that the Huntington actually has generous lending policies (you don’t “lend” a PD digital image, I replied), that while the original is PD, using the digital file is “fair use” that the library has the right to enforce (fair use, by definition, only applies to copyrighted works, I replied), that having the physical copy entails the right to grant, or not, permission to use reproductions (see Bridgeman v. Corel, I replied), that other libraries and museums do the same thing (that doesn’t make it right, I replied), that big corporations might use it without giving the library a cut if they didn’t claim rights (nevertheless, claiming such rights is called copyfraud and it’s a crime, I replied), and finally that I should contact the Yale libraries and museums and see if they do things any differently (a return to the earlier “everyone else does it” argument with a pinch of ad hominem for good measure, to which I see no point in replying).

Unfortunately, the Curator is right that copyfraud is standard operating procedure for libraries and archives. Still, I think it’s productive to point out the problem each time one encounters it; sooner or later, these institutions will start to get with the program.

As an aside, the copyright status of this image is rather convoluted. The original is from 1881. The photographer, Herbert Rose Barraud, died in 1896. The version shown here (originally; now lost) is a postcard from 1908 or soon after, making it unquestionably public domain. It comes from the delightful site Darwiniana, a catalog of the reproductions and reinterpretations of Darwin’s image that proliferated in the wake of his spreading fame. Apparently, when the image was “rediscovered” in a donation to the Huntington, they thought it had never been published and was one of but two copies; a short article about the photograph appeared in Scientific American in 1995. Had it actually never been published until then, it would arguably be under copyright until 2047 because of the awful Copyright Act of 1976. I say “arguably” because the vague definition of “publish” and the rules for copyright transfer (“transfer of ownership of any material object that embodies a protected work does not of itself convey any rights in the copyright”), combined with the fact that another copy exists, would seem to indicate that, at the very least, the Huntington has no place claiming copyright. Paradoxically, publishing it for the first time in 1995 would have extended the copyright to 2047 but would have made the Huntington and/or Scientific American into violators of the copyright of whoever actually owned it (which would likely be indeterminable). But if it had remained unpublished, it would be public domain. I’m still unclear about whether it would have been public domain before 2002, when the perpetual copyright window of the 1976 law closed.

UPDATE – My thanks to the others who’ve linked to and discussed this post:

Wikipedia’s search engine dominance = informational homogeneity?

Nicholas Carr (of “Is Google Making Us Stupid?” fame) is a consistent source of thought-provoking but (in my view) off-base critiques of the information age in general and Wikipedia in particular. He has an interesting post on the Britannica Blog, “All Hail the Information Triumvirate”. This coincides with Britannica’s rollout of new features to invite readers to suggest improvements, and some of the usual impotent snipes from Robert McHenry and other Britannica editors. Wikipedia gets 97% of all encyclopedia traffic on the Internet, so they have little to do but whine about the culture that let this happen and/or try to learn from Wikipedia’s success.

A favorite tactic of Wikipedia critics is to bemoan Wikipedia’s search engine success. Carr demonstrates Wikipedia’s dominance of results from the most popular search engine (Google), showing that for ten diverse searches that he first ran in August 2006, then again in December 2007, and again this month, Wikipedia articles rose from an average placement of #4 to being the #1 hit for all ten searches. Carr “wonder[s] if the transformation of the Net from a radically heterogeneous information source to a radically homogeneous one is a good thing” and has difficulty imagining “that Wikipedia articles are actually the very best source of information for all of the many thousands of topics on which they now appear as the top Google search result.” But this rings hollow without examples (say, for any of his ten searches) of what single web pages would be better starting points.

The idea that the Net has become “radically homogeneous” just because Wikipedia is often the first Google hit is absurd. Wikipedia itself is far from homogeneous, and indeed its great strength is the way it brings together the good parts of many of the other sources of information on the Internet (and beyond). Carr’s implication seems to be that without Wikipedia (the “path of least resistance” for information delivery) search results would be better and finding valuable web content would be easier.

Carr seems to conceive of Wikipedia as a filter placed over Google that lets through only a homogeneous mediocrity. Wikipedia is better thought of as a refined version of Google’s method of harnessing the heterogeneity of the Internet; where Google relies on a purely mechanical process, Wikipedia brings together sources with consideration of the individual topic at hand and human evaluation of the importance and reliability of each source.

Public weighs in on Flagged Revisions

Andrew Lih’s blog post “English Wikipedia ready for Flagged Revisions?” is a nice overview of the big news this week: it seems likely that some form of the Flagged Revisions extension is finally going to be used. For more details on the on-wiki discussion, this soon-to-be-published Signpost article is a good place to start.

The comments on the NYT blog story on this development give a nice cross-section of public perceptions of Wikipedia among the Times’ audience, and their reactions to the possible change in the way the site works. Some choice quotes:

  • It’s a cesspool of misinformation and bias. Now that the Wikipedians are in charge, it will become even more useless as a reliable resource.

    Someone needs to be monitoring the Wikipedians. They are not to be trusted with the interpretation of things. -Wango

  • It’s a living, multidimensional document and I’m of the mind that it should be left the frak alone […] WIKIPEDIA NEEDS MISTAKES if it is to remain the vital document that it is today. Living things change, static dead things are perfect and immutable. -jengod
  • It’s not arrogant for wikipedia – or any source of authoritative information – to want to be right […] Grafitti on the wall may be instructive, but it does not make the wall more valuable or more purposeful. -Frank
  • Any edit beyond spelling, grammar and syntax, must be considered suspect, if done by a minor, an artist or any individual that does not have any expertise on the subject. -CGC
  • The real bad blunders are almost always corrected within hours (if the article is of no great interest) or minutes (if it is). So why bother? The true capital of Wikipedia is ALL of its contributors – and not just the “trustworthy” elite. Such measures will discourage new, fresh, motivated contributors, and in the long run dry out the project. -Oystein
  • It’s a standard fascist procedure to declare an outrage and then restrict freedoms under the guise of making things better for all. I’m not saying that’s what Wales is doing. Just saying that it sounds like a jack-booted tactic. -Kacoo
  • Is it possible that [the anons who ‘killed’ Kennedy and Byrd] weren’t vandals at all, but just people trying to be “that guy” who made the change to such an important entry. Who knows? -Light of Silver

Creative Commons on whitehouse.gov

The Obama transition team released most of its images and text under a Creative Commons Attribution 3.0 License. Now this has carried over to the White House as well. Material produced by the federal government, of course, is public domain. But according to the copyright page on the new whitehouse.gov:

Except where otherwise noted, third-party content on this site is licensed under a Creative Commons Attribution 3.0 License. Visitors to this website agree to grant a non-exclusive, irrevocable, royalty-free license to the rest of the world for their submissions to Whitehouse.gov under the Creative Commons Attribution 3.0 License.

Such an endorsement can only be a good thing for the free culture movement.

Biting the newbies on Wikipedia

An example:

  1. New user creates article in good faith.
  2. Two minutes later, editor tags it for speedy deletion; article “does not indicate the importance or significance of the subject” (even though it does make a basic claim of notability).
  3. New user responds on talk page, explaining in more detail why the subject is significant by noting newspaper and magazine coverage.
  4. Administrator deletes article without either checking talk page or verifying speedy deletion rationale: “No indication that the article may meet guidelines for inclusion”.
  5. Newspaper publishes yet another piece of journalism that makes Wikipedia seem like a petty and unfriendly place and shows how overzealous deletion makes Wikipedia worse.

Will the Stanton usability grant stop Wikipedia community atrophy?

The recent Stanton Foundation grant to improve MediaWiki’s usability hopefully will lower the barrier for computer novices to get started on Wikipedia editing. This comes at an opportune time: we recently learned that the size of the Wikipedia community has not only stopped growing exponentially, it actually has been gradually shrinking since early 2007. The most likely causes of the decline include:

  • lack of “low-hanging fruit”
  • lack of new potential editors who are just discovering Wikipedia
  • Wikipedia’s scope gradually narrowing to mirror that of traditional encyclopedias (a.k.a., deletionism run amok)
  • Wikipedia’s occasionally expert-unfriendly culture that turns off those with the most to contribute
  • a Wikipedia culture that gives little priority (or even respect) to activities focused on the community itself rather than the encyclopedia
  • the natural decline in participation of early community members; according to Meatball Wiki, users of any online community generally say GoodBye after between 6 months and 3 years unless that community is connected to their offline lives

Usability improvements, it is hoped, will open editing opportunities to people who are scared off by the intimidating and sometimes overwhelming markup that appears when one clicks “edit”.

Whether or not this will halt or reverse the decline in editing activity on English Wikipedia is tied up with several conflicting currents of thought in the community. As Liam Wyatt and Andrew Lih have been pointing out in recent Wikipedia Weekly podcasts (66 and 68 are both very astute discussions), the standards for what is and is not valuable content have been shifting consistently towards the conventional encyclopedia definition of valid topics. Quirky lists, small organizations that don’t meet the ever-harsher notability standards, obscure books and concepts, anything ScienceApologist finds to be an illegitimate invocation of scientific authority, anything deemed too ‘mere news’, and, increasingly, simply anything that wouldn’t be found in traditional encyclopedias–these are candidates for deletion.

The implications of deletion trends for community health are not entirely straightforward. Overzealous deletion leaves a sour taste in the mouths of many editors who have spent a lot of time adding the kinds of content that now get deleted regularly. Some leave because of it, or lose their enthusiasm. On the other hand, a lot of what gets deleted is simply weak, unsourced content; removing it from the article pool means that new editors will not base their own contributions on such bad examples. Deleting content on the borderline of notability, or better yet, downright notable and significant topics, also replenishes the supply of low-hanging fruit. If someone thought a topic deserved an article, someone in the future may think the same thing and recreate it in better form. Citizendium recognized the advantage of redlinks early on, and decided to start from scratch rather than from a Wikipedia dump.

And while about two-thirds of those polled want to see Flagged Revisions implemented, the other third think it would be too much of a dilution of the “anyone can edit” ethos. Although I’m in favor of Flagged Revisions, it’s not clear to me whether it would improve or worsen the problem of community atrophy. It’s a question of balance: some people are drawn in by ‘instant edit gratification’, while others are turned off by the perceived free-for-all nature of Wikipedia and assume their contributions would simply be swept away in the chaos. So the lure of stability might or might not outweigh the immediate thrill of seeing one’s edits go live. (I suspect the waiting, and the tacit acknowledgement of good work when someone approves a newbie’s edit, would do more to draw in new users to the community than the instant, impersonal status quo.)

So how would improved usability shake things up? On the one hand, it might spark a wave of naive article creation followed immediately by a wave of deletion of new content produced by newbies with no grasp of the community’s standards. If someone can’t or won’t figure out how to use basic wiki markup (says the cynic), how can we expect them to use proper sourcing and adhere to Wikipedia’s core policies of NPOV and Verifiability? Lowering the barriers to entry might just exacerbate the us-versus-them mentality of deletionism. On the other hand, maybe a host of new users would integrate well with the community and restore some of its past vitality while pulling the philosophical center back a bit from the deletionist brink. (Of course, it’s an open question how much usability improvements could actually affect the influx of new users; the difference might be rather small, if lack of tech savvy is highly correlated with other factors that make people unlikely to edit.)

As Erik Zachte has pointed out (in the earlier version of this post), many Wikipedias are still growing; English Wikipedia is not the be-all, end-all. It is not clear whether each language will follow a similar pattern in the rise and peak of community (accounting for number of speakers, connectivity, and economic issues) or whether different languages can develop sufficiently different Wikipedia cultures to avoid the failings of English Wikipedia (or perhaps generate unique problems of their own).