Biting the newbies on Wikipedia

An example:

  1. New user creates article in good faith.
  2. Two minutes later, editor tags it for speedy deletion; article “does not indicate the importance or significance of the subject” (even though it does make a basic claim of notability).
  3. New user responds on talk page, explaining in more detail why the subject is significant by noting newspaper and magazine coverage.
  4. Administrator deletes article without either checking talk page or verifying speedy deletion rationale: “No indication that the article may meet guidelines for inclusion”.
  5. Newspaper publishes yet another piece of journalism that makes Wikipedia seem like a petty and unfriendly place and shows how overzealous deletion makes Wikipedia worse.

Will the Stanton usability grant stop Wikipedia community atrophy?

The recent Stanton Foundation grant to improve MediaWiki’s usability hopefully will lower the barrier for computer novices to get started on Wikipedia editing. This comes at an opportune time: we recently learned that the size of the Wikipedia community has not only stopped growing exponentially, it actually has been gradually shrinking since early 2007. The most likely causes of the decline include:

  • lack of “low-hanging fruit”
  • lack of new potential editors who are just discovering Wikipedia
  • Wikipedia’s scope gradually narrowing to mirror that of traditional encyclopedias (a.k.a., deletionism run amok)
  • Wikipedia’s occasionally expert-unfriendly culture that turns off those with the most to contribute
  • a Wikipedia culture that gives little priority (or even respect) to activities focused on the community itself rather than the encyclopedia
  • the natural decline in participation of early community members; according to Meatball Wiki, users of any online community generally say GoodBye after between 6 months and 3 years unless that community is connected to their offline lives

Usability improvements, it is hoped, will open editing opportunities to people who are scared off by the intimidating and sometimes overwhelming markup that appears when one clicks “edit”.

Whether or not this will halt or reverse the decline in editing activity on English Wikipedia is tied up with several conflicting currents of thought in the community. As Liam Wyatt and Andrew Lih have been pointing out in recent Wikipedia Weekly podcasts (66 and 68 are both very astute discussions), the standards for what is and is not valuable content have been shifting consistently towards the conventional encyclopedia definition of valid topics. Quirky lists, small organizations that don’t meet the ever-harsher notability standards, obscure books and concepts, anything ScienceApologist finds to be an illegitimate invocation of scientific authority, anything deemed too ‘mere news’, and, increasingly, simply anything that wouldn’t be found in traditional encyclopedias–these are candidates for deletion.

The implications of deletion trends for community health are not entirely straightforward. Overzealous deletion leaves a sour taste in the mouths of many editors who have spent a lot of time adding the kinds of content that now get deleted regularly. Some leave because of it, or lose their enthusiasm. On the other hand, a lot of what gets deleted is simply weak, unsourced content; removing it from the article pool means that new editors will not base their own contributions on such bad examples. Deleting content on the borderline of notability, or better yet, downright notable and significant topics, also replenishes the supply of low-hanging fruit. If someone thought a topic deserved an article, someone in the future may think the same thing and recreate it in better form. Citizendium recognized the advantage of redlinks early on, and decided to start from scratch rather than from a Wikipedia dump.

And while about two-thirds of those polled want to see Flagged Revisions implemented, the other third think it would be too much of a dilution of the “anyone can edit” ethos. Although I’m in favor of Flagged Revisions, it’s not clear to me whether it would improve or worsen the problem of community atrophy. It’s a question of balance: some people are drawn in by ‘instant edit gratification’, while others are turned off by the perceived free-for-all nature of Wikipedia and assume their contributions would simply be swept away in the chaos. So the lure of stability might or might not outweigh the immediate thrill of seeing one’s edits go live. (I suspect the waiting, and the tacit acknowledgement of good work when someone approves a newbie’s edit, would do more to draw in new users to the community than the instant, impersonal status quo.)

So how would improved usability shake things up? On the one hand, it might spark a wave of naive article creation followed immediately by a wave of deletion of new content produced by newbies with no grasp of the community’s standards. If someone can’t or won’t figure out how to use basic wiki markup (says the cynic), how can we expect them to use proper sourcing and adhere to Wikipedia’s core policies of NPOV and Verifiability? Lowering the barriers to entry might just exacerbate the us-versus-them mentality of deletionism. On the other hand, maybe a host of new users would integrate well with the community and restore some of its past vitality while pulling the philosophical center back a bit from the deletionist brink. (Of course, it’s an open question how much usability improvements could actually affect the influx of new users; the difference might be rather small, if lack of tech savvy is highly correlated with other factors that make people unlikely to edit.)

As Erik Zachte has pointed out (in the earlier version of this post), many Wikipedias are still growing; English Wikipedia is not the be-all, end-all. It is not clear whether each language will follow a similar pattern in the rise and peak of community (accounting for number of speakers, connectivity, and economic issues) or whether different languages can develop sufficiently different Wikipedia cultures to avoid the failings of English Wikipedia (or perhaps generate unique problems of their own).

Wikipedia blogging outside the Wiki Planet orbit

The main discussion platforms in the Wikimedia community can be pretty insular. Lots of people write about their (often negative) experiences with and views on Wikipedia, and only a handful are part of Wiki Blog Planet, post on the Village Pump or the mailing lists, or hang out on freenode IRC. So I like to browse the wider world of Wikipedia blogging. Lots of other people do this too, I know, because usernames I recognize often appear in comments sections. Here’s what I found this time.

  • Have You Ever Edited Wikipedia? – a thoughtful post on notability by Terrance of The Republic of T., explaining why he stopped contributing articles on victims of LGBT-related hate crimes.
  • The coming Wikipedia election. – an interesting take on the way Wikipedia is increasingly significant for U.S. state-level politics, by a Virginia political junkie (User:WaldoJ).
  • Is it safe to edit Wikipedia? – Kelly Martin’s Nonbovine Ruminations isn’t on the wiki blog aggregators any more, but she posted this a few weeks ago.
  • final group project: editing USF’s wikipedia page – University of San Francisco media studies professor David Silver is running a Wikipedia assignment, group editing of the USF article. To be more precise, he’s grading it. The comments on his blog post are heartwarming. See how the USF article has changed since the assignment started two weeks ago.

Also, for those who haven’t seen it, Robert Rohde (User:Dragons flight) has some vital, long-wanted editing frequency statistics for the English Wikipedia community. The long and short of it is that the size of the Wikipedia editing community peaked around March 2007. I’ve been playing around with the data, and there are lots of interesting things hiding in there.
[Figure: Editing frequency, 20 mainspace edits, 2001-2008]
The big research/data crunching questions I have now relate to what the life course of a Wikipedia editor looks like. Anecdotally, active Wikipedians have a typical lifespan of a few years; most of the early contributors have left, and many of the most active editors today joined around the time I did or later (that is, in the 2005-2007 boom). Do many or most editors follow a typical pattern in their editing rate over the course of their involvement (e.g., rapid rise that levels off, then gradually declines before fading away)? Can we expect (or are we experiencing) a generational die-off in the wake of the exponential expansion period? What would a histogram of recent edits sorted by when editors joined look like?
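
As a rough illustration of that last question, here is a minimal Python sketch of how such a histogram might be computed. Everything in it is an assumption made for illustration: the input shapes (a list of (username, edit date) pairs and a mapping from username to first-edit date), the function names, and the sample usernames and dates are all hypothetical, and none of it reflects the format of Robert Rohde’s actual dataset.

```python
from collections import Counter
from datetime import date


def recent_edits_by_cohort(edits, first_edit, since):
    """Count edits made on or after `since`, grouped by the year each editor joined.

    `edits` is an iterable of (username, edit_date) pairs and `first_edit`
    maps username -> date of that user's first edit (both hypothetical shapes).
    """
    counts = Counter()
    for user, day in edits:
        if day >= since:
            counts[first_edit[user].year] += 1
    # Return the cohorts in chronological order.
    return dict(sorted(counts.items()))


def text_histogram(counts, width=40):
    """Render the cohort counts as a plain-text bar chart, one line per join year."""
    peak = max(counts.values(), default=0)
    lines = []
    for year, n in counts.items():
        bar = "#" * (round(width * n / peak) if peak else 0)
        lines.append(f"{year} {bar} {n}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, purely to show the call pattern.
    sample_first_edit = {"Alice": date(2003, 5, 1), "Bob": date(2006, 2, 1)}
    sample_edits = [
        ("Alice", date(2008, 9, 1)),
        ("Bob", date(2008, 9, 3)),
        ("Bob", date(2008, 9, 4)),
    ]
    cohorts = recent_edits_by_cohort(sample_edits, sample_first_edit, date(2008, 8, 1))
    print(text_histogram(cohorts))
```

Grouping by join year is only one possible choice; monthly cohorts would show the 2005-2007 boom in finer detail, at the cost of a noisier chart.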

Tougher questions that probably can’t be answered directly even with really great statistical analysis: Does Wikipedia attract a different kind of editor than it used to? How much of the pool of potential editors has been used up? Are there really significant numbers of potential editors who would contribute if usability issues were addressed?

The scientist in TV dramas

This is a widely-acknowledged Golden Age of American television drama (led, of course, by cable shows, but with network fare that also has its high points). (I’m two discs into Deadwood right now, which is the one show that is usually mentioned in the same sentence as The Wire in terms of really great shows.) One remarkable thing that’s happened recently, especially this season, is the flood of scientists as main characters. Several established shows have main characters who derive much of their identity and personality from being scientists: House, Bones, and (to some extent) Mohinder Suresh from Heroes. More than earlier shows in the same genres (medical dramas, forensic science dramas, superhero dramas), these shows and their characters explore the meanings of what it is to be a scientist in modern society.

But two new shows this season, Fringe and Eleventh Hour, are about science to an unprecedented extent (even including The X-Files and Star Trek: The Next Generation, but excluding CBC’s ReGenesis and the four-episode British version of Eleventh Hour, neither of which I’ve yet seen).

Fringe, and its main scientist character, showcase science-as-threat; Walter Bishop is Dr. Frankenstein for the era of Big Science. In his previous scientific life, Bishop had worked for the government and others on an endless array of “fringe science” research projects, mostly aimed in various ways at controlling the minds and bodies of people living and dead. Institutionalized for years, Bishop is now out and, working out of his old lab at Harvard, is helping the FBI investigate “The Pattern”, a big-business-linked series of weird and deadly happenings that are often the scientific monsters Bishop had helped to create. In Fringe, science is not just a threat to society, it is (inherently?) a threat to the moral fiber and mental health of the scientist. Bishop is an otherwise kindly old man whose broken personality combines a self-centeredness that is presented as, at least partly, a mental health issue with alternating child-like naivety and (in the course of performing science) shocking callousness. Fringe is by no means a serious show, but it does articulate an interesting, and I think significant, interpretation of what it means in American culture to be a scientist.

If Fringe is in part inspired by the works of Michael Crichton, as creator J. J. Abrams claims, then Eleventh Hour is inspired by the other part of Michael Crichton’s works–that is, the part that deals with the moral and ethical dimensions of science as it is actually practiced, rather than the outlandish threats of science gone wild. The compelling main character, biophysicist Jacob Hood, also works for the FBI investigating science-related crimes and mysteries. But where Walter Bishop is pulled, out of dire necessity, from an asylum, Hood was recruited because (in addition to his brilliance) he was friends with someone who ended up in a position of power in the FBI. Most of the crimes involve acute threats to one or a few people, but there is no overarching conspiracy, no Pattern of misused science. Rather, the criminals are usually scientists doing realistic but scientifically/ethically/morally questionable research (often in commercial contexts), or the people who oppose what they do. Hood treads the line between genuine scientific enthusiasm (often accompanied by patronizing bemusement at his female FBI handler’s scientific ignorance) and an ironic detachment from, and quiet disapproval of, less-than-pure but not egregiously bad ways of doing science.

What does the recent prominence of science and scientists tell us about American culture and the place of science in society? I don’t know, but I feel that it’s my scholarly responsibility to keep watching until I figure it out.

These are the kind of stories Wikinews should be doing

The election numerology blog fivethirtyeight.com has been publishing a series of fascinating “On the road” posts by Sean Quinn and photographer Brett Marty. Quinn and Marty have been traveling through battleground states investigating the “ground game” of the McCain and Obama campaigns, reporting on the voter registration and get-out-the-vote operations managed by volunteers and paid staffers in the regional and local campaign offices.

See the latest few:

Individually, these might seem minor, but the series as a whole makes for an important story that has been largely neglected by traditional news sources. It’s also the type of thing Wikinews could excel at, with a little more organization. Wikimedians all over the U.S. could go out the same weekend and do stories on the local dimensions of these national campaigns, and the result could be something very special.

Bonus link:

  • The Wikipedian Candidate – an interesting analysis of the (it seems increasingly clear) ill-advised selection of Sarah Palin as McCain’s VP and the important things that don’t come across in a Wikipedia article, from fivethirtyeight.com’s Nate Silver

How are your Wikimedia Commons photos being used elsewhere?

I don’t know about yours, but I do have some idea of how mine are being used.

Google searches for my name and my username reveal a lot more instances than I was aware of, especially for news article illustrations.

In the “license, schmicense” category, I found this article from The Jerusalem Post, which takes a recent photo of mine (either from Flickr or Wikipedia, but more likely Wikipedia) and simply says “Photo: Courtesy:Ragesoss”.

Marginal cases include the hundreds of Google hits for “ragesoss” that come from World News Network websites. This organization runs thousands of online pseudo-newspapers, such as the West Virginia Star and Media Vietnam, that aggregate content from real news organizations. Stories at all of their portals link to World News pages that have teasers for the actual articles at the original sources. And I’ve found a bunch of my photographs as illustrations on these pages. See these:

Of course, my photographs are not the ones used by the original articles. World News seems to have used almost every photo I uploaded from the February 4 Barack Obama rally in Hartford, to illustrate campaign news unrelated to the Hartford rally. In terms of photo credits (see the links), most of them say “photo: Creative Commons / Ragesoss” or “photo: GNU / Ragesoss”. Nearly all of my photos on Wikimedia Commons are copyleft under GFDL and/or CC-by-sa, so non-specific credits like that do not constitute legitimate use under the terms of either license. The GFDL requires a link to the license (GFDL, not “GNU”), and CC-by-sa at least requires notice that the image is free to reuse as long as derivatives are issued under the same license (simply “Creative Commons” is not a license). It is also implicit with CC licenses that credits for my photos should include a link to my Commons userpage, since the author field on the image pages is typically a link titled “Ragesoss”, not just the text. (The third link above, among others I found, does link to the GFDL, although the photo has nothing to do with the article.)

Another major user of my photos is Associated Content, a commercial user-generated content site that pays contributors. AC is a mixed bag in terms of legitimate uses of photos, since individual contributors are responsible for selecting and crediting the illustrations for their articles. This one, which uses a photo of Ralph Nader, credits my shot as “credit: ragesoss/wikipedia copyright: ragesoss/GNU FDL 1.2”. It almost meets the basic requirements of the license (all it needs is a link to the text of the license), although a link to the source would be preferable to simply mentioning Wikipedia. This one, on the other hand, just says “credit: Ragesoss copyright: Wikimedia Commons”.

Popular Science, in this article, lists the GFDL, but links it to the Wikipedia article on the license rather than the actual text.

The Bottle Bill Resource Guide links to my Commons userpage, but does not list the license or link to the image source.

Another partly-legit use is by LibraryThing, a book-related site that uses several of my photos for authors (e.g., Dava Sobel). They include links back to the original image pages, but the site behaves erratically and sometimes insists on me signing in or creating an account to view the image details.

Unexpectedly, I also found several of my photos illustrating Encyclopedia Britannica. See:

In each case, they provide a link to one of the licenses (GFDL 1.2 and CC-by-sa 3.0 unported, in these cases), although they don’t provide a userpage link. At least they seem to take the licenses seriously.

Of course, it’s much tougher to find out where my photos are being used without mentioning me at all. I suspect that the majority of uses don’t even attempt to assign credit or respect copyright. Most of the publications that are serious about copyright aren’t even willing to use copyleft licenses, preferring to get direct permission from the photographer (even if it means paying, often).

Fun photo project for Wikipedian photographers

Taking pictures to illustrate Wikipedia articles is the reason I got into photography. I started with my wife’s point-and-shoot, and pretty soon I started to appreciate the joys of photography for their own sake…and I started to experience that strong desire for better and still better equipment. A few weeks ago I finally realized my long-time goal of shooting an original Featured Picture (FP), this ‘Peach Glow’ water lily.

My equipment (Canon EOS 400D, 50 mm prime lens, 18-55mm kit lens, and low-end 70-300mm superzoom/macro) is not professional, but it’s not cheap either. With my setup and my intermediate skill level, the circumstances under which I could take an FP are pretty narrow.

But there are many opportunities for taking valuable photos for Wikipedia. A project that I just completed, which many American Wikipedians could do as well, was to take photos of every Registered Historic Place in my town. In West Hartford, there are 28 Registered Historic Places, only a few of which had images or articles. But there is a wonderful List of Registered Historic Places in Hartford County, Connecticut, that lists the addresses and geographical coordinates for every one in my town and the surrounding towns. It has slots for thumbnail images, so even the ones without articles have a home for photos, and there is even a Google Maps link at the bottom that maps out every place on the list.

I spent a couple days doing bike trips to all the West Hartford places on the list, and now I’ve shot them all. Now I’m starting a series of longer trips to shoot the places in neighboring towns. It’s definitely been worthwhile; I learned a lot about local geography, got some exercise, and took a bunch of photos.

Not all local NRHP lists have the useful table format that the Connecticut lists have (and the Western U.S. has relatively few registered places), but the NRHP WikiProject can help and there is a tool for automatically generating formatted lists by county. Only a handful of lists are fully illustrated so far, but I hope eventually to add the Hartford County, Connecticut list to that group. An even more ambitious goal would be to create articles for all the places on that list, but I’m afraid there may not be relevant sources for most of them.

Wikipedia’s epistemological methods

A colleague of mine recently asked me about Wikipedia’s policy on sources and evidence, Wikipedia:Verifiability (WP:V). In short, the threshold for including content in Wikipedia is that it be “verifiable, not true”. Truth alone, without appropriate evidence that fits with the Wikipedia community’s standards, is not enough to justify adding something to Wikipedia.

You can interpret this in a number of ways. For some, it’s an embodiment of post-modern notions of truth and subjectivity (people disagree about truth, so we don’t let people simply add what they know to be true, instead relying on authority). For others, it’s just a practical concession to the sociological nature of Wikipedia, in which some people are more objective and more capable than others (and those are the people that know how to leverage authority effectively). The Verifiability standards could also be taken as a fundamentally rhetorical, rather than epistemological, policy: communal standards of evidence ensure a basic level of apparent reliability, since readers can be pointed straight to relevant authorities. (Citizendium, in contrast, has looser evidentiary standards and relies in part on the personal authority of its Authors and Editors.)

From an academic standpoint, there are plenty of relevant sets of literature that bear on the problems that Wikipedia’s evidentiary standards and policies attempt to deal with. But from my own perspective as a historian of science, I think the parallel to scientific epistemology and evidentiary norms is an interesting one.

WP:V works in ways that are closely paralleled in scientific (and historical) method as it is actually practiced. Communities of scientists have various norms (mostly unwritten) for what does and does not constitute legitimate evidence for making novel scientific claims. These norms are highly context dependent, and can include (or exclude) experiments, reference to the work of others, reasoning and rhetoric, visual evidence, artificially simulated data, etc., depending on field and venue. Verifiability in the traditional scientific sense of experimental repeatability is actually very rarely a consideration in science (and in fact, many philosophers and historians of science have argued that repeating experiments is rarely possible and almost never desirable… the questions instead are, do the results accord with the results of related experiments?, can we build on these results?, etc.)

Science, as scientists have been increasingly willing to admit in recent decades, is about what is verifiable rather than true in a similar sense to WP:V, since experimental science is increasingly conducted in largely artificial physical contexts. What happens in the lab is hoped to be a faithful reflection of what happens in nature, but the whole point of the lab is to isolate certain parts of nature so that they can be studied without all the complicating factors…and sometimes those complicating factors mean that a given experimental result may actually only be “true” for the very peculiar and artificial set of circumstances tested. The analogous situation on Wikipedia is when a seemingly reliable source is wrong; all the Wikipedian can do, without other sources to compare it to, is either limit claims to “source X says Y” (instead of just claiming Y and citing X) or ignore the source altogether. On Wikipedia we also hope that what the sources say accords with reality, but (for sociological rather than technical reasons) editors can’t go out and probe reality in its full complexity and must stick within the (negotiable) norms (which, like in science, are tailored to try to maximize the chances of accord between evidence and reality).

WP:V, and Wikipedia’s approach to sourcing and evidence more broadly, is just a different set of evidentiary norms, suited to a different group of people with a different purpose.

Wiki FM

I’ve been using Wiki FM lately, a simple mashup of Last.fm and Wikipedia. (Thanks go to Florence Devouard, who pointed this site out on the English Wikipedia mailing list.)

In addition to discovering new music (and rediscovering music I haven’t listened to in a long time), I’ve been filling in gaps in Wikipedia’s music photography. There are many band and musician articles that either don’t have photos or are using non-free promo shots, while Flickr has freely-licensed photos that could be used. And for many others, there are fan-made photos that Flickrites would release under a free license if asked. It’s a nice way to work on some of Wikipedia’s image problems without it feeling like drudge work.