Category Archives: technology

What will Wikipedia be like 5 years from now?

With the continued growth of Wikipedia and its sister projects, it’s worth asking what the Wikimedia ecosystem will look like down the road. Here’s my vision of what it will and/or should be like.

Necessary functional improvements:

  1. Search. Wikipedia’s current internal search program is horrible. It is bizarrely sensitive to case, but lacks all the features we’ve come to expect from search. Quotation marks mean nothing. Results are often woefully incomplete (I often have to use a site-specific Google search to find what I’m looking for on Wikipedia). The interface is clunky, especially with all the check boxes at the bottom for different namespaces (and the fact that checking/unchecking only registers if you use the right search box, of the three available). But when search finally gets done right on Wikipedia, it will be a great thing; we’ll need a new verb to complement “to google” (“look it up on Wikipedia” just doesn’t have the same ring). Wikipedia search will be cross-project, with redirects and related entries (Wiktionary and Wikisaurus, Wikimedia Commons, articles in other languages) nested together. It should have some of the elements of Google’s search algorithm; the readable text of piped links should affect results, and results should be ordered by a sort of internal PageRank with the option of reordering them by size, date of last edit, etc.
  2. Stable versions and Approved versions. It’s been in the works for a while now, but there is still no system for managing stable articles where acceptable edits are few and far between, nor is there a good way to flag vetted versions (e.g., a version approved as a Featured Article). Semi-protection is a mediocre substitute for version control, while proposals to implement similar features manually have been too complicated for the community to accept. For stable, largely complete articles, new edits should not show up until they have been screened by one or a few other editors. And for ultra-stable articles, there should be an integrated system for revision and draft work while the consensus version remains viewable to readers.
  3. Audio/Visual accessibility. Because the major formats are all patented and could potentially have significant use limitations placed on them, Wikipedia uses Ogg files with free and open encoding to store and serve audio and video content. For the most part, users must go through a bit of trouble (i.e., downloading and installing codecs from off-site), although audio content now has rudimentary in-browser support. Obviously, the ideal would be integrated audio-video content without leaving the article; YouTube and Google Video have done this fairly well, though with proprietary technology (Adobe Flash with patented codecs). Video (both historical and user-created) will undoubtedly become a much bigger part of Wikipedia and Commons in the future.
  4. Unified login. Obviously, it would be convenient to have a single account for all the Wikimedia projects. It’s been in the works for a while now, but it’s more of a convenience for editors (and a correction of a design flaw) than a major improvement.
  5. Metadata handling. The current system of templates, categories, and other article metadata (beyond basic linking and formatting markup) is unintuitive, inconsistent, awkward, and intimidating to new editors, and the categories are difficult to navigate and far less useful than they could be. Something like a metadata namespace, for infoboxes, categories, Featured Article stars, interwiki links and the like, would be very beneficial.
  6. Categories. Related to the metadata issue, the category system needs to be completely overhauled. In the current system, large categories must be divided and subdivided into more specific ones to remain useful, yet editors (new and established) often apply overly general categories to new articles, so broad categories like “American people” or “Songs” must be constantly monitored so they do not grow out of control. For example, for a given song, the subdivision branches into a wilderness of partially-overlapping subcategories like “songs by year”, “songs by artist”, “songs by lyricist”, “songs by nationality”, and “songs by genre”, along with a host of other possible orthogonal categories like “songs with sexual themes” and “cat songs”. Ideally, categorization would be both simpler and more flexible. Assigning broad categories (“songs”, “folk music”, “1963”, “protest”, “Bob Dylan”), with some semantic information (“is”, “from”, “related to”, “performed by”) should automatically create appropriate subcategories (Blowin’ in the Wind is a song and is folk music, from 1963, related to protest, performed by Bob Dylan).
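To make the category idea in item 6 a little more concrete, here is a minimal sketch in Python. It assumes each article carries a flat list of (relation, broad category) pairs, so that any "subcategory" is simply the intersection of whichever pairs a reader asks for. The names TAGS, build_index, and subcategory are hypothetical and the data is illustrative; nothing here reflects how MediaWiki actually stores categories.

```python
from collections import defaultdict

# Hypothetical tag data: each article lists (relation, broad category) pairs.
TAGS = {
    "Blowin' in the Wind": [
        ("is", "song"),
        ("is", "folk music"),
        ("from", "1963"),
        ("related to", "protest"),
        ("performed by", "Bob Dylan"),
    ],
    "The Times They Are a-Changin'": [
        ("is", "song"),
        ("is", "folk music"),
        ("from", "1964"),
        ("related to", "protest"),
        ("performed by", "Bob Dylan"),
    ],
}


def build_index(tags):
    """Map each (relation, category) pair to the set of articles carrying it."""
    index = defaultdict(set)
    for article, pairs in tags.items():
        for pair in pairs:
            index[pair].add(article)
    return index


def subcategory(index, *pairs):
    """Return the articles that carry every requested (relation, category) pair."""
    if not pairs:
        return set()
    # .get avoids creating empty entries in the index for unknown pairs.
    return set.intersection(*(index.get(pair, set()) for pair in pairs))


index = build_index(TAGS)
songs_from_1963 = subcategory(index, ("is", "song"), ("from", "1963"))
dylan_protest = subcategory(index, ("performed by", "Bob Dylan"), ("related to", "protest"))
```

With this shape, nobody has to create or police "songs by year" or "songs by artist" by hand; those groupings fall out of the tags on demand, and the broad tags can grow as large as they like.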

Hoped-for functional improvements:

  1. Verifiability assessment. Eventually, Wikipedia will need a way to sort articles according to verifiability and sourcing (as a proxy for reliability, the direct measurement of which will always run into the problem of self-reference and the authority of editors). Readers should be able to tell immediately (before even beginning to read) whether an article is based on peer-reviewed articles and scholarly books, mainstream media sources, local or niche-oriented professional journalism, blogs and internet sites, primary sources, etc. Potentially, this could solve some of the perennial contentious issues about notability and the borderline of original research. The volume of material on minor topics (especially related to popular culture, current events, and minor/local institutions) is growing much faster than it can be strictly vetted (and deleted when appropriate) according to the current notability and verifiability guidelines, and there is a lot of material that is de facto acceptable, even if it doesn’t strictly comply with the current rules. And a lot of this is good, accurate material that readers and editors find useful. If material with few or potentially unreliable sources is clearly flagged as such, there will be less incentive to wage futile wars of deletionism on what is undeniably valuable. In other words, a compromise between elitist and populist visions of what Wikipedia should be.
  2. Discussion forums. I envision a discussion board for each article, separate from the talk page, where users (editors and readers alike) can discuss the subject of the article without the concern of trying to improve the article. This departs somewhat from the core mission of Wikipedia, but I think it would be beneficial in several ways. First, it would direct most of the irrelevant commentary away from talk pages, making collaboration among editors run more smoothly. Second, it could host ads for the support of the Wikimedia Foundation, without compromising the non-commercial nature of Wikipedia itself. And third, it would enhance the usefulness of Wikipedia at the borders of verifiability; readers who want more than the article has to offer can turn to the other forum participants for the speculation, rumor, and strained interpretation they seek.
  3. Stat tracking. Mainly for performance reasons, Wikipedia does very little in the way of internal stat tracking. But in the long run, it would be useful, both for identifying popular articles and for studying Wikipedia itself. In addition to hit counters for every article, the site should track (without retaining any potential identifying information) visit paths as readers surf from one article to another. And for those with editcountitis, some automatic sophisticated contribution analysis (like what can be done through JavaScript hacks by knowledgeable editors now) would be nice: things like total content added, deleted, histograms of edit size and frequency, etc.
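As a rough illustration of the stat tracking in item 3, the sketch below counts article-to-article hops from anonymized visit paths and buckets edits by size. It assumes the paths arrive as plain lists of article titles with no user information attached, and that each edit is represented by its size change in bytes. The function names and inputs are hypothetical, not an existing Wikipedia tool.

```python
from collections import Counter


def transition_counts(paths):
    """Count article-to-article hops across anonymous visit paths."""
    hops = Counter()
    for path in paths:
        # Pair each article with the one visited immediately after it.
        hops.update(zip(path, path[1:]))
    return hops


def edit_size_histogram(size_deltas, bucket=100):
    """Count edits per bucket of absolute size change, keyed by bucket floor."""
    hist = Counter()
    for delta in size_deltas:
        hist[(abs(delta) // bucket) * bucket] += 1
    return dict(sorted(hist.items()))


hops = transition_counts([
    ["Bob Dylan", "Folk music", "Protest song"],
    ["Bob Dylan", "Folk music"],
])
sizes = edit_size_histogram([12, -40, 150, 230, -5])
```

Because only the hop pairs are tallied, nothing that identifies a reader needs to be kept once a path has been counted.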

So what will the future Wikipedia be like in a broader sense? Its cultural authority and perceived reliability will continue to increase, but surely both will begin to level off within the next few years. Traditional non-specialist encyclopedias will simply be irrelevant, and probably bankrupt. Given the degree of brand success Wikipedia has already achieved, the chances for a successful fork are quickly approaching nil. Citizendium seemed like it had an outside chance at becoming a viable competitor, but it has been managed poorly thus far and I think the window of opportunity is closing rapidly. Citizendium membership is turning out to be an odd mix of people who don’t edit Wikipedia because it doesn’t respect (their) authority enough and people who don’t edit it because it respects authority (of published sources) too much; thus, many of the same issues that drive experts away from Wikipedia will show up in Citizendium if it grows large enough to matter. If it retains the GFDL license, Citizendium may have a place as a minor satellite of Wikipedia from which content is occasionally imported.

Wikipedia will also seriously eat away at the specialist encyclopedia market. I expect the viability of specialist encyclopedias will vary by field, according to which experts embrace and contribute to Wikipedia. In general, scientists (especially in the “harder” fields) and mathematicians have shown a great deal more enthusiasm than humanists, with social scientists somewhere between. (I find this ironic, because humanities fields have so much more to gain from an integrated and cross-linked ecology of knowledge; despite constant flux and discipline genesis at the borders and the current rhetorical vogue of “interdisciplinary” research, science topics are relatively self-contained compared to humanities topics.) It’s an open question whether the academic culture of the humanities will get on board in a significant way. Unfortunately, I think the Ivory Tower mentality and its paradoxical counterpart of academic careerism (especially in the current tight job market) are too entrenched; I expect participation just to continue with incremental gains through the recruitment of individual humanist Wikipedians.

As more and more people look to Wikipedia as their first (and often only) source for arbitrary information, Wikipedia will begin to seriously encroach on the market share of the search companies. It’s entirely possible that one or more of the major portals (most likely and Yahoo!) will replace Wikipedia search results with mirrored content with added advertising. And if implemented well, some users might even prefer this; after all, ad results are sometimes just what you were looking for. (Similarly, Wikipedia itself might implement optional ads, which would only appear if explicitly enabled by users.) The ecosystem of value-added and exploitative businesses making a living off of Wikipedia will expand dramatically, which is bound to create plenty of unforeseen issues and controversies. But I don’t expect any major crises in that respect, since Wikipedia has always been built with the (legal and practical) potential for commercial exploitation.

The bigger problem will be professional PR and information management. In the next year or two, Wikipedia will have to create a system to deal with the complaints and requests of powerful economic and political entities. The recent Microsoft brouhaha over paid editing is the tip of the iceberg. It will be a challenge to create a system that is acceptable to the community but also acceptable enough to outsiders that they will use it instead of guerrilla editing. However, 5 years from now I think there will be some kind of stable equilibrium through a combination of an official system for dealing with accusations of bias from article subjects and vigilant groups of Wikipedians on the lookout for whitewashing.

In addition to encyclopedias, search, and PR, a number of other industries are going to feel pressure from the free content behemoth of Wikimedia projects. Wikimedia Commons will cut drastically into the market for stock photography, although Getty Images and Corbis will still have control of plenty of images that can’t be reproduced, and free media from limited-access venues (like celebrity functions) will still be hard to come by. (Wikipedia has tried, unsuccessfully thus far, to get Wikipedian photographers into red carpet events and award shows.) The glut of easily available images is already prompting stock photography companies to go the MPAA/RIAA route of suing liberally over copyright.

Politically, Wikipedia will do a lot to foster the free culture movement and especially to improve the atmosphere for copyright reform. It’s probably too optimistic to expect a reduction of copyright terms within the next five years, but at least any further extension (beyond the atrocious Copyright Term Extension Act of 1998) should be unlikely. Unfortunately, there’s no good way to show people how lame 95-year copyright terms are until the great content from the 1920s, 30s, 40s, and 50s starts to come into the public domain. (That stuff is our cultural heritage and ought to be in the public domain already; I think something like 50 years or the life of the author plus 20 is more than enough protection to serve the intended purpose of copyright.)

That’s all…the crystal’s gone dark.

Cultural change in the modern world

My manifesto post got picked up by OU’s patahistorian David Davisson for the latest History Carnival. From there, I happened upon a Crooked Timber post by John Quiggin on “the traditionality of modernity,” a clever way of saying that, contrary to common historical intuition, cultural change is slowing down… and fast.

In a nutshell, technology-induced mass/global culture tends to make major cultural changes less, not more common. Elements of this include:

  • The standardization of written language following the printing press, a trend that is rapidly becoming panlingual (“it’s expected that during the 21st century the number of language in the world will go from 6,000 to 300”).
  • The permanent fixation of/on the foundational pop culture icons like Marilyn Monroe or The Beatles (a dubious contention, but maybe “Marilyn will, inevitably, fade, but never be replaced on her pedestal”).
  • Globalization, reification and simplification of many previously local traditions: styles of food, artforms, forms of national government (or the beginning of the end thereof, with the EU and global economic institutions).

I’m still not sure how much of this I buy as a general statement, but some of it at least is true, and some of it is lamentable. Whatever truth there is to this technology-leads-to-cultural-hegemony thesis, it’s obviously somewhat more complex, and I think somewhat more positive, than the general tone of discussion at Crooked Timber. I won’t particularly mourn the death of 5,700 languages, despite whatever profoundly different ways of thinking such languages might or might not enable. There are more than enough socially constructed boundaries of thought to hamper communication and exchange (e.g., academic disciplines, nationality), and subcultures proliferate mightily in the modern world, providing ample breeding ground for new ideas and traditions while retaining the ability to swiftly reconnect to mainstream culture (or other subcultures) when necessary.

My course with Jean-Christophe Agnew (The American Century, 1941-1961) has been great, and it provides a jumping-off point for assessing this cultural hegemony idea. The premise of the course, which I’m increasingly convinced of, is that those two decades (give or take a few years) formed the basis of American culture since that time; nearly all the significant shifts of the later 20th century had their origins then and cultural events from the period are still frequently relevant today. This period and the turn-of-the-century rise of the ever-nebulous “modernity” (which I studied with Ole Molvig last semester, incidentally) were singled out in the Crooked Timber discussion as periods when it seemed cultural change was especially rapid compared to today, and I would generally agree.

But I also think we’re seeing the beginning of the reversal or supersession of the homogenizing trends in American culture that have been in play since the 60s. Widespread television broadcasting and the other byproducts of defense research from WWI and WWII are finally being overtaken in cultural significance by the Cold War research legacy of computers. Along with this comes “the long tail,” the massive diversification of cultural products that is just beginning. The hit for music and the blockbuster for movies (the things that make radio and theaters so lame today) are both dying economic modes; they’re being replaced by niche-centric media such as digital music stores, Netflix (which apparently has a superb recommendation system that facilitates discovery of movies both new and old that escape mainstream attention), and other “new economy”-style retailers that make niche content profitable again.

Mainstream media is not likely to die completely, and its current troubles only make it even more homogeneous and derivative… witness the current trend of mergers in news agencies, the fact that half the shows on network TV are Law and Order spinoffs (some day I’ll write a post about the pernicious political effect those shows must have), and the fact that the only truly good blockbuster from last year was not from Hollywood, and even it followed the current formula of sticking to established franchises and/or well-worn classic plots. (Neo-noir comic book male-fantasy shoot-em-up with computer graphics… seemingly the least original movie possible.) But again, I see some silver lining to retaining and even enhancing a cultural baseline as a backdrop for the vibrant long tail of culture. The key is to improve that cultural baseline (the point of my recent manifesto), but I think there is more hope for that project now than at any time since the rise of Cold War culture. The fact that these issues regarding the interplay of technology and culture are becoming visible means we needn’t feel trapped by any technological determinism; now is the time to determine the shape of mass culture for the next century.

This is, of course, a very modernocentric (is there a better word for this?) view. What about all the full-blown culture(s) being obliterated by the shift to modernity? I don’t know how to answer that… I’ve never been too enthralled by anthropology and the idea of culture for culture’s sake. The modern/post-modern long tail world will make it easier to preserve parts of traditional culture, but transitions to modernity will still entail a lot of suffering; the results of the current world picture look somewhat more promising than the fruits of 50s and 60s modernization theory, even though not much has fundamentally changed (besides the end of the Cold War).

Alas, that’s probably enough of an incoherent rant for one night.