Diderot — a Pebble watchface for finding nearby unillustrated Wikipedia articles

I published a watchface for Pebble smartwatches that shows you the nearest Wikipedia article that lacks a photograph. Have a Pebble and like to (or want to) contribute to Wikipedia? Try it out! It’s called Diderot. (Collaborators welcome!)

After using it myself for about a month and a half, I’ve finally added photographs to all the Wikipedia articles near my house within the range of Wikipedia’s ‘nearby’ API.

Extra thanks go to Albin Larsson, who built the WMF Labs API that my app uses to find nearby unillustrated articles. The great thing about it is that it ignores .png and .svg images when deciding whether an article is illustrated, so you still find the articles that have only a map or logo rather than a real photograph.
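The filtering idea can be sketched roughly as follows. This is not the actual Labs API code, just a minimal illustration; the function name `lacks_photograph`, the extension set, and the example article titles and filenames are all made up for the example.

```python
from pathlib import PurePosixPath

# Assumption: these extensions usually indicate maps, logos, or diagrams
# rather than real photographs.
NON_PHOTO_EXTENSIONS = {".png", ".svg"}


def lacks_photograph(image_filenames):
    """Return True if none of the article's images looks like a photograph."""
    for name in image_filenames:
        extension = PurePosixPath(name).suffix.lower()
        if extension not in NON_PHOTO_EXTENSIONS:
            # Anything else (for example a .jpg) counts as a real photo.
            return False
    return True


# Hypothetical nearby articles and the image files each one uses.
articles = {
    "Example Bridge": ["Example_Bridge_map.svg"],
    "Example Park": ["Example_Park.jpg", "Example_Park_logo.png"],
    "Example Mill": [],
}

unillustrated = [title for title, images in articles.items() if lacks_photograph(images)]
print(unillustrated)  # ['Example Bridge', 'Example Mill']
```

An article with only a map still shows up as needing a photo, while one with a .jpg does not, even if it also has a logo.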

On copyright infringement and “theft”

Yesterday I went to an open discussion about SOPA with Jason Altmire, who represents my district. He came out against SOPA at the end of the event. But one thing that bugged me was that just about everyone used “theft” as a synonym for copyright infringement. And this “theft” by rogue websites in China and southeast Asia, everyone supposedly agrees, is a serious problem, even if SOPA isn’t the right answer.

Consider a typical case where somebody downloads a Hollywood movie to watch, without paying for it. Taking this movie wasn’t authorized by the copyright holders. But the copyright holders still own it. They still have all their copies, and they are still free to make more. They can distribute and license it as they wish. They can make sequels and spin-offs and t-shirts and bobble-heads.

What would you call that? I would call it copyright infringement, but I wouldn’t call it theft.

Now imagine a different scenario. A work you have is taken from you. And once it’s been taken, you can no longer make copies. In fact, you have to get rid of all the copies you have. When it was yours, you could make copies, send them to your friends, make derivatives, use it as a jumping off point for new works. You could do with it as you pleased. Now, you can’t do any of that without the permission of the person who took it from you.

Would you call that theft?

I would call it Golan v. Holder. Wikimedians are having to remove thousands of works from Wikipedia and Wikimedia Commons that used to be public domain in the U.S. (that used to belong to the public, to use and copy and build from) but were put back into copyright by Congress. And the Supreme Court just decided that, in fact, that’s just fine.

Flickr, Getty Images, and revoking CC licenses

Flickr started a program earlier this year with Getty Images, in which Getty staff find great photographers and ask them to put some of their work into the Flickr Collection on Getty Images, so that Getty can sell rights to the images and pay the photographers when their photos get licensed.  As the Flickr blog explains, they are now expanding this program: photographers can submit portfolios of their best work to be considered for inclusion by Getty.

When I first came across this Getty Images-Flickr program a few months ago I noticed something interesting in the terms of the program, and it might be a lot more significant now that this program is ramping up.  The FAQ specifically addresses the issue of CC-licensed photos:

There is a chance one of your Creative Commons-licensed photos may catch the eye of a perceptive Getty Images editor. You are welcome to upload these photos into the Flickr collection on Getty Images, but you are contractually obliged to reserve all rights to sale for your work sold via Getty Images. If you proceed with your submission, switching your license to All Rights Reserved (on Flickr) will happen automatically.

If you’re not cool with that, that’s totally cool. It just means that particular photo will need to stay out of the Flickr collection on Getty Images.

But what happens if, say, Wikimedia Commons already has those CC images? Are Getty and Flickr basically just looking the other way about the fact that in many cases it wouldn’t be possible for photographers to “reserve all rights to sale” on their freely-licensed works that are circulating in the wild, even if they wanted to? What about intentionally making sure your CC images have been added to Commons and verified by the Flickr review bot before submitting them to Getty?

How freely licensed photos generally get used (a sequel)

Last year, I blogged about how freely licensed photos are used and misused across the web.  Figuring out how my photos are being used (as long as I’m being credited by name) is much easier now with the Google search options (rolled out in May 2009 and with more options added just this month), which let you limit search results to newly indexed pages.

I have over 3500 CC BY-SA photos on Flickr (including lots of family photos, abstract shots, and other stuff unlikely to be reused) and probably around 1000 original photos on Wikimedia Commons, generally available under both GFDL and CC BY-SA (and a good portion of which are not duplicated on Flickr). At this point there is a fairly steady stream of reuse, most of which I’m not directly aware of (except when I go looking, like now). I estimate that my ~4000 photos are put to new uses at a rate of about 15-20 times per week. Let’s see what types of uses my photos have been put to recently.

Searches (limited to results first indexed within the last week) for “ragesoss” and “Sage Ross” ought to turn up nearly all of the new cases where I’m being credited for photos.

As before, the most active user of my photos is World News Network (wn.com), a set of algorithmically-generated sites that are titled like local or special interest newspapers but basically just link to offsite news stories, add free photos, and run ads against the photos and headlines.  For example, this story about pesticides in peaches links to the actual story from The Oklahoman but adds my picture of peaches.  The credit reads “(photo: GFDL / Sage Ross)”.  Although I think a link back to the source or my Commons userpage (which is where the attribution link at Commons points) is appropriate, it probably doesn’t violate the letter of the license (which is already stretched thin when applied to photos and other things very dissimilar from software manuals).  In another example, they use a CC license instead of the GFDL for my photo of coffee beans.  In this case, the credit reads “(photo: Creative Commons / Ragesoss)”, with no link to the specific license or the source.  This violates both the spirit and the letter of the CC BY-SA license.  World News Network has used my photos hundreds, maybe thousands of times, and I’m sure many other photos from Commons by other Wikimedians are being systematically (mis)used similarly.

Another common type of usage is from the many sites that are trying to monetize user-generated content and share the ad revenue between writer and website owner. In these cases, it’s the individual writers who are responsible for obtaining photos (and rights thereto), so compliance with free licenses varies widely. I found my photos on articles from suite101.com and hubpages.com. The suite101 article, “Free Instructions on How to Make an Apple Pie”, uses a series of photos I took while my sister was making pie. All the photos but one are credited to me and link back to the source on Commons, although no license info is indicated at suite101; this violates the letter, but not the spirit, of the CC licenses. Oddly, the lead apple pie image is misattributed and links to an entirely different pie photo from a quasi-free stock photography site; the writer probably used that image first but then replaced it when she found my photos. At HubPages, the article “Health Insurance Rescission and How To Fight It” uses my photo but merely credits it as “Photo by ragesoss” with no link or license information. AssociatedContent is another site like that where my photos show up frequently; they seem to be better than most at following the provisions of free licenses.

Blogs use my images somewhat less frequently.  Recent uses include this entry in the Utne Reader “Science and Technology” blog (which does a great job with the credit line, linking to both source image and the specific CC license) and this one from the Choices Campus Blog (which has the mediocre credit line “Photo Credit: ragesoss at Flickr.com” with no link).

A final significant category of uses is in articles from professional news and content sites. Overall, these sites are somewhat more likely to use freely licensed images properly, but sloppy or improper uses are still common in my experience. The only recent credit I found is from the CNBC story “GE, Comcast Continue Talks Over NBC Stake”. The unlinked credit line simply reads “Photo: Ragesoss”, but the photo is one of my few early photos on Commons that I released as public domain rather than under a copyleft license. So CNBC doesn’t have any legal obligation to give a more precise photo credit (or even to credit me at all), although if only for the sake of journalistic integrity they probably ought to do better.

Conclusion: People use freely licensed photos liberally from Flickr and Wikimedia Commons, but there isn’t much indication that most reusers understand what the licenses mean or what they require from reusers. The free culture movement has a long way to go; cultural change is a lot slower than license adoption.

On a tangent, it’d be nice if Wikimedia Commons were equipped with something like refbacks combined with image recognition to automatically discover and collect web pages that are reusing Commons media. I think I’ll make a proposal on the Wikimedia Strategy Wiki when I get a chance.

Wikipedia and Olympics Committee heading for collision?

CC-BY-SA photo of Usain Bolt, by Richard Giles

It looks like Wikipedia is actually at the center of the recent copyright kerfuffle involving Richard Giles, the photographer who got a legal threat from the International Olympic Committee (IOC) over licensing his images from the Beijing Olympics under Creative Commons licenses. Giles explains the situation on his blog:

It turns out that my Usain Bolt photo was being used by a book shop in the UK to advertise the launch of the Guinness Book of Records 2010. This was being done without my knowledge, and as they pointed out, in breach of the license granted on the Olympic ticket.

That photo was the only one of 293 in the set on Flickr that was licensed with a ShareAlike license (allowing commercial use) rather than a non-commercial license, and Giles had relicensed that particular photo at the request of another Flickrite so that it could be uploaded to Wikimedia Commons and used on Wikipedia.  And Wikipedia is probably where that UK merchant found it and, assuming the license to be legitimate, used it (so it would seem) under the terms of the free license.

Giles reports that it looks like the IOC really just objects to licensing that allows commercial use.   Depending on what the IOC says in response to his request for clarification, Giles may be changing the license on that Usain Bolt photo and asking the UK merchant to stop using it.

What happens now? By buying a ticket to the Olympics, Giles appears to have (implicitly at least) agreed to terms and conditions that say he won’t use photos from the games except for private purposes. But he does own the copyright to the Bolt photo, and therefore ought to (except for those terms and conditions) be able to license it however he likes. Will the fine print of an Olympics ticket be strong enough to force Wikimedia (which agreed to no terms and conditions) to stop using the photo and offering it to other downstream users?

Database right and the NPG threat

The National Portrait Gallery’s legal threat against Wikimedian Derrick Coetzee alleges four things:

  1. Copyright infringement
  2. Database right infringement
  3. Unlawful circumvention of technical measures
  4. Breach of contract

The copyright issue, of course, is the center of the dispute. UK law is unsettled on whether mechanical reproduction of a public domain work is eligible for copyright.

IANAL, but breach of contract and unlawful circumvention both seem moot if there is no copyright infringement. A bit of text at the bottom of the page (with no mechanism for the user to acknowledge or refuse) setting restrictive use terms for something that is public domain wouldn’t hold much weight. Likewise, even apart from the fact that Zoomify is not a security measure and arguably was not “circumvented”, if the images are public domain then simply collecting and stitching together tiles from those images (whether automatically or by hand) is perfectly legitimate.

Database right, therefore, is the only claim that does not turn on whether ‘sweat of the brow’ copyrights hold up. The law here seems vague, but again, IANAL. The key question is what constitutes a “substantial part” of the contents of the NPG’s database. If the paintings themselves are public domain, then the mere unorganized collection of them ought not infringe on the database right, but depending on how much metadata and categorization comes from the same database, porting images to Wikimedia Commons might cross the line. For the images at hand, it looks like the amount of metadata is modest: subject, author, date, and author’s date of death. The NPG database contains significantly more information: medium, size, provenance, and other contextual information, as well as links to related works and people. It is also possible that Coetzee’s actions fall under the “exceptions to database right”:

(1) Database right in a database which has been made available to the public in any manner is not infringed by fair dealing with a substantial part of its contents if –

Self-preservation and the National Portrait Gallery’s dispute with the Wikimedia community

Running an organization is difficult in and of itself, no matter what its goals. Every transaction it undertakes–every contract, every agreement, every meeting–requires it to expend some limited resource: time, attention, or money. Because of these transaction costs, some sources of value are too costly to take advantage of. As a result, no institution can put all its energies into pursuing its mission; it must expend considerable effort on maintaining discipline and structure, simply to keep itself viable. Self-preservation of the institution becomes job number one, while its stated goal is relegated to job number two or lower, no matter what the mission statement says. The problems inherent in managing these transaction costs are one of the basic constraints shaping institutions of all kinds.

From: Clay Shirky, Here Comes Everybody: The Power of Organizing Without Organizations, pp. 29-30 (my emphasis)

Shirky’s book is about “organizing without organizations”, a key example of which is the Wikimedia community (as distinct from the Wikimedia Foundation). The Wikimedia community can accomplish a lot of big projects–making knowledge and information and cultural heritage accessible and free–that traditional organizations would find far too expensive. And that paragraph from Shirky explains the root of the tension between the Wikimedia community and many traditional organizations with seemingly compatible goals–organizations such as the National Portrait Gallery in London, which sent a legal threat to Wikimedian Derrick Coetzee this week.

The NPG has a laudable mission and aims: “to promote through the medium of portraits the appreciation and understanding of the men and women who have made and are making British history and culture, and … to promote the appreciation and understanding of portraiture in all media”, and “to bring history to life through its extensive display, exhibition, research, learning, outreach, publishing and digital programmes.”

But in pursuing self-preservation first and foremost, the gallery asks a high price for its services of digitizing and making available the works it keeps: to fund the digitization of its collections and other institutional activities, the NPG would claim copyright on all the digital records it produces and prevent access to others who would make free digital copies. As one Wikipedian put it, the NPG is “trying to ‘Dred Scott’ works already escaped into PD ‘back south’ into Copyright Protected dominion”.

If the choice is between a) waiting to digitize these public domain works until costs are lower or more funding is available, or b) diminishing the public domain and emboldening others who would do the same, then I’ll choose to wait.

Can you copyright a bonsai?

Besides Wikipedia, my main hobbies are bonsai and photography. Sometimes I combine all three, taking pictures of bonsai and uploading them to Wikimedia Commons. So the question I have is, does styling a bonsai create a copyright? Can I take a photo of someone’s tree and do what I want with it (e.g., license it freely on Commons), or do I need the owner’s permission?

At first blush, the answer would seem to be yes, bonsai is eligible for copyright. It is a form of visual art, often compared to sculpture. A good bonsai is distinctive, demonstrating the creative vision of the artist who made it.

On the other hand, it is a living thing, and a core principle of bonsai is that it is never finished and always subject to change; according to the U.S. Copyright Office, “Copyright protection subsists from the time the work is created in fixed form.” What is meant by fixed form? A bonsai’s form is never truly fixed (in the same way that one’s face is never fixed but develops over time), but (like a face) a well-styled bonsai may be recognizable in the same general form over the course of decades, or even centuries. That’s more than can be said of many traditional works of art, which for some media may deteriorate beyond recognition in just 10 or 20 years. But bonsai typically evolve to a roughly “final” form over the course of many years. When, during this process, is a copyright created? If I photograph a bonsai one year and it’s very different the next, I essentially took a snapshot of something that was not at the time in a fixed form. But if I take a picture one year and the bonsai is basically the same the next, does that mean it was copyrighted? Does keeping a bonsai as it lives and grows generate a continual series of copyrights, such that the centuries-old trees that get handed down from generation to generation can never go out of copyright as long as they are alive?

For my part, I’ve assumed that bonsai are not eligible for copyright. Mainly, I do this because there is no tradition within the bonsai community of claiming copyright for bonsai, only for particular (fixed) pictures of them. If anyone has a more definitive answer, or informed thoughts on the matter, please let me know.

How are your Wikimedia Commons photos being used elsewhere?

I don’t know about yours, but I do have some idea of how mine are being used.

Google searches for my name and my username reveal a lot more instances than I was aware of, especially for news article illustrations.

In the “license, schmicense” category, I found this article from The Jerusalem Post, which takes a recent photo of mine (either from Flickr or Wikipedia, but more likely Wikipedia) and simply says “Photo: Courtesy:Ragesoss”.

Marginal cases include the hundreds of Google hits for “ragesoss” that come from World News Network websites. This organization runs thousands of online pseudo-newspapers, such as the West Virginia Star and Media Vietnam, that aggregate content from real news organizations. Stories at all of their portals link to World News pages that have teasers for the actual articles at the original sources. And I’ve found a bunch of my photographs as illustrations on these pages. See these:

Of course, my photographs are not the ones used by the original articles. World News seems to have used almost every photo I uploaded from the February 4 Barack Obama rally in Hartford, to illustrate campaign news unrelated to the Hartford rally. In terms of photo credits (see the links), most of them say “photo: Creative Commons / Ragesoss” or “photo: GNU / Ragesoss”. Nearly all of my photos on Wikimedia Commons are copyleft under GFDL and/or CC-by-sa, so non-specific credits like that do not constitute legitimate use under the terms of either license. The GFDL requires a link to the license (GFDL, not “GNU”), and CC-by-sa at least requires notice that the image is free to reuse as long as derivatives are issued under the same license (simply “Creative Commons” is not a license). It is also implicit with CC licenses that credits for my photos should include a link to my Commons userpage, since the author field on the image pages is typically a link titled “Ragesoss”, not just the text. (The third link above, among others I found, does link to the GFDL, although the photo has nothing to do with the article.)
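The checks described above can be written down as a small sketch. It encodes only the reading of the licenses given in this post, not a legal test; the function name `credit_problems`, the regular expression, and the two boolean parameters are hypothetical and chosen just for illustration.

```python
import re

# Assumption: a credit names a specific license only if it mentions the GFDL
# or CC BY-SA explicitly; bare "GNU" or "Creative Commons" does not count.
SPECIFIC_LICENSE = re.compile(r"\b(GFDL|GNU FDL|CC[ -]BY[ -]SA)\b", re.IGNORECASE)


def credit_problems(credit, has_license_link, has_source_link):
    """List the ways a photo credit falls short, per the reading in this post."""
    problems = []
    if not SPECIFIC_LICENSE.search(credit):
        problems.append("no specific license named")
    if not has_license_link:
        problems.append("no link to license text")
    if not has_source_link:
        problems.append("no link to source or author page")
    return problems


# Credit lines quoted in this post, with the link flags set by hand.
print(credit_problems("photo: Creative Commons / Ragesoss", False, False))
print(credit_problems("photo: GNU / Ragesoss", True, False))
print(credit_problems("credit: ragesoss/wikipedia copyright: ragesoss/GNU FDL 1.2", False, False))
```

The point of the sketch is only that “Creative Commons” and “GNU” on their own fail the first check, while a credit that names the GFDL can still be missing the required links.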

Another major user of my photos is Associated Content, a commercial user-generated content site that pays contributors. AC is a mixed bag in terms of legitimate uses of photos, since individual contributors are responsible for selecting and crediting the illustrations for their articles. This one, which uses a photo of Ralph Nader, credits my shot as “credit: ragesoss/wikipedia copyright: ragesoss/GNU FDL 1.2”. It almost meets the basic requirements of the license (all it needs is a link to the text of the license), although a link to the source would be preferable to simply mentioning Wikipedia. This one, on the other hand, just says “credit: Ragesoss copyright: Wikimedia Commons”.

Popular Science, in this article, lists the GFDL, but links it to the Wikipedia article on the license rather than the actual text.

The Bottle Bill Resource Guide links to my Commons userpage, but does not list the license or link to the image source.

Another partly-legit use is by LibraryThing, a book-related site that uses several of my photos for authors (e.g., Dava Sobel). They include links back to the original image pages, but the site behaves erratically and sometimes insists that I sign in or create an account to view the image details.

Unexpectedly, I also found several of my photos illustrating Encyclopedia Britannica. See:

In each case, they provide a link to one of the licenses (GFDL 1.2 and CC-by-sa 3.0 unported, in these cases), although they don’t provide a userpage link. At least they seem to take the licenses seriously.

Of course, it’s much tougher to find out where my photos are being used without mentioning me at all. I suspect that the majority of uses don’t even attempt to assign credit or respect copyright. Most of the publications that are serious about copyright aren’t even willing to use copyleft licenses, preferring to get direct permission from the photographer (even if it means paying, often).