on public domain and “public domain”

There has been a lot of great writing about copyright and access to our cultural and intellectual history in the weeks since Aaron Swartz’s death. I have been retreading some of my old favorite haunts to see if there was stuff I didn’t know about the status of access to online information, especially information from the public domain (pre-1923 in the US) era.

I talk like a broken record about how I think the best thing that libraries can do, academic libraries in particular, is to make sure that their public domain content is as freely accessible as possible. This is an affirmative decision that Cornell University made in 2009 and I think it was the right decision at the right time and that more libraries should do this. Some backstory on this.

So, if I want to share an image from a book that Cornell has made available, I check the guidelines link above and then I can link to the image. You can go see it, and then you can link to the image and do whatever you want with it, including sell it. This is public domain. The time and money that went into making a digital copy of this image have been borne by the Internet Archive and Cornell University. The rights page on the item itself (which I can download in a variety of formats) is clear and easy to understand.

Compare and contrast JSTOR. Now let me be clear: I am aware that JSTOR is a (non-profit) business and Cornell is a university, and I am not saying that JSTOR should just make all of their public domain things free for everyone (though that would be nice). I am just outlining the differences as I see them in accessing content there. I had heard that there were a lot of journals on JSTOR that were freely available even to unaffiliated people like myself. I decided to go looking for them. I found two different programs: the Register and Read program (where registered users can access a certain number of JSTOR documents for free) and the Early Journal Content (EJC) program. There’s no front door, that I saw, to the EJC program; you have to search JSTOR first and then limit your search to “only content I can access.” Not super-intuitive, but okay. And I’m not trying to be a pill, but doing a search on the about.jstor.org site for “public domain” gets you zero results, though the same is true when searching for “early journal content” and also for “librarian.” Actually, I get the same results when I search their site for “JSTOR.” Something is broken; I have written them an email. [update: they fixed it!]

So I go to JSTOR and do a similar search, looking for only “content I can access,” and pick up the first thing that’s pre-1923, which is an article about Aboriginal fire making from American Anthropologist in 1890. I click through and agree to the Terms of Service, which is almost 9000 words long. Only the last 260 words really apply to EJC. Basically I’ve agreed to use it non-commercially (librarian.net accepts no advertising, I am in the clear) and not scrape their content with bots or other devices. I’ve also seemingly acquiesced to credit them and to use the stable URL, though that doesn’t let me deep-link to the page with the image on it, so I’ve crossed my fingers and deep-linked anyhow. I’m still not sure what I would do, contact JSTOR I guess, if I wanted to use this document in a for-profit project. Being curious, I poked around to see if I could find this public domain document elsewhere and sure enough, I could.

At that point, I quit looking. I found a copy that was free to use. This, however, meant that I had to be good at searching, quite persistent and not willing to take “Maybe” as an answer to “Can I use this content?” I know that when I was writing my book my publishers would not have taken “maybe” for an answer; they were not even that thrilled to take Wikimedia Commons’ public domain assertions.

As librarians, I feel we have to be prepared to find content that is freely usable for our patrons, not just content that is mostly freely usable or content where people are unlikely to come after you. As much as I’m personally okay being a test case for some sort of “Yeah I didn’t read all 9000 words on the JSTOR terms and conditions, please feel free to take me to jail” case, realistically that will not happen. Realistically the real threat of jail is scary and terrible and expensive. Realistically people bend and decide it’s not so bad because they think it’s the best they can do. I think we can probably do better than that.

A good old fashioned linkdump


Public domain photograph by: US Navy, National Science Foundation. Link.

I’m back at home after meeting with a lot of terrific librarians in four different states. March is the busy month and after last month my plan is “not getting on a plane more than once a month for work.” I’ll be speaking with my good friend Michael Stephens at the Indiana Library Federation District Six conference next week. I’ll do a wrap-up of the talks I’ve been giving sometime later, but the news for me is mostly having more free time to actually attend things and not just speak at them. Getting to go to programs at the Tennessee Library Association conference and the National Library of Medicine’s New England Region one-day conference about social justice has really helped me connect with what other people are doing in some of the same areas I’m interested in. It’s sort of important to not just be a lone voice in the wilderness about some of this stuff, so in addition to the SXSW stuff (and talking to a great bunch of library school students in Columbia, Missouri) getting to attend library events as an audience member has been a highlight of the past few weeks.

However, I’ve been backed up on “stuff I read that I think other people might like to read.” Try as I may, Twitter is still for hot potato stuff [e.g. Google’s April Fools joke, aimed specifically, I felt, at librarians] and not for things that I think merit more thoughtful or wordy presentation. So, as I enter the first Thursday in over a month where I get to hang out at home all day, I’m catching up, not on reading because there is tons of time for reading while traveling, but on passing some links around. So, here are some things you might like to read, from the past few months, newest first.

Cornell removes restrictions on public domain repros

An ongoing debate in the copyright wars is whether an institution that is making reproductions of public domain materials available should be allowed to dictate terms (usually involving payment) for use of those items. We all know that libraries need money. It’s also true that having digital copies of rare materials available helps preserve the original items. So, if I want to download a public domain book from Google Books — say John Cotton Dana’s book A Library Primer — I get usage guidelines from Google attached to the pdf I’ve downloaded.

Usage guidelines
Google is proud to partner with libraries to digitize public domain materials and make them widely accessible. Public domain books belong to the public and we are merely their custodians. Nevertheless, this work is expensive, so in order to keep providing this resource, we have taken steps to prevent abuse by commercial parties, including placing technical restrictions on automated querying.

We also ask that you:
+ Make non-commercial use of the files. We designed Google Book Search for use by individuals, and we request that you use these files for personal, non-commercial purposes.
+ Refrain from automated querying. Do not send automated queries of any sort to Google’s system: If you are conducting research on machine translation, optical character recognition or other areas where access to a large amount of text is helpful, please contact us. We encourage the use of public domain materials for these purposes and may be able to help.
+ Maintain attribution. The Google “watermark” you see on each file is essential for informing people about this project and helping them find additional materials through Google Book Search. Please do not remove it.
+ Keep it legal. Whatever your use, remember that you are responsible for ensuring that what you are doing is legal. Do not assume that just because we believe a book is in the public domain for users in the United States, that the work is also in the public domain for users in other countries. Whether a book is still in copyright varies from country to country, and we can’t offer guidance on whether any specific use of any specific book is allowed. Please do not assume that a book’s appearance in Google Book Search means it can be used in any manner anywhere in the world. Copyright infringement liability can be quite severe.

These are all “suggestions” as near as I can tell. As with the Chicken Coupon fiasco of a few days ago, the implied threat that comes along with this item puts a bit of a damper on the joy that is the public domain. Bleh. We’ve seen other big corporations and libraries doing this as well.

However, this post is mostly to say “Yay” about Cornell’s decision to remove all restrictions on the use of its public domain reproductions. Here’s their press release about it and here is the web page with the new policy. What’s their reasoning? Well, among other things, it’s hard to support a mission of open access and at the same time go out of your way to make materials more difficult to get ahold of and interact with. You can see some of Cornell’s 70,000 public domain items at the Internet Archive.

Working towards more public books, fewer orphan works

Public domain determination becomes clearer cut, more books entering the public domain thanks to … Google? Jacob Kramer-Duffield explains how Google and Project Gutenberg and the Distributed Proofreaders put their book-scanning and OCR-ing smarts into trying to solve the thorny orphan works problem to determine which out of print books have had their copyrights renewed and which haven’t. Neat. [via joho]

a small foray into Google Books

You can use the date operator to browse public domain books in Google Books. I’m not entirely sure why the covers of some of these books remain under copyright. Any ideas? I’ve also noticed a few scanning errors and some pretty neat finds like this one which gives the name of every librarian in the US and Canada working in a library holding over 1,000 volumes. Google Books clearly uses keyword indexing to make these books searchable. How great would it be to have this one in a database? You can see a few images that I particularly liked over at Flickr.
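If you want to play with the date operator yourself, here is a rough sketch in Python that builds a search link for a range of years. Big caveat: I am assuming the operator takes the form date:START-END and that the search lives at books.google.com/books with a q parameter. The function name is made up for illustration, and Google may well have changed or dropped the operator, so treat this as a starting point and not a recipe.

```python
from urllib.parse import urlencode


def books_date_search_url(terms, start_year, end_year):
    """Build a Google Books search link limited to a range of years.

    Assumptions (not confirmed by Google): the operator is written
    date:START-END, and the search page accepts the query in a "q"
    parameter at books.google.com/books.
    """
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    # Put the search terms first, then the assumed date operator.
    query = f"{terms} date:{start_year}-{end_year}".strip()
    # urlencode handles the spaces and the colon for us.
    return "https://books.google.com/books?" + urlencode({"q": query})


if __name__ == "__main__":
    # Pre-1923 is the US public domain cutoff mentioned in these posts.
    print(books_date_search_url("library primer", 1800, 1922))
```

Paste the printed link into a browser and see what comes back. If the results ignore the year range, the operator has probably gone away.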