All is Fair in Google Books

This article was co-written by Mo Lotman.

On the Internet, the phrase “fair use” is often thrown around as an excuse to do anything you want. Want to put a well-known song in your YouTube video? Just say it’s “fair use,” and maybe YouTube won’t take it down. Pulling photos from news sites for your own page? Totally “fair use” if you’re not profiting from it, right? Downloading music illegally via BitTorrent? Yeah, that’s “fair use” if you’re just listening to it at home. Probably.

Unsurprisingly, very little of what’s posted online under the claim of “fair use” of a copyrighted work is actually considered fair use under the law.

The concept of fair use is both extremely old and relatively modern. For hundreds of years, the British have observed the concept of “fair abridgment,” which allows for the use of copyrighted material in certain written accounts and parody.

Fair use in the United States began to take shape with the 1841 case Folsom v. Marsh, but it wasn’t actually codified until the Copyright Act of 1976. Under 17 U.S.C. § 107, courts must balance four factors when determining whether an author has made fair use of a copyrighted work:

• How the work was used, including whether there was a commercial or nonprofit educational purpose. This also means considering how transformative the use is; in other words, how much the new work adds to or alters the original.

• The nature of the original copyrighted work.

• How much of the copyrighted work was used in the new work.

• Whether the supposed fair use could hurt the market for, or value of, the copyrighted work.

In practice it works like this: let’s say you’re writing a novel or an essay and you want to use three sentences from the middle of Harry Potter and the Sorcerer’s Stone to make a point about adolescence and heroism.

That’s almost certainly a fair use that no one is going to challenge. But if you want to lift the first 100 pages of the book and then dive into some Harry Potter fan-fiction you’re looking to profit from, some very angry lawyers from Scholastic will almost certainly zap your literary dreams like He-Who-Must-Not-Be-Named.

When Copying Everything Is Somehow Still Fair Use
So what if you copy 25 million books onto your own servers and make much, if not all, of the information in those books freely available … to everyone in the world?

That really doesn’t sound like “fair use,” yet that’s exactly what Google has been doing through its Google Books program for more than a decade now, and somehow courts have, time and again, ruled that the software giant has broken no copyright laws.

Google has said flat-out that eventually it would like to scan every book ever published into Google Books, regardless of copyright status. The way the software works right now, and the way that Google has thus far avoided a massive court judgment, is by bending over backwards to squeeze through the fair use exception.

If you search for a word or phrase on Google Books, you’ll see results from hundreds, if not thousands, of books. If the book is in the public domain, you’ll see the phrase you searched for, and you’re free to look through the rest of the book too. No one really objects to this. After all, the copyright has expired.

If the book is still under copyright, and the copyright holder has given permission, you’ll see a preview of the book. That means you’ll see your search terms and a certain percentage of the book that the publisher has agreed to make public. This is somewhat controversial, but at least the copyright holder has some say in how the work is being used.

But if a copyright holder refuses to grant Google permission to include a book, Google still scans the book into its online library, and if someone searches for a term in the book, Google Books will display what it calls “snippets” of it. These snippets are made up of the term itself and two or three lines that surround it.

Google Books will display only three snippets at a time if a term shows up repeatedly in a book, so even if you keep searching different terms, it’s going to take a very, very long time to read an entire copyrighted book this way. Still, making any part of these protected works public without the permission of the authors has understandably upset many writers (a group already well known for their surliness).

Authors Guild, Inc. v. Google
The Google Books project began in late 2004, and within a year the Authors Guild and a group of major publishers, backed by the Association of American Publishers, had filed lawsuits alleging massive copyright infringement.

Legal battles rarely end quickly, but the Google Books case dragged on for an epically long time even for complex litigation. In 2008, the parties proposed a settlement that would have created a $125 million Book Rights Registry. The registry would have paid authors of out-of-print yet copyrighted books a nominal fee and a portion of advertising revenue generated by searches in which their works showed up. To get this money, however, authors would have had to register with Google and agree to the company’s terms for making a preview of the book available.

A New York federal judge rejected this settlement and an amended agreement on the grounds that it was too one-sided in favor of Google and put too much of a burden on copyright holders to protect their works. After three more years of litigation, Google and the Association of American Publishers reached a settlement, although the exact terms of this agreement have never been made public.

The Authors Guild continued with its lawsuit until 2013, when the court, reviewing the case under the fair use provisions of the Copyright Act of 1976, found that Google was breaking no laws and, if anything, was improving book sales and access to information. The case was ultimately dismissed.

An appeal to the Second Circuit followed, but in 2015 the appellate court also sided with Google. The opinion, written by respected copyright authority Judge Pierre Leval, held that taking the original texts and adding search functionality to create snippets was transformative enough to be a fair use. The court also found that making only snippets of the text available was sufficient to protect the rights of copyright holders.

The Authors Guild of course disagreed. But once a case has been decided by a federal appeals court, the only other place it can be heard is the US Supreme Court, which is very selective about what cases it hears. The Guild filed a petition for writ of certiorari with the US Supreme Court asking the justices to determine whether the Second Circuit put too much emphasis on the transformative work factor of the fair use test, and whether fair use requires the creation of “new expression, meaning, or message.”

On April 18th of this year, the Supreme Court issued an order declining to hear the case, effectively making the Second Circuit’s decision the law of the land, and giving companies free rein to continue to pursue projects such as Google Books with complete disregard for the concerns of copyright holders.

Transformative, Maybe; Work, No
Unfortunately, copyright holders are not the only victims in this case. The truth is that this ruling breaks faith with the public at large because it allows the wholesale appropriation of an entire written culture by a private, profit-seeking company. The problem stems from two issues. One is the court’s rather strange idea of a transformative work. The other is its failure to account for the qualitative difference between the analog paper directories of the past and the way digital tools and massive data sets can repackage information.

The idea that a database of any kind—which is what Google Books is—constitutes a “transformative work” implies that it is somehow a work of art, or in itself a cultural artifact worthy of protection. But a database is just a glorified index. The indices of the past were meant to point you to original works that might have information you’re looking for. Helpful indeed, but hardly artistic expressions.

The Copyright Alliance is an advocacy group working on behalf of copyright holders (including artist guilds, publishing groups, unions, and corporations). Its CEO, Keith Kupferschmid, put it quite well in a post on the group’s website following the Supreme Court’s decision not to hear the appeal: “In cases like Google Books there is no new expressive work that results from the use. Rather, the works are being used to create a service that uses information about the works that is culled from the copying of them. Since there is no new expressive work produced as a result of the copying, the transformative use test is irrelevant.”

Furthermore, historically, indices neither included nor presented to you the works in their entirety. For that, you would need to consult the original, where you would encounter the author’s context in its totality (possibly then reaching a better understanding or appreciation of their work). And to do that, you’d have to either purchase a copy—thereby compensating the author, the publisher, and a bookstore—or you’d have to go to a library, which itself purchased the work. More importantly, a library is a shared public repository of cultural history.

The salient words there are shared and public. Works out of copyright go into something we’ve long called the public domain. It’s not called the Google domain. What Google has done is take an unimaginably massive piece of our shared literary heritage (approximately 25%, so far, of all books ever published, if you believe Google’s own estimates) and claim it as its own. Even if this effort had been undertaken on behalf of governments, so that we could all share ownership of this information, there would still be questions regarding copyright, compensation, and context. But Google didn’t hand over the project to the government, and it didn’t do it out of a selfless interest in preserving history or culture. It did it to make a lot of money. Every search on Google Books is a search during which Google is tracking your actions, monetizing your attention, and enriching its brand.

What to Do
There’s a wider issue here of how poorly the legal system is equipped to deal with changes in technology, and another of how well a staid judiciary can understand these concepts, a criticism commonly leveled at the Supreme Court. It might take brand new laws, or more tech-savvy judges, to chart an appropriate new course.

In this case, technology has changed the ability of companies to copy and repackage content in a way completely unforeseen by the 18th-century jurists who originated our ideas about fair use. One way to account for this change is by creating a firmer distinction between functional use and transformative use. Functional use describes efforts like Google Books to repackage and index copyrighted content without creating a new expressive work. Regrettably, recent jurisprudence has blurred this distinction, as judges are lost in thralldom to the gadgetry of the internet, according to Penn State Press Director Emeritus Sanford Thatcher. Writing in 2009, he saw a stream of decisions from the 9th Circuit as setting dubious precedents that presaged the Authors Guild v. Google decision. “Intellectually, these rulings stretch the natural meaning of ‘transformative’ well beyond the bounds of common sense,” he wrote.

Much of the current understanding of fair use seems to stem from the 1994 case Campbell v. Acuff-Rose Music, which might seem funny to those old enough to remember the Campbell in question: Luther Campbell, the lead singer of the juvenile rap group 2 Live Crew. (Incidentally, members of the group were also arrested on and acquitted of obscenity charges, another landmark case of sorts.) Campbell was sued for sampling, without permission, Roy Orbison’s “Oh, Pretty Woman” in order to make a crude parody. The court allowed it, citing parody as a “transformative use,” and referencing an influential article by none other than Judge Leval, two decades before he’d blow up his own reasoning.

Clearly, this type of transformative work (meager as it is) differs greatly from Google Books. The Copyright Alliance filed an amicus brief on behalf of the Authors Guild, in which they explain that difference succinctly: “Campbell arose in the context of a one-time use of an original song to create a parody—not, as here, a commercial business built on the systematic exploitation of copyrighted works created by others.”

The Alliance also pointed to an additional harm to authors not immediately obvious: “Google’s massive, unlicensed, unrestricted library supplants the creation of a licensed digital library governed by the terms that the books’ rightsholders would have desired. Thus, the harm to rightsholders includes not only lost licensing fees, but also the ability to control the terms and conditions of any such license, including data security measures.”

Whether a new functional-use test gets clarified or new laws arise to account for new technology, the application of copyright law must adapt. While the law remains murky, the effect of the Google Books ruling is very clear. It is abetting an ongoing transfer of the entirety of our literary cultural heritage from the public commons to a private company. It’s audacious, unprecedented, and obscene. It is not fair use. It is theft on an unimaginable scale.