Washington D.C. and the Work Ahead

On Greater Greater Washington, Tom MacWright recently wrote a blog entry highlighting the problems of access to the Washington D.C. Code. There is, first, a legal obstacle: Washington D.C. claims copyright over their laws, which is to say that it is illegal to reproduce them without permission of the city. Then, second, what is perhaps a more significant obstacle: they outsource the maintenance of their legal code.

The city of Washington D.C. long ago started paying WestLaw—and now LexisNexis—to turn the D.C. Council’s bills into laws. As a result, they now have neither the knowledge nor the infrastructure to maintain their own laws. The only way that D.C. can find out what their laws say is to pay LexisNexis to tell them. This is consequently true for the public, as well. If a resident of D.C.—like MacWright—wants to know what the law says, there’s no sense in asking (or FOIAing) the city, because the city has outsourced the process so completely that they know nothing.

MacWright has a few options to know what the law says. The first is to travel to a library on each occasion that he wants to know something (assuming he can find one that has a current copy of the DC Code), and read it there. The second is that he can buy a copy, for $867.00. And the third is that he can use the DC Code website, maintained by WestLaw, which is every bit as awful as any other state code website.

So how is the D.C. Code to get the State Decoded treatment? How can a digital copy be imported into the software, for the general public benefit? It can’t be FOIAed from D.C. Council, since they don’t have it. It’s clearly impractical to scan in 25 volumes of hardbound books. Normally that would leave scraping the website, but WestLaw’s website has a EULA that prohibits copying material off of the website. WestLaw has been hired to do for the Washington D.C. government what they cannot or will not do for themselves—post laws to the web—and because they choose to impose copyright restrictions, that is a legal barrier preventing that material from being reused.

Normally, this would be the end of the road—Washington D.C. would have cut off their code from being improved (or even reused) in any way by third parties. In this case, though, the story has a different ending. Public.Resource.Org has taken the surprising and admirable tack of purchasing all of the volumes of the D.C. Code, slicing them up, scanning them in, OCRing them, and distributing them for free.

DC Code Assembly Line

Deglued bundles on the bottom right awaiting their turn on the two high-speed scanners. When they are completed, the bundles are put upper right to await a QA pass.

All of the volumes can be downloaded as PDFs or, via the Internet Archive, in nearly any other file format one can think of. For those who would like a print copy for a more reasonable price, print-on-demand service Lulu sells each volume for just $12.

Lest the motivations of Public.Resource.Org be unclear, a “Proclamation of Digitization” accompanies the release, citing a pair of Supreme Court rulings (“the authentic exposition and interpretation of the law, which, binding every citizen, is free for publication to all, whether it is a declaration of unwritten law, or an interpretation of a constitution or a statute”) and declaring that “any assertion of copyright by the District of Columbia or other parties on the District of Columbia Code is declared to be NULL AND VOID as a matter of law and public policy as it is the right of every person to read, know, and speak the laws that bind them.” The organization mailed out elaborate packages, containing portions of the D.C. Code, to announce its availability. (You can see my own unboxing photos.)

All that remains is for somebody to marry this source of data with The State Decoded as, indeed, somebody is already talking about doing. The D.C. Council or WestLaw may not be happy about this—it’s quite possible that one or both entities will take legal action to halt this—but I’m confident that it will be found that the law supports making those very laws public.

No Love from LexisNexis

This is what the table of contents looks like in LexisNexis’s printed edition of the Code of Virginia:

Lexis Table of Contents

I was a bit stunned the first time I saw this. It’s just word soup. There’s simply no effort to make it legible. No thought has gone into this. There is, in short, no love.

I feel like, as a culture, we basically understand how to make tables of contents. Right? Grabbing a few books off my desk, more or less at random, I thought I’d compare Lexis’s table of contents to those of others. Here’s The Chicago Manual of Style:

Chicago Table of Contents

Designing with Web Standards, by Jeffrey Zeldman and Ethan Marcotte:

Zeldman Table of Contents

And Robert Bringhurst’s The Elements of Typographic Style:

Bringhurst Table of Contents

These are all different, but via various small design cues they all manage to accomplish the same thing: they make it easy for somebody to browse through the contents of the text and locate the specific section that they need. Microsoft Word, right out of the box, will happily render a table of contents in styles reminiscent of all of these, with minimal effort.

LexisNexis isn’t even trying. I can’t pretend to know why. But with this as the current state of affairs in the presentation of legal information, it’s trivial for The State Decoded—or anybody with a copy of Word—to improve upon it.

Typeface Authority

With the design process for The State Decoded underway, we’re putting a lot of thought into typography. Helpful to this process has been both Ruth Anne Robbins’ “Painting with print: Incorporating concepts of typographic and layout design into the text of legal writing documents” and Derek H. Kiernan-Johnson’s “Telling Through Type: Typography and Narrative in Legal Briefs.”

Both of those papers are conceptual in nature, so they’re complemented nicely by Errol Morris’ two-part series [1, 2] about the results of a quiz that he ran on the New York Times website, ostensibly measuring readers’ optimism. In fact, he was measuring the impact of different typefaces on readers’ responses. Those who doubt that a typeface could have much of an impact on the credulity of a reader should consider the effect of Comic Sans, which Morris discovered (unsurprisingly) correlated strongly with incredulity on the part of readers. Of the six typefaces that he tested (Baskerville, Comic Sans, Computer Modern, Georgia, Helvetica, and Trebuchet), Baskerville proved the most persuasive. The effect was small, but significant.

This is the sort of consideration that is clearly lacking in the present rendering of laws, both online and in print. (Typographically, LexisNexis’s printed state codes are a train wreck.) It’s also precisely the consideration that will set apart those sites based on The State Decoded, or anybody who cares to employ the project’s stylesheets. There will be more news about this ongoing design work in the weeks ahead.

Answering Legal Questions with Google

Throughout the planning process for the State Decoded project, I have made the basic assumption that the primary source of traffic for implementations of the software would be from search engines. People typing in things like “‘following too closely’ virginia,” “boundary law in kentucky,” or “grand larceny illinois bad checks,” who would be led directly to the law that in question, presented within a context that would make that law understandable to them. This usage pattern is one of the major concepts behind the project.

Nine weeks after launching the Virginia website, it’s been indexed by Google thoroughly, though it has few enough third-party links that it has a PageRank of just 4 (out of 10). In the past week, Virginia Decoded has had 458 keyword-bearing referrers from Google. Not one of those search phrases has been used more than 7 times. 4 of them have been used 3–7 times. 14 of them have been used 2 times. The remaining 440 were used just 1 time. This is a very flat distribution. I’d call it a “long tail” of search results, but it’s all tail—basically a snake.

Many of these search terms are extremely specific (e.g., “what does it mean when an employee returns to his position on an active employment basis for 45 consecutive calendar days or longer any succeeding period of disability shall constitute a new period of short term disability”), and dozens are for specific sections of the Code (e.g., “18.2-460″). Many appear to be people trying to solve problems (e.g. “transfer an inspection sticker to a new windshiled [sic],” “virginia code failure to file tax,” “what is the penalty for a failure to appear in va”).

Some of these search terms return results that would not otherwise yield useful results from the official Code of Virginia website. For instance, “why va. code 18.2-361(a) is unconstitutional” returns Virginia Decoded’s § 18.2-361 record because it includes all court decisions that cite § 18.2-361, notably William Scott McDonald, a/k/a William Scott MacDonald v. Commonwealth, which has an court-provided abstract that reads:

This Court finds that Code Section 18.2-361(A) is constitutional as applied to appellant because his violations involved minors and merit no protection under the Due Process Clause of the Fourteenth Amendment; appellants convictions of four counts of sodomy are affirmed

§ 18.2-361 is still on the books, and somebody looking at the law on the state’s official website would have no way of knowing that portions of it have been judged unconstitutional by the Virginia Court of Appeals. By putting court decisions on the same page as the law, Google inferred that they are related, indexed all of the content together, and returned this far-more-useful result to somebody posing a basic question about the law’s constitutionality.

People referred to Virginia Decoded via a Google search stick around for a minute and a half, which isn’t brilliant, but it’s not too shabby, either. They look at an average of 2.68 pages, which is also decent. With just 458 keyword-bearing Google referrers in the past week, though, there’s clearly a lot of room for improvement in overall referrers, something that will be helped with the passage of time, as more sites link to Virginia Decoded and its PageRank climbs.

The plan here was to turn entire state codes into enormous targets for search traffic to help people solve problems and better understand the laws that govern them. Traffic records bear out that at least the former half of that plan is being fulfilled. That accomplished, I can concentrate more on the latter, which was always going to be the real work.

Outsourcing Online Code Display Hinders Innovation

Although most states provide a copy of their laws online, some outsource this to LexisNexis. Arkansas, Colorado, Georgia, Mississippi, Tennessee, and Washington D.C. all do so. While this might seem like a decent solution at first blush, it’s actually incredibly problematic, and serves as a major obstacle to innovation within those states.

It is self-evident that state laws ought to be disseminated as widely as possible and be as accessible as possible. To follow the law, people must first be able to know what it says. Projects like the State Decoded (or Legix.info’s California Codes, or OregonLaws.org, or Justia’s US Law directory) rely on access to the text of the law. These services take the raw material of the laws and make substantial improvements to them, making this important information more accessible and understandable than they are on their state-sanctioned websites.

When states punt to LexisNexis, they make their state codes a dead end.

Washington D.C. used to provide their code on dccouncil.washington.dc.us. No longer. Now it’s found only on LexisNexis’s website. Any D.C. resident who wants to read the code—their own laws—must first agree to LexisNexis’s terms of use, which allow visitors merely “the right to download using the commands of the Online Services and store in machine-readable form, primarily for that Authorized User’s exclusive use, a single copy of insubstantial portions of those Authorized Legal Materials.” LexisNexis goes on to explain that, “for the avoidance of doubt, downloading and storing Materials in an archival database is prohibited.” LexisNexis’s terms of use make it impossible to do anything of interest with D.C.’s code. No value can be added to it. The strangely specific prohibition on storing data in a database (e.g., Zotero) ensures that.

The problem is not LexisNexis per se, but rather their strikingly restrictive licensing terms of materials that, were it not for those terms, could be reproduced freely.

Unless Arkansas, Colorado, Georgia, Mississippi, Tennessee, and Washington D.C. provide bulk downloads—which is rare—or can be persuaded to provide an electronic copy of their laws, LexisNexis’s licensing terms are an immovable object that prevents the advance of any private-sector effort to enhance the display of those laws.