The Role of Libraries in the Digital Age

Oct. 3, 2023 [technology]

Municipal libraries have been given a special exemption to operate outside the typical copywrong framework through a lending right scheme. I welcome any chink in the armor of the monopolistic copywrong industry, even if it comes by means of state-sanctioned robbery. The earliest libraries were vehemently opposed by publishers, who saw them as a way of undercutting their profits. And yet, centuries later, there is still a thriving industry for authoring books. It is as though books are just physical text files, and the same dynamics found in file sharing (piracy, according to midwits) play out in library lending.

The paradigm that brought public libraries into existence has evaporated in the wake of the digital age, and a new debate is raging over the lending of digital copies of books… which I suggest is unimportant next to a larger failing of today’s libraries. In an age when most textual material is consumed over HTTP, they have neglected to gather collections of web documents of any kind. Why is it that an exceedingly small number of good samaritan organizations have ended up as the proverbial eggs in one basket when it comes to web archival? Counting every substantially recognized effort, we basically have only the Internet Archive and Archive.Today to look to for page preservation.

And with so few outfits doing the world’s heavy lifting, it has become simple to pressure these good samaritans [1] into memory-holing inconvenient parts of their archives. The Wayback Machine and, to some extent, Archive.Today hold an effective duopoly over cached historic copies of large swaths of the web. This crucial role has never been spun up in a decentralized way, even though the tools to make it possible already exist: programs such as HTTrack, or even the built-in page caching functionality of self-hostable web crawlers such as YaCy.

As libraries nearly universally provide internet access for visitors, it is not a stretch to imagine that these institutions could also run web indexing, however rudimentary. Without digging too deeply, I suspect this lack of foresight comes down to their lazy outsourcing of digital infrastructure to remote managed services, or to the cultural atmospherics that naturally arise from the disproportionate hiring of people who fundamentally don’t understand digital technology. All such a solution would need is some of the readily available bandwidth, a dedicated server, and a way of exposing the cached pages in a searchable form to, at the very least, systems on the local network.
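
To give a sense of how little machinery the pieces above require, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any existing library system: the table layout and the function names (open_archive, capture, captures_of, search) are all made up for this example, and a real deployment would more likely lean on HTTrack or YaCy than on a hand-rolled store.

```python
import sqlite3
import time
import urllib.request


def open_archive(path=":memory:"):
    """Open (or create) the capture store. The schema is illustrative only."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS captures ("
        "url TEXT NOT NULL, captured_at INTEGER NOT NULL, body TEXT NOT NULL)"
    )
    return db


def fetch(url):
    """Download a page over HTTP. A real crawler would also honor robots.txt."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def capture(db, url, fetcher=fetch, now=None):
    """Store a timestamped copy of the page at `url` and return the timestamp."""
    captured_at = int(time.time()) if now is None else now
    db.execute(
        "INSERT INTO captures (url, captured_at, body) VALUES (?, ?, ?)",
        (url, captured_at, fetcher(url)),
    )
    db.commit()
    return captured_at


def captures_of(db, url):
    """Return the capture timestamps recorded for a URL, oldest first."""
    rows = db.execute(
        "SELECT captured_at FROM captures WHERE url = ? ORDER BY captured_at",
        (url,),
    )
    return [row[0] for row in rows]


def search(db, phrase):
    """Return distinct URLs whose stored text contains `phrase`.

    Uses SQL LIKE, so matching is case-insensitive for ASCII and any
    % or _ in `phrase` acts as a wildcard.
    """
    rows = db.execute(
        "SELECT DISTINCT url FROM captures WHERE body LIKE ? ORDER BY url",
        ("%" + phrase + "%",),
    )
    return [row[0] for row in rows]
```

Calling capture(db, url) on a schedule builds up the chronological record, captures_of(db, url) answers whether a given page was ever saved, and search(db, phrase) is the crudest possible version of a searchable index. Putting these behind a small web page on the local network is all the "exposing" that the paragraph above asks for.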

But what would an ideal situation look like? How about, for example, being able to navigate to a resource like “webarchive.stchlibrary.org” (St. Charles County Library) on any one of the many municipal library websites to check whether there are any captures of a desired page? The redundancy of so many organizations crawling the web would make chronological captures all the more difficult to censor. And the workload involved in setting up indexing is no excuse not to do so. In fact, I would argue that it is more work to meticulously catalogue and preserve thousands of microfiche slides of decades of newspapers, which so many libraries already do. It is a shame that libraries have not stepped up, given the criticality of the role the web assumes in contemporary society.