Analyzing Link Topology to Quantify the Degree of Planned Obsolescence in Online Digital Humanities Projects

1. Abstract

Many online projects in the digital humanities have an implied planned obsolescence, which means that they will degrade over time once their content and software libraries cease to receive updates (Fitzpatrick 2011). We presented papers at Digital Humanities 2017, 2018, and 2019 that explored the abandonment and average lifespan of online projects in the digital humanities (Meneses and Furuta 2017), contrasted how things had changed over the course of a year (Meneses et al. 2018), and introduced a preservation strategy based on creating standalone software executables (Meneses et al. 2019). However, managing and characterizing the degradation of online digital humanities projects is a complex and pressing problem that demands further analysis.

In this sense “planned obsolescence” is a nuanced designation, as there are many cases of successful projects in digital humanities that are shifting their focus from active development to data management (for example: http://cervantes.dh.tamu.edu). These are cases where a project’s online presence has not received updates for some time, but its online tools are stable and continue to be accessed by its users. However, if updates are not applied to the infrastructure or content of a project over time, web requests will eventually start generating errors on the server or the client, affecting the overall user experience (Nowviskie and Porter 2010). These cases illustrate why the rules for traditional resources do not fully apply and why new metrics are needed to identify issues concerning online projects in the digital humanities.

In this study we examine more closely the distinctive signs of abandonment in order to quantify the planned obsolescence of online digital humanities projects. Our workflow draws on every project included in the Book of Abstracts published after each Digital Humanities conference from 2006 to 2019. We periodically create a set of WARC files for each project, which we process using Python (van Rossum 1995) and Apache Spark (Apache Software Foundation 2017) to statistically analyze the retrieved HTTP response codes, the number of redirects, the DNS metadata, and the contents and links returned by traversing the base node. This combination of metrics and techniques has allowed us to assess the degree of change of a project over time. As one of the results of our 2019 presentation, we claimed that the most important signature of degradation comes from assessing the validity and overall health of the topology of links in a project. Thus, the focus of our study is analyzing this key signature.
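As a rough illustration of what assessing the health of a project's link topology might involve, the following minimal Python sketch extracts the links from a page and summarizes the HTTP status codes recorded for them. It is not the workflow used in this study: the function names, the sample URLs, and the "share of 2xx responses" score are all hypothetical, and the status codes are assumed to have been collected beforehand (for example, from previously captured responses).

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the outgoing links of a page, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def status_class(code):
    """Group a status code into 2xx/3xx/4xx/5xx, or 'unreachable' if missing."""
    if code is None:
        return "unreachable"
    return f"{code // 100}xx"


def link_health(links, status_by_url):
    """Count links per status class and return the share answering 2xx.

    Assumption: only 2xx responses count as healthy; this score is a
    hypothetical illustration, not the metric defined by the study.
    """
    counts = Counter(status_class(status_by_url.get(url)) for url in links)
    healthy_share = counts["2xx"] / len(links) if links else 0.0
    return counts, healthy_share


# Hypothetical page and previously recorded status codes.
page = '<a href="about.html">About</a> <a href="/data">Data</a> <a href="http://other.example/x">External</a>'
links = extract_links("http://example.org/project/", page)
recorded = {links[0]: 200, links[1]: 404}  # the third link never answered
counts, healthy_share = link_health(links, recorded)
```

Repeating such a summary over successive captures of the same project would give one simple way to track how its links change over time; treating redirects or unreachable hosts differently is a design choice that this sketch leaves open.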

We acknowledge that research on the preservation of projects in the digital humanities is also carried out by other groups (Larrousse and Marchand 2019; Arneil, Holmes, and Newton 2019). However, our study differs in that it focuses on two points: first, identifying the signals of abandoned projects using computational methods; and second, quantifying their degree of abandonment. Ultimately, we intend this study to be a step towards better preservation strategies that account for the planned obsolescence of online digital humanities projects.

References

Apache Software Foundation. 2017. “Apache Spark: Lightning-Fast Cluster Computing.” April 11. http://spark.apache.org.

Arneil, Stewart, Martin Holmes, and Greg Newton. 2019. “Project Endings: Early Impressions From Our Recent Survey On Project Longevity In DH.” presented at the Digital Humanities 2019, Utrecht, The Netherlands, July 9.

Fitzpatrick, Kathleen. 2011. Planned Obsolescence: Publishing, Technology, and the Future of the Academy. New York: NYU Press.

Larrousse, Nicolas, and Joel Marchand. 2019. “A Techno-Human Mesh for Humanities in France: Dealing with Preservation Complexity.” presented at the Digital Humanities 2019, Utrecht, The Netherlands, July 9.

Meneses, Luis, and Richard Furuta. 2017. “Shelf Life: Identifying the Abandonment of Online Digital Humanities Projects.” presented at the Digital Humanities 2017, Montreal, Canada, August 8.

Meneses, Luis, Jonathan Martin, Richard Furuta, and Ray Siemens. 2018. “Part Deux: Exploring the Signs of Abandonment of Online Digital Humanities Projects.” presented at the Digital Humanities 2018, Mexico City, June 26.

———. 2019. “A Framework to Quantify the Signs of Abandonment in Online Digital Humanities Projects.” presented at the Digital Humanities 2019, Utrecht, The Netherlands, July 9.

Nowviskie, Bethany, and Dot Porter. 2010. “The Graceful Degradation Survey: Managing Digital Humanities Projects Through Times of Transition and Decline.” presented at the 2010 Conference of the Alliance of Digital Humanities Organizations. http://dh2010.cch.kcl.ac.uk/academic-programme/abstracts/papers/html/ab-722.html.

van Rossum, Guido. 1995. “Python Tutorial, Technical Report CS-R9526.” Amsterdam: Centrum voor Wiskunde en Informatica (CWI). https://ir.cwi.nl/pub/5007/05007D.pdf.

Luis Meneses (ldmm@uvic.ca), Electronic Textual Cultures Lab, University of Victoria; Jonathan Martin, King’s College London; Richard Furuta, Electronic Textual Cultures Lab, University of Victoria; and Ray Siemens, Center for the Study of Digital Libraries, Texas A&M University
