The place of models and modelling in Digital Humanities: Intersections with a Research Software Engineering perspective

1. Abstract

This paper¹ aims to bridge Digital Humanities (DH) and Research Software Engineering (RSE) communities. It argues that the production of models is the core contribution of RSE to the epistemology of DH. We adopt an inclusive definition of models and modelling (see Ciula et al. 2018) which spans the whole range from ‘deformative’ to empirical modelling (see Smithies 2017: 168), including formal or predictive modelling (Joslyn and Turchin 1993), and the technical solutions produced in the process as well as the know-how, languages and documentation which accompany this production. From this wide perspective, models are also artefacts which can be studied across the history of science and of the humanities tradition (see Bod 2018) and in comparison with other modelling practices in science. RSE practice is grounded in a strong awareness of the experimental apparatus and the iterative critique of models built (and often deflated) for a purpose. The challenge is to recognise the idiosyncrasy and situatedness of modelling practices and artefacts while devising methods to expose the scalability of the underlying workflows and modelling processes.

We reflect on the epistemology of DH from the practical perspective of our RSE lab - King’s Digital Lab (KDL)² - and the research processes embedded in our Software Development Lifecycle (fig. 1). The human element is at the core of the technical ecosystem we research and operate in. We acknowledge that KDL models and modelling are co-constitutive of human expertise, technical systems and operational methods, all aspiring to an environment conducive to open knowledge. This applies not only to development and management approaches, via the adoption of open standards, open source and exposure of open data, but also, more fundamentally, to sharing (achievements and struggles around) our processes and to promoting open models. We will use project-specific examples, including data modelling and knowledge representation practices, to demonstrate how some of our research into model-making processes challenges the perception of the technical work of RSE within DH as a stale, mechanistic and uncritical procedural activity.

Figure 1 KDL Software Development Lifecycle (SDLC) (King’s Digital Lab 2019) mapped to Agile DSDM project phases (Agile Business Consortium 2014).

KDL operates within a distinctive context. It traces its origins to the pioneering work of colleagues at King’s College London working in applied computing in the Humanities as early as the 1970s (see Short et al. 2012). However, the crossings between RSE and DH communities at King’s and internationally have only recently been highlighted and explored (e.g. Gold 2009; Smithies 2019; DHTech Group 2019). We argue that the study and building of models is one of the dimensions along which these crossings emerge most vividly, with substantial epistemological implications and innovative ramifications for the field of DH as a whole.

The core practice of research in DH is modelling (cf. e.g. McCarty 2005: 20–72; Buzzetti 2002; Beynon et al. 2006; Flanders and Jannidis 2015, 2018), which implies the translation of complex systems of knowledge or conceptual frameworks into computationally processable models or operational frameworks.

Gooding (2003) claims that in experimental settings computational approaches are analogous to other processes of abstraction, measurement and contextual interpretation, whereby reduction of complexity is followed by expansion in the guise of a double funnel-shaped process (Gooding 2003: fig. 13.4). We can also trace these processes of reduction and expansion in the RSE context, where, for example, operationalisation formalises models into snippets of code or software components. Languages other than code also play a role in the process, however.³

The paper will reflect on models as artefacts of different kinds expressed via a variety of languages, including but not limited to computational models, produced during several phases of the SDLC (see Ciula and Smithies, forthcoming) such as:

negotiations around the meaning of the project units of analysis documented in diagrams and definitions which shape requirements and an agreed project language;
paper or whiteboard sketches used to draft the solution architecture for a project in its feasibility assessment;
wireframes and static mockups of user journeys;
data models implementing the logical structure of a database;
statistical models implemented with ad hoc algorithms and code or relying on tested formulas and existing libraries.

While specific instantiations of models (for example in relational databases) can have a rather short life, in the RSE context and the projects where they were designed and developed they often represented innovative solutions with a long-lasting effect. Indeed, models are temporary pragmatic solutions to specific project challenges; yet, at the scale at which KDL operates, they are the backbone of the team’s tacit knowledge as well as the building blocks of more generalisable and re-usable approaches. They mediate and bridge the layers of the Lab’s socio-technical system: team expertise, data and technical systems (fig. 2).

Figure 2 Multilayered socio-technical system of the Lab where concentric circles denote co-constitution of team expertise, data and technical systems (Ciula and Smithies, forthcoming: fig. 5).

While we can refine existing approaches to sustain and expose modelling efforts in SDLC cycles which rely on RSE best practices such as attention to documentation and re-use,⁴ there is also room for more innovative approaches, including a design-first culture, workflow integration across RSE roles, and the assessment and potential adoption of a set of modelling notation languages to serve as exposure mechanisms for our models.

In alignment with a “critical modelling” approach (see Bode 2020), but also with a material culture and media literacy perspective, in this paper we aim to reflect upon models by looking at the wide epistemological implications of their production and use, at the responsibilities of modellers,⁵ and at how models come to be and what effect they have on the resources they help to instantiate, and hence on interpretative processes of expansion as well as reduction.

Notes

[1] A preliminary version of this paper was presented at the symposium Computational Text Analysis and Historical Change, held at Humlab, Umeå University (Sweden), 4-6 September 2019.

[2] The authors currently hold different roles at King’s Digital Lab (KDL), a Research Software Engineering (RSE) team hosted by the Faculty of Arts & Humanities at King’s College London. KDL provides software development and infrastructure to departments in the Faculty, while also collaborating with Social Science & Public Policy as well as a range of external partners in the higher education and cultural heritage sectors.

[3] Data modelling, for example, is informed if not driven by communication and collaborative reasoning around more or less standardised graphical representations and notations in phases of reverse engineering as well as in design methods. Note that KDL understands and uses design methods in a broad sense, ranging from techniques of requirements elicitation in pre-project analysis to data modelling and wireframing in evolutionary development (see Bennett et al. 2005). Equally, the re-integration or expansion of modelling efforts into interpretative frameworks usually relies on verbal and visual language to document code, or to explain the results of an experiment (Ciula and Marras 2019: 39).

[4] In this respect see, for example, the KDL checklist for the assessment of digital outputs within the UK Research Excellence Framework (Ciula 2019).

[5] Models contribute to defining and redefining objects of study which come charged with layers of scholarship and analysis, with previous selections, bias and political as well as ethical responsibilities. As the creators of new memory regimes and intermediaries to the past engaged in modelling efforts which interact with and affect the materiality of our objects of study (Ciula 2017a), we bear responsibilities (Ciula 2017b). In line with ongoing discussions around the representativeness and constraints of the digital archive (e.g. Dahlström 2010; Hitchcock 2013; Prescott 2014), Bode (2018, 2020) presents a lucid analysis of modellers’ responsibilities in digital literary studies. She exposes the gaps that propagate from the literary works we know to have been produced, to the material preserved in the analogue archive, to the selections of works that make it into digital archives, to further reductions in the creation of a corpus of analysis and, last but not least, to the application of statistical modelling techniques, which impose additional powerful yet limiting constraints if not contextualised critically within an interlocked chain of bias.

References

Agile Business Consortium. The DSDM Agile Project Framework Handbook. DSDM Consortium, 2014. https://www.agilebusiness.org/resources/dsdm-handbooks.
Bennett, Simon, Steve McRobb, and Ray Farmer. Object-Oriented Systems Analysis and Design Using UML. McGraw Hill Higher Education, 2005.
Beynon, Meurig, Steve Russ, and Willard McCarty. ‘Human Computing—Modelling with Meaning’. Literary and Linguistic Computing 21, no. 2 (6 January 2006): 141–57. https://doi.org/10.1093/llc/fql015.
Bod, Rens. ‘Modelling in the Humanities: Linking Patterns to Principles’. Historical Social Research, Supplement, no. 31 (2018): 78–95. https://doi.org/10.12759/hsr.suppl.31.2018.78-95.
Bode, Katherine. A World of Fiction: Digital Collections and the Future of Literary History. University of Michigan Press, 2018.
———. ‘Why You Can’t Model Away Bias’. Modern Language Quarterly 81, no. 1 (1 March 2020): 95–124. https://doi.org/10.1215/00267929-7933102. preprint: https://katherinebode.files.wordpress.com/2019/08/mlq2019_preprintbode_why.pdf.
Buzzetti, Dino. ‘Digital Representation and the Text Model’. New Literary History 33, no. 1 (2002): 61–88. https://doi.org/10.1353/nlh.2002.0003.
Ciula, Arianna. ‘Digital Humanities and Practical Memory: Modelling Textuality’. Estudos Em Comunicação 25 (21 December 2017): 7–17.
———. ‘Modelling Textuality: A Material Culture Framework’. In Advances in Digital Scholarly Editing. Papers Presented at the DiXiT Conferences in The Hague, Cologne, and Antwerp, edited by Peter Boot, Anna Cappellotto, Wout Dillen, Franz Fischer, Aodhán Kelly, Andreas Mertgens, Anna-Maria Sichani, Elena Spadini, and Dirk van Hulle, 91–97. Leiden: Sidestone Press, 2017. https://www.sidestone.com/books/advances-in-digital-scholarly-editing.
———. ‘KDL Checklist for Digital Outputs Assessment (Version 2.0)’. Zenodo, 2019. http://doi.org/10.5281/zenodo.3361580.
Ciula, Arianna, Øyvind Eide, Cristina Marras, and Patrick Sahle, eds. ‘Models and Modelling between Digital & Humanities: A Multidisciplinary Perspective’. Historical Social Research, Supplement, no. 31 (2018). https://doi.org/10.12759/hsr.43.2018.4.343-361.
Ciula, Arianna, and Cristina Marras. ‘Exploring a Semiotic Conceptualisation of Modelling in Digital Humanities Practices’. In Meanings & Co: The Interdisciplinarity of Communication, Semiotics and Multimodality, edited by Alin Olteanu, Andrew Stables, and Dumitru Bortun, 6:33–52. Numanities - Arts and Humanities in Progress. Springer International Publishing, 2019. https://www.springer.com/us/book/9783319919850.
Ciula, Arianna, and James Smithies. ‘Sustainability and Modelling at King’s Digital Lab: Between Tradition and Innovation’. In On Making in the Digital Humanities: Essays on the Scholarship of Digital Humanities Development in Honour of John Bradley, edited by Julianne Nyhan, Geoffrey Rockwell, and Stéfan Sinclair. London: University College London Press, forthcoming.
Dahlström, Mats. ‘Critical Editing and Critical Digitisation’. In Text Comparison and Digital Creativity: The Production of Presence and Meaning in Digital Text Scholarship, edited by Wido van Peursen, Ernst D. Thoutenhoofd, and Adriaan van der Weel, 79–97. Brill Academic Pub, 2010.
DHTech Group. ‘DH Research Software Engineers - For We Are Many’. DHTech, 2019. https://dh-tech.github.io/dhrse-whitepaper/.
Flanders, Julia, and Fotis Jannidis. ‘Knowledge Organization and Data Modeling in the Humanities’. White paper, 2015. http://www.wwp.northeastern.edu/outreach/conference/kodm2012/flanders_jannidis_datamodeling.pdf.
———, eds. The Shape of Data in Digital Humanities: Modeling Texts and Text-Based Resources. Digital Research in the Arts and Humanities. London: Routledge, Taylor & Francis Group, 2018.
Gold, Nicolas. ‘Service-Oriented Software in the Humanities: A Software Engineering Perspective’. Digital Humanities Quarterly, no. 3.4 (2009). http://digitalhumanities.org:8081/dhq/vol/3/4/000072/000072.html.
Gooding, David. ‘Varying the Cognitive Span: Experimentation, Visualization, and Computation’. In The Philosophy of Scientific Experimentation, edited by Hans Radder, 255–84. Pittsburgh, Pa.: University of Pittsburgh Press, 2003.
Hitchcock, Tim. ‘Confronting the Digital’. Cultural and Social History 10, no. 1 (1 March 2013): 9–23. https://doi.org/10.2752/147800413X13515292098070.
Joslyn, Cliff A., and Valentin Turchin. ‘Model’. In Principia Cybernetica Web, edited by Francis Heylighen, Cliff A. Joslyn, and Valentin Turchin. Brussels, 1993. http://pespmc1.vub.ac.be/MODEL.html.
King’s Digital Lab. ‘Frequently Asked Questions for Project Partners | King’s Digital Lab’. King’s Digital Lab, 2019. https://www.kdl.kcl.ac.uk/how-we-work/faq-partners/.
———. ‘What Is KDL | King’s Digital Lab’. King’s Digital Lab, 2018. https://www.kdl.kcl.ac.uk/how-we-work/what-is-kdl/.
McCarty, Willard. Humanities Computing. Basingstoke [England]; New York: Palgrave Macmillan, 2005.
Prescott, Andrew. ‘I’d Rather Be a Librarian: A Response to Tim Hitchcock, “Confronting the Digital”’. Cultural and Social History 11, no. 3 (1 September 2014): 335–41. https://doi.org/10.2752/147800414X13983595303192.
Short, Harold, Julianne Nyhan, Anne Welsh, and Jessica Salmon. ‘“Collaboration Must Be Fundamental or It’s Not Going to Work”: An Oral History Conversation between Harold Short and Julianne Nyhan’. Digital Humanities Quarterly 6, no. 3 (2012). http://www.digitalhumanities.org/dhq/vol/6/3/000133/000133.html.
Smithies, James. ‘The Continuum Approach to Career Development: Research Software Careers in King’s Digital Lab’. King’s Digital Lab - Thoughts and Reflections from the Lab (blog), 7 February 2019. https://www.kdl.kcl.ac.uk/blog/rse-career-development/.
———. The Digital Humanities and the Digital Modern. Basingstoke: Palgrave Macmillan, 2017.