The place of models and modelling in Digital Humanities Intersections with a Research Software Engineering perspective

1. Abstract

This paper1 aims to bridge Digital Humanities (DH) and Research Software Engineering (RSE) communities. It argues that the production of models is the core contribution of RSE to the epistemology of DH. We adopt an inclusive definition of models and modelling (see Ciula et al. 2018) which spans the whole range from ‘deformative’ to empirical modelling (see Smithies 2017: 168), including formal or predictive modelling (Joslyn and Turchin 1993), and the technical solutions produced in the process as well as the know-how, languages and documentation which accompany this production. From this wide perspective, models are also artefacts which can be studied across the history of science and of the humanities tradition (see Bod 2018) and in comparison with other modelling practices in science. RSE practice is grounded on a strong conscience of the experimental apparatus and the iterative critique of models built (and often deflated) for a purpose. The challenge is to recognise the idiosyncrasy and situatedness of modelling practices and artefacts while devising methods to expose the scalability of the underlying workflows and modelling processes.

We reflect on the epistemology of DH from the practical perspective of our RSE lab - King’s Digital Lab (KDL)2 - and the research processes embedded in our Software Development Lifecycle (fig.1). The human element is at the core of the technical ecosystem we research and operate in. We acknowledge that KDL models and modelling are co-constitutive of human expertise, technical systems and operational methods, all aspiring to an environment conducive of open knowledge. This is not only in terms of development and management approaches via adoption of open standards, open source and exposure of open data but also, more fundamentally, in sharing (achievements and struggles around) our processes and in promoting open models. We will use project-specific examples, including data modelling and knowledge representation practices, to demonstrate how some of our research into model-making processes challenge the perception of the technical work of RSE within DH as a stale, mechanistic and uncritical procedural activity.

Figure 1 KDL Software Development Lifecycle (SDLC) (King’s Digital Lab 2019) mapped to Agile DSDM project phases (Agile Business Consortium 2014).

KDL operates within a rather unique context. It claims its origins in the pioneering work of colleagues at King’s College London working in applied computing in the Humanities already in the 1970s (see Short et al. 2012). However, the crossings between RSE and DH communities at King’s and internationally are only recently being highlighted and explored (e.g. Gold 2009; Smithies 2019; DHTech Group 2019). We argue that the study and building of models is one of the dimensions via which these crossings emerge more vividly with substantial epistemological implications and innovative ramifications for the field of DH as a whole.

The core practice of research in DH is modelling (e.g. cf. McCarty 2005: 20–72; Buzzetti 2002; Beynon and McCarty 2006; Flanders and Jannidis 2015, 2018), which implies the translation of complex systems of knowledge or conceptual frameworks into computationally processable models or operational frameworks.

Gooding (2003) claims that in experimental settings computational approaches are analogues to other processes of abstraction, measurement and contextual interpretation, whereby reduction of complexity is followed by expansion in the guise of a double funnel-shaped process (Gooding 2003: fig 13.4). We can trace these processes of reduction and expansion also in the RSE context, where, for example, operationalisation makes models formalised into snippets of code or software components. Other languages than translation into code play a role in the process, however.3

The paper will reflect on models as artefacts of different kind expressed via a variety of languages, including but not limited to computational models, produced during several phases of the SDLC (see Ciula and Smithies, forthcoming) such as:

While specific instantiations of models (for example in relational databases) can have a rather short life, in the RSE context and projects where they were designed and developed, they often represented innovative solutions which have a longstanding effect. Indeed, while models are temporary pragmatic solutions to address specific project challenges yet, at the scale KDL operates, they are the backbone of the team's tacit knowledge as well as the building blocks towards more generalisable and re-usable approaches. They mediate and bridge the layers of the Lab’s socio-technical system: team expertise, data and technical systems (fig. 2).

Figure 2 Multilayered socio-technical system of the Lab where concentric circles denote co-constitution of team expertise, data and technical systems (Ciula and Smithies, forthcoming: fig. 5).

While we can refine existing approaches to sustain and expose modelling efforts in SDLC cycles which rely on RSE best practices such as attention to documentation and re-use,4 there is also room for more innovative approaches, including a design first culture, workflow integration across RSE roles, the assessment and potential adoption of a set of modelling notation languages for the exposure mechanisms for our models.

In alignment with a “critical modelling” approach (see Bode 2020), but also with a material culture and media literacy perspective, in this paper we aim to reflect upon models by looking at the wide epistemological implications of their production and use, at the responsibilities of modellers,5 at how models come to be and what effect they have in the resources they contribute to instantiate and hence in interpretative processes of expansion as well as reduction.

Notes

[1] A preliminary version of this paper was presented at the symposium Computational Text Analysis and Historical Change, held at Humlab, Umeå University (Sweden), 4-6 September 2019.

[2] The authors currently cover different roles at King’s Digital Lab (KDL), a Research Software Engineering (RSE) team hosted by the Faculty of Arts & Humanities at King’s College London, which provides software development and infrastructure to departments in the Faculty of Arts & Humanities while also collaborating with Social Science & Public Policy as well as a range of external partners in the higher education and cultural heritage sectors.

[3] Data modelling, for example, is informed if not driven by communication and collaborative reasoning around more or less standardised graphical representations and notations in phases of reverse engineering as well as design methods. Note that KDL intend and use design methods in a wide sense ranging from techniques of requirements elicitation in pre-project analysis to data modelling and wireframing in evolutionary development (see Bennet et al. 2005). Equally, the re-integration or expansion of modelling efforts into interpretative frameworks usually rely on verbal and visual language to document code, or to explain the results of an experiment (Ciula and Marras 2019: 39).

[4] With this respect see, for example, the KDL checklist for assessment of digital outputs within the UK Research Framework Exercise (Ciula 2019).

[5] Models contribute to define and redefine objects of study which come charged with layers of scholarship and analysis, with previous selections, bias and political as well as ethical responsibilities. As the creators of new memory regimes and intermediaries to the past engaged in modelling efforts which interact and affect the materiality of our objects of study (Ciula 2017a), we bear responsibilities (Ciula 2017b). In line with ongoing discussions around the representativeness and constraints of the digital archive (e.g. Dahlström 2010; Hitchcock 2013; Prescott 2015), Bode (2018, 2020) presented some lucid analysis around modellers’ responsibilities in digital literary studies by exposing the gaps that propagate from produced literary works we know of to material preserved in the analogue archive, to selections of works that make it into digital archives to further reductions in the creation of a corpus of analysis and, last but not least, in the application of statistical modelling techniques which dictate additional powerful yet limiting constraints if not contextualised critically within an interlocked chain of bias.

References

Arianna Ciula (arianna.ciula@kcl.ac.uk), King's College London, United Kingdom, Paul Caton , King's College London, United Kingdom, Ginestra Ferraro , King's College London, United Kingdom, Brian Maher , King's College London, United Kingdom, Geoffroy Noël , King's College London, United Kingdom and Miguel Vieira , King's College London, United Kingdom

Theme: Lux by Bootswatch.