The Explainability Turn: Critical Digital Humanities and Explanation

1. Abstract

One of the key drivers of the attention given to explainability has been a wider public unease with the perceived bias of algorithms in everyday life, especially with the rise of automated decision processes and the calls for accountability in these systems (see Eubanks 2017, Kuang 2017, Noble 2018). Discourse and algorithms become a technique for exercising power, for example through “nudging” strategic behaviour and thereby shaping labour, both physical and mental. Through behavioural logics of control our everyday lives are subject to algorithmic management from increasingly prevalent hyper-individualised capillaries of power (Berry and Fagerjord 2017). These implications are increasingly discussed in the media and in politics, particularly in relation to a future dominated by technologies which are thought to have huge social consequences (Malabou 2019). These are important issues, but here I drill down to focus on the cognitive and explanatory issues in relation to the digital humanities. In the first half of the paper, I seek to think critically about the concept of explainability and its potential for developing a possible tactic in response to the wider toxicity generated by algorithmic governance. The aim is to offer an immanent critique of the notion of explainability. By immanent critique, I refer to an approach whereby the internal terms and concepts within a system are examined in relation to the claims they make about the world and the actuality of that world. In the second half of the paper, I seek to show how these claims about explainability can and should be taken up by digital humanities as an explicitly interdisciplinary theoretical concern for a new research programme.

I argue that the justificatory move to explainability as a panacea for computational systems is an important diagnostic site for interrogating their power and ubiquity. One of the most difficult tasks facing the digital humanist today is understanding the delegation and prescription of agency in digital infrastructures. It is clear that in the context of computational systems, the first important question we need to consider is: what counts as an explanation? Indeed, explanations are generally considered to tell us how things work and thereby to give us the power to change our environment in order to meet our own ends. In this sense of explanation, science is often supposed to be the best means of generating explanations (Pitt 1988: 7). So, with its stress on the importance of explanation, explainability makes explanation a criterion of adequacy for the satisfactory use of algorithmic decision systems, thereby legitimating their use in a multitude of settings. Thus, explainability and the underlying explanation are linked to the question of justification. I call this the “Explainability Turn.” It is a genuinely interesting question as to the extent to which explainability will be able to mitigate the public anxieties manifested when confronted with opaque automated decision systems. The scale of the challenge represented by the requirement to provide an explanation seems to me to be under-appreciated, and the importance of clearing the theoretical ground for even thinking about this problem cannot be overstated.

So, what then is an explanation? Hempel and Oppenheim (1988) argue that an explanation seeks to “exhibit and to clarify in a more rigorous manner”. One of the examples they give is a mercury thermometer that has been rapidly immersed in hot water, whose temperature reading can be explained using the physical properties of the glass and mercury (Hempel and Oppenheim 1988: 10). An explanation therefore attempts to explain with reference to general laws. However, this causal mode of explanation can become inadequate in fields concerned with purposive behaviour, as with computational systems. In this case it is common for purposive behaviour, such as so-called machine behaviour, to be accounted for in relation to “motivations”, and therefore through teleological rather than causal explanation (Ruben 2016). Thus, the goals sought by the system are required in order to provide an explanation. Teleological approaches to explanation may also make us feel that we really understand a phenomenon because it is accounted for in terms of purposes, with which we are familiar from our own experience of purposive behaviour. One can, therefore, see a great temptation to use teleological explanation in relation to AI systems, particularly by creating a sense of an empathetic understanding of the “personalities of the agents.” In relation to explanation, therefore, explainability can be said to aim to provide an answer to the question “why?”

The concept of explainability, and the related practices of designing and building explainable systems, also have an underlying theory of general explainability and a theory of the human mind. These two theories are rarely explicitly articulated in the literature, and I want to bring them together to show that explainability cannot be a mere technical response to the contemporary problem of automated decision systems, but actually requires philosophical investigation to be properly placed within its historical and conceptual milieu. Additionally, explanation is ambiguous, as it may refer to a product or to a process. It certainly seems to be the case that many discussions of explainability tend to be chiefly interested in the idea of an explanatory product. Thus, an “explanatory product” can be characterised solely in terms of the kind of information it conveys, with no reference to the act of explaining being required. The question therefore becomes: what information has to be conveyed in order to have explained something? This is something I argue that the digital humanities are well-qualified to develop as a research programme. For the user to challenge an explanation, or to appeal to a higher authority if it were considered inadequate, requires expertise that presently is not readily available. I argue that digital humanities should become exemplars in the development of such systems, and by doing so contribute to a public humanities that seeks to provide knowledge and support outside of academia in relation to furthering and deepening knowledge of, and critique of, explainable systems.

Selected Bibliography

Berry, D. M. and Fagerjord, A. (2017) Digital Humanities: Knowledge and Critique in a Digital Age, Polity.

Eubanks, V. (2017) Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor, St Martin's Press.

Hempel, C. G. and Oppenheim, P. (1988) Studies in the Logic of Explanation, in Pitt, J.C. (ed.) Theories of Explanation, Oxford University Press.

Kuang, C. (2017) Can A.I. Be Taught to Explain Itself?, The New York Times, https://www.nytimes.com/2017/11/21/magazine/can-ai-be-taught-to-explain-itself.html

Malabou, C. (2019) Morphing Intelligence: From IQ Measurement to Artificial Brains, Columbia University Press.

Noble, S. U. (2018) Algorithms of Oppression, New York University Press.

Pitt, J.C. (1988) Theories of Explanation, Oxford University Press.

Ruben, D. H. (2016) Explaining Explanation, Routledge.