The Collections as Data movement has gained significant traction in recent years, with large-scale projects leading the way in shaping and advocating for best practice (Padilla et al. 2019). These projects, along with the OpenGLAM movement, have encouraged cultural heritage organisations to make collections available in machine-readable formats and to support computational research with the collections, enabling libraries to cast new light on collections and present them in new ways for digital humanities audiences.
However, while there have been a number of recent, essential studies around Collections as Data, as well as research into making collections openly available and the reasoning behind this (Pekel 2014, Terras 2015), little has been written to date from an institutional point of view about what is involved in opening up collections in this way.
How can libraries open up collections to wider audiences? How do we turn collections into data? What challenges does this present, relating to rights, access, and data management? What ethical considerations are needed and how can libraries be transparent about decision-making processes as they generate increasing amounts of data, becoming 'producers' of their own collections?
This paper lifts the lid on the process of making data available in a national library context and considers the changes to existing activities, processes and outlook in releasing collections as data.
The National Library of Scotland launched the Data Foundry (https://data.nls.uk/) in September 2019. As part of the Library’s Digital Scholarship Service, the Data Foundry provides access to data collections including digitised collections, metadata, map and spatial data, and organisational data, with further collections such as web archive and audiovisual data planned for future release.
The Data Foundry is based on three core principles: open, transparent and practical (National Library of Scotland 2019 [1]). The platform was designed to be a clear, easy-to-use website, with tiered data downloads; clear rights information; and at-a-glance details contextualising the datasets.
Collections on the Data Foundry are published openly, in reusable formats, and the Library does not assert further copyright over the datasets it produces (National Library of Scotland 2019 [2]). Furthermore, with transparency a key principle, the Data Foundry provides information about data provenance and explains why and how certain items have been digitised and ‘turned into’ data ahead of others (Ames 2019).
Producing the Data Foundry has been a Library-wide effort. Working at the intersection of collections, technology and research, the National Library of Scotland’s Digital Scholarship Service draws upon existing expertise across the Library – including Rights, Developers, Curators, Metadata – as well as working closely with researchers to understand their needs.
This paper will highlight the practical side of opening up library collections for digital humanities use. It explores the everyday challenges and obstacles involved in producing collections as data, such as rights, technical issues and changes to workflows, as well as the broader implications for libraries and their users of making collections available at scale.
References
Ames, Sarah. (2019). ‘Digital scholarship and data provenance at The National Library of Scotland’. Libraries as Research Partner in Digital Humanities. http://doi.org/10.5281/zenodo.3269291
National Library of Scotland [1] (2019). Data Foundry. https://data.nls.uk/
National Library of Scotland [2] (2019). National Library of Scotland Open Data Publication Plan. https://data.nls.uk/download/national-library-of-scotland-open-data-publication-plan.pdf
OpenGLAM. (2019). ‘Ways forward to Open Access for cultural heritage’. OpenGLAM. https://openglam.org/2019/04/30/openglam-principles-ways-forward-to-open-access-for-cultural-heritage/
Padilla, Thomas, Allen, Laurie, Frost, Hannah, Potvin, Sarah, Russey Roke, Elizabeth, & Varner, Stewart. (2019). Final Report --- Always Already Computational: Collections as Data (Version 1). http://doi.org/10.5281/zenodo.3152935
Padilla, Thomas, Scates Ketler, Hannah, Allen, Laurie, & Varner, Stewart. (2019). Collections as Data: Part to Whole. https://collectionsasdata.github.io/part2whole/
Pekel, Joris. (2014). Democratising the Rijksmuseum. Europeana Foundation. https://pro.europeana.eu/files/Europeana_Professional/Publications/Democratising%20the%20Rijksmuseum.pdf
Terras, Melissa. (2015). ‘Opening Access to Collections: the Making and Using of Open Digitised Cultural Content’. Online Information Review. http://discovery.ucl.ac.uk/1469561/1/MelissaTerras_OpeningAccess_OIR.pdf