University of Sussex

Enabling complex analysis of large-scale digital collections: humanities research, high performance computing, and transforming access to British Library digital collections

Version 2 2023-06-12, 08:40
Version 1 2023-06-09, 05:31
journal contribution
posted on 2023-06-12, 08:40 authored by Melissa Terras, James Baker, James Hetherington, David Beavan, Anne Welsh, Helen O'Neill, Will Finley, Oliver Duke-Williams, Adam Farquhar
Although there has been a drive in the cultural heritage sector to provide large-scale, open data sets for researchers, we have not seen a commensurate rise in humanities researchers undertaking complex analysis of these data sets for their own research purposes. This article reports on a pilot project at University College London, working in collaboration with the British Library, to scope out how best high-performance computing facilities can be used to facilitate the needs of researchers in the humanities. Using institutional data-processing frameworks routinely used to support scientific research, we assisted four humanities researchers in analysing 60,000 digitized books, and we present two resulting case studies here. This research allowed us to identify infrastructural and procedural barriers and make recommendations on resource allocation to best support non-computational researchers in undertaking ‘big data’ research. We recommend that research software engineer capacity can be most efficiently deployed in maintaining and supporting data sets, while librarians can provide an essential service in running initial, routine queries for humanities scholars. At present there are too many technical hurdles for most individuals in the humanities to consider analysing at scale these increasingly available open data sets, and by building on existing frameworks of support from research computing and library services, we can best support humanities scholars in developing methods and approaches to take advantage of these research opportunities.

History

Publication status

  • Published

File Version

  • Published version

Journal

Digital Scholarship in the Humanities

ISSN

2055-7671

Publisher

Oxford University Press

Issue

2

Volume

33

Page range

456-466

Department affiliated with

  • History Publications

Research groups affiliated with

  • Sussex Humanities Lab Publications

Full text available

  • Yes

Peer reviewed?

  • Yes

Legacy Posted Date

2017-03-22

First Open Access (FOA) Date

2017-05-12

First Compliant Deposit (FCD) Date

2017-03-21
