Between 1930 and 1954 Mary Dorothy George wrote catalogue entries for 12,553 ‘Golden Age’ satirical prints. This article examines George as a curatorial voice, an interlocutor between the archived past and her readers. It examines the labour processes that produced George’s contributions to the British Museum’s Catalogue of Political and Personal Satires, her writing as a corpus, and her interpretations therein. We argue that George’s linguistic and procedural choices have troubled the legacy of the catalogue, a system of knowledge organisation increasingly uncoupled from its circumstances of production whilst remaining foundational to the historiography of long eighteenth-century British history.
Computing and the use of digital sources and resources are everyday and essential practices in current academic scholarship. The present article gives a concise overview of approaches and methods within digital historical scholarship, focussing on the question: how have the Digital Humanities evolved, and what has that evolution brought to historical scholarship? We begin by discussing techniques by which data are generated and made machine-searchable, such as OCR/HTR, born-digital archives, computer vision, scholarly editions, and Linked Data. In the second section, we provide examples of how data are made more accessible through quantitative text and network analysis. We close with a section on the need for hermeneutics and data-awareness in digital historical scholarship.
The technologies described in this article have had varying degrees of effect on historical scholarship, usually in indirect ways. For example, technologies such as OCR and search engines may not be directly visible in a historical argument; however, they do shape how historians interact with sources and whether sources can be accessed at all. With this article we aim to take stock of the digital approaches and methods used in historical scholarship, which may serve as starting points for scholars seeking to understand the digital turn in the field and how and when to implement such approaches in their own work.
We seek to demonstrate how corpus linguistic techniques can facilitate a comprehensive account of curatorial voice in a large digitised museum catalogue and hence leverage its value as a resource for generating new knowledge about curatorial practice, the historical and cultural contexts of curation, and the content of collections. We worked with 1.1 million words written by the historian M. Dorothy George between 1930 and 1954 to describe 9330 late-Georgian satirical prints. George’s curatorial descriptions were analysed in terms of their typical informational content and with regard to the extent to which George included interpretation and evaluation in her descriptions. We discuss how results from such analyses can provide a basis for addressing questions about George’s curatorial voice and, more generally, suggest how this approach could benefit museological practice around the production of descriptions and the re-purposing of legacy catalogues for digital access and analysis of collections.
This article presents several inclusion and diversity policies and strategies for digital scholarship and pedagogy, using The Programming Historian as a case study. By actively supporting and working towards gender diversity, as well as multilingualism, cultural inclusivity and open access, The Programming Historian aims to further enhance what it means to be open in the context of access, diversity and inclusion in digital scholarship and pedagogy.
While the advent of the home computer in 1990s Britain has been well documented by historians of computing and technology, there remains little research on the everyday experience of this phenomenon. In this article, we use material held in The Mass Observation Project (MOP) archive to explore the way men and women in late-modern Britain experienced and understood the changes brought about by home computing. The reflexive and yet private nature of responses held in the MOP archive makes it an important window into the cultural and social contexts in which personal computers were encountered. Our research indicates that the advent of the home computer brought about a number of historically specific changes in the way Mass Observers scribed and composed their written communications. The processes through which people turned ideas into text were irreversibly recalibrated by the possibilities of saving, editing, and copying and pasting on screen. As personal computer resources moved from being predominantly accessible at work to being a staple part of the home, the lines between labour and leisure, business and pleasure, and the personal and the professional were blurred. Ultimately, evidence from the Mass Observation Archive indicates that the advent of the home computer had a significant effect on the processes through which individuals composed a sense of self on a day-to-day basis. It introduced new tensions, possibilities and anxieties to the act of negotiating a ‘modern’ identity. Building on this insight, our paper makes interventions in two areas: the history of writing and the history of the home. Placed alongside one another, these findings open up suggestive new questions for the heavily contested historiographical trope of the late-modern ‘self’.
This report outlines the findings of a workshop organised by the Collections Information Team at Wellcome Collection and the Department of History at the University of Sussex, to bring together researchers and archival professionals to explore methods for providing access to born-digital archives.
In the first Debates in the Digital Humanities, Alan Liu argued that digital humanists risk losing a seat at the “table of debate” if they continue to emphasize tools and databases to the exclusion of cultural criticism. We want to remind the field how difficult it is to achieve the digital/critical synthesis. Liu’s challenge is a more wicked problem than we want to admit, particularly in the context of computational work.
Recent work at the Sussex Humanities Lab, a digital humanities research program at the University of Sussex, has sought to address an identified gap in the provision and use of audio feature analysis for spoken word collections. Traditionally, oral history methodologies and practices have placed emphasis on working with transcribed textual surrogates, rather than the digital audio files created during the interview process. This provides pragmatic access to the basic semantic content, but forecloses access to other potentially meaningful aural information; our work addresses the potential for methods to explore this extra-semantic information by working with the audio directly. Audio analysis tools, such as those developed within the established field of Music Information Retrieval (MIR), provide this opportunity. This paper describes the application of audio analysis techniques and methods to spoken word collections. We demonstrate an approach using freely available audio and data analysis tools, which have been explored and evaluated in two workshops. We hope to inspire new forms of content analysis which complement semantic analysis with investigation into the more nuanced properties carried in audio signals.
Much time and energy is now being devoted to developing the skills of researchers in the related areas of data analysis and data management. However, less attention is currently paid to developing the data skills of librarians themselves: these skills are often brought in by recruitment in niche areas rather than considered as a wider development need for the library workforce, and are not widely recognised as important to the professional career development of librarians. We believe that building computational and data science capacity within academic libraries will have direct benefits for both librarians and the users we serve.
Library Carpentry is a global effort to provide training to librarians in technical areas that have traditionally been seen as the preserve of researchers, IT support and systems librarians. Established non-profit volunteer organisations such as Software Carpentry and Data Carpentry offer introductory research software skills training with a focus on the needs and requirements of research scientists. Library Carpentry is a comparable introductory software skills training programme with a focus on the needs and requirements of library and information professionals. This paper describes how the material was developed and delivered, and reports on challenges faced, lessons learned and future plans.
In disciplines such as biomedicine and social sciences, sharing and combining sensitive individual-level data is often prohibited by ethical-legal or governance constraints and other barriers such as the control of intellectual property or the huge sample sizes. DataSHIELD (Data Aggregation Through Anonymous Summary-statistics from Harmonised Individual-levEL Databases) is a distributed approach that allows the analysis of sensitive individual-level data from one study, and the co-analysis of such data from several studies simultaneously without physically pooling them or disclosing any data.
Following initial proof of principle, a stable DataSHIELD platform has now been implemented in a number of epidemiological consortia. This paper reports three new applications of DataSHIELD including application to post-publication sensitive data analysis, text data analysis and privacy protected data visualisation. Expansion of DataSHIELD analytic functionality and application to additional data types demonstrate the broad applications of the software beyond biomedical sciences.
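The core idea described above, co-analysis via aggregate statistics rather than pooled records, can be sketched in a few lines. This is an illustrative toy, not the DataSHIELD software itself, and the per-study values below are invented for the example:

```python
# Toy sketch of the summary-statistics principle behind DataSHIELD:
# each "study" discloses only aggregates (n, sum, sum of squares),
# and the analyst pools those aggregates to compute a global mean and
# variance without ever seeing individual-level records.

def study_summary(values):
    """What a study site returns: aggregates only, no raw data."""
    n = len(values)
    s = sum(values)
    ss = sum(v * v for v in values)
    return n, s, ss

def pooled_mean_variance(summaries):
    """Co-analysis across studies from their summary statistics alone."""
    n = sum(t[0] for t in summaries)
    s = sum(t[1] for t in summaries)
    ss = sum(t[2] for t in summaries)
    mean = s / n
    # Sample variance recovered from sums and sums of squares.
    variance = (ss - n * mean * mean) / (n - 1)
    return mean, variance

# Three hypothetical studies; only their summaries cross the network.
study_a = [1.0, 2.0, 3.0]
study_b = [2.0, 4.0]
study_c = [3.0, 3.0, 6.0]
summaries = [study_summary(v) for v in (study_a, study_b, study_c)]
mean, var = pooled_mean_variance(summaries)
```

The pooled mean and variance here are identical to what direct analysis of the combined records would give, which is what allows the raw data to stay behind each study's firewall.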
Although there has been a drive in the cultural heritage sector to provide large-scale, open data sets for researchers, we have not seen a commensurate rise in humanities researchers undertaking complex analysis of these data sets for their own research purposes. This article reports on a pilot project at University College London, working in collaboration with the British Library, to scope out how high-performance computing facilities can best be used to serve the needs of researchers in the humanities. Using institutional data-processing frameworks routinely used to support scientific research, we assisted four humanities researchers in analysing 60,000 digitized books, and we present two resulting case studies here. This research allowed us to identify infrastructural and procedural barriers and make recommendations on resource allocation to best support non-computational researchers in undertaking ‘big data’ research. We recommend that research software engineer capacity can be most efficiently deployed in maintaining and supporting data sets, while librarians can provide an essential service in running initial, routine queries for humanities scholars. At present there are too many technical hurdles for most individuals in the humanities to consider analysing these increasingly available open data sets at scale; by building on existing frameworks of support from research computing and library services, we can best support humanities scholars in developing methods and approaches to take advantage of these research opportunities.
This essay describes and reflects on the integration of computational research skills into core (that is, compulsory) components of the BA History degree programme at the University of Sussex. Work on this began in Spring/Summer 2015 and was delivered as part of two Year 1 modules that ran in the 2015/16 academic year: The Early Modern World (Autumn term) and The Making of the Modern World (Spring Term). The work was undertaken by Tim Hitchcock, Sharon Webb, and James Baker, academic staff in the Department of History with expertise in computational research through their work in Digital History and Digital Humanities.
Librarians play a crucial role in cultivating world-class research, and in most disciplinary areas today world-class research relies on the use of software. This paper describes Library Carpentry, an introductory software skills training programme with a focus on the needs and requirements of library and information professionals. Using Library Carpentry as a case study of the development and delivery of software-skills-focused professional development, this paper describes the institutional and intellectual contexts in which Library Carpentry was conceived, the syllabus used for the initial exploratory programme, the administrative apparatus through which the programme was delivered, and the analysis of data collection exercises conducted during the programme. As many university librarians already have substantial expertise working with data, it argues that adding software skills (that is, coding and data manipulation that goes beyond the use of familiar office suites) to their armoury is an effective and important use of professional development resource.
GeoCities was a web hosting service that launched in 1995. It appealed to people because at a time when the World Wide Web was in its infancy, it offered them the ability to create websites about themselves, their interests, and their lives. This chapter examines the character and form of a selection of diaries hosted on GeoCities between 1995 and 2001. In these diaries GeoCities users tested the boundaries between public and private on the early web.
The chapter proceeds in four parts. The first part offers some contextual background and the second a discussion of method. Third, I look at web diaries whose creators experimented with self-projection by combining aspects of the diary form with the nascent web technologies that GeoCities offered. Fourth and finally, I examine those GeoCities diaries that--in a variety of ways--replicated private, paper-based diaries. This blend of experimental and conservative, and of public and private diary writing, provides a valuable window into the ways in which identities were negotiated and performed circa 1995-2001.
This book explores English single sheet satirical prints published from 1780-1820, the people who made those prints, and the businesses that sold them. It examines how these objects were made, how they were sold, and how both the complexity of the production process and the necessity to sell shaped and constrained the satiric content these objects contained. It argues that production, sale, and environment are crucial to understanding late-Georgian satirical prints. A majority of these prints were, after all, published in London and were therefore woven into the commercial culture of the Great Wen. Because of this city and its culture, the activities of the many individuals involved in transforming a single satirical design into a saleable and commercially viable object were underpinned by a nexus of making, selling, and consumption. Neglecting any one part of this nexus does a disservice both to the late-Georgian satirical print, these most beloved objects of British art, and to the story of their late-Georgian apotheosis – a story that James Baker develops not through the designs these objects contained, but rather through the making of those objects and of the designs they contained.
This article explores perceptions of the law and of how agents of the law responded to events at Covent Garden Theatre during the bitter months between mid-October and late-November 1809, the height of the Covent Garden Old Price riots. It does so through the lens of the periodical press, a vital and voluminous source not only of what happened during the riots but also of opinions on and perceptions of what happened; those opinions and perceptions are the primary concern of this article. The article begins with a discussion of how the magistrates, 'police officers', justices, and lawyers who together constituted the guardians of the legal system were seen, where they were seen, and what they did. It moves on to examine how the actions of those guardians and the legal system they represented were reported upon. And it concludes with a discussion of how theatregoers and Londoners were seen to have responded to those actions, moving a significant element of the conflict outside of Covent Garden Theatre and into the public press in a direct response to how they were policed as threats to public order and security. It argues that the Covent Garden Old Price riots were a significant urban act of multi-class protest because of the ways that they intersected with wider late-Georgian concerns, with discursive arenas where British liberty and the freedom of her subjects were contested and at stake.
The British Library Big Data Experiment is an ongoing collaboration between British Library Digital Research and UCL Department of Computer Science (UCLCS), facilitated by UCL Centre for Digital Humanities (UCLDH), engaging computer science students with humanities research and digital libraries as part of their core assessed work.
Governments, research organisations, cultural institutions, and commercial entities have invested substantial funds creating digital assets to enable new research in the arts and humanities. These assets have grown to include millions of items and petabytes of material covering all forms of content – manuscripts, monographs, maps, images, sound, and more. Unfortunately, scholars have been unable to fully exploit these digital assets. The supporting infrastructures are restrictive. The assets are distributed unevenly across organisations and systems. Access restrictions unpredictably limit where, how, and by whom items can be used.
This poster will outline a pathway to remedy this unacceptable state of affairs. It will explore the need for a simple-to-use infrastructure for digital scholarship. We argue that such an interoperable infrastructure, built primarily from off-the-shelf technologies and services, should, as far as possible, work like something the user already knows: it should allow the researcher to bring their own content, tools and creativity to a familiar environment. Where we envisage it differing from a local PC setup is by hosting digital content that is otherwise difficult to obtain and too big to download, offering the computational capacity required to quickly analyse big data using automated processes, and providing network services capable of robustly supporting digitally-driven research.
A key context of this proposed poster is research infrastructure developments around cloud, virtual and remote workflows. Notable among these are ongoing cyber-infrastructure work at the HathiTrust Research Centre1 and the deployed cloud research infrastructure used by the European Bioinformatics Institute.2 Whilst these observations and experiences point to a potentially crucial role for infrastructure in humanities research, we remain mindful of the robust critiques of recent digital humanities infrastructure projects.3 These critiques have highlighted how infrastructure development should not make strong assumptions about how researchers work, what tools they need, the sorts of problems that they will strive to solve, or even the specialised standards that they will employ. Our proposed pathway avoids these known problems by suggesting that researchers must be enabled to bring their own tools, work in whatever way they want, use any workflow, and address any sort of problem. We envisage this being achieved by infrastructure development that works with many digital content providers, supports a wide range of content types, and is embedded within arts and humanities research that uses a variety of data-driven methodologies. It would support growth in big data research in the arts and humanities using researcher appropriate standards and guidelines.
The informal, conversational setting of a poster session will prove a valuable opportunity to visit the key questions and problems around digital research infrastructure. These include:
What are the benefits of scholars being able to use off-the-shelf technologies to work with big data across major content holders?
How can these infrastructures enable transformative research?
Do hybrid cloud infrastructures provide a sustainable approach to service provision?
Such infrastructure could establish the foundation for scholarly work with large-scale content collections for years to come, in turn enabling transformative research that uncovers the value hidden in these digital assets and allows society to benefit from its investment. Such transformation requires leading-edge researchers, and eventually the majority of researchers, to adopt, learn and use new methods and techniques; not just to answer old questions in new ways but to arrive at new answers and to start asking entirely new questions as a consequence. This proposed infrastructure pathway aims to explore the next steps towards making this transformation a reality.
This poster builds on experience providing researchers with digital content. Scholars increasingly demand scalable access to large quantities of digital content – big data – that they can analyse using their own software and tools. Early on, the amounts of digital data were small; it was possible to provide copies or enable network downloads. With the growing volumes of big data, this is no longer plausible. Instead of moving hundreds of terabytes of data to researchers, we must allow researchers to bring their tools to the data. This is consistent with changes in the broader IT landscape. We have established five principles to guide our pathway:
Keep it simple. Any new infrastructure should be simple to use and understand.
Lower the bar. Any new infrastructure should not expose or require users to understand new or complex technologies or processes. It should, as much as possible, work like something they already do.
Bring your own tools. Users should be able to employ the tools that they already understand and work with. For example, if a researcher uses Mathematica for image analysis in her office, she should be able to use it on large collections of digital assets distributed across multiple content organisations.
Be creative. Users should be able to use data in creative, novel, unexpected ways. Many systems and infrastructures limit what users can do.
Start small and grow big. Users should be able to try things out; explore, experiment and debug; and then deploy on large content sets.
References
1. Beth Plale, Opportunities and Challenges of Text Mining HathiTrust Digital Library, Koninklijke Bibliotheek, 15 November 2013 www.hathitrust.org/documents/kb-plalehtrc-nov2013.pdf
2. Creating a Global Alliance to Enable Responsible Sharing of Genomic and Clinical Data, 3 June 2013 www.ebi.ac.uk/sites/ebi.ac.uk/files/shared/images/News/Global_Alliance_White_Paper_3_June_2013.pdf
3. Quinn Dombrowski, What ever happened to Project Bamboo?, DH2013
This lesson will look at how research data, when organised in a clear and predictable manner, can be counted and mined using the Unix shell. The lesson builds on the lessons “Preserving Your Research Data: Documenting and Structuring Data” and “Introduction to the Bash Command Line”. Depending on your confidence with the Unix shell, it can also be used as a standalone lesson or refresher.
This lesson uses a Unix shell, which is a command-line interpreter that provides a user interface for the Unix operating system and for Unix-like systems. This lesson will cover a small number of basic commands. By the end of this tutorial you will be able to navigate through your file system and find files, open them, perform basic data manipulation tasks such as combining and copying files, as well as both reading them and making relatively simple edits. These commands constitute the building blocks upon which more complex commands can be constructed to fit your research data or project. Readers wanting a reference guide that goes beyond this lesson are recommended to read Deborah S. Ray and Eric J. Ray, Unix and Linux: Visual Quickstart Guide, 4th edition (2009).
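The kind of counting and mining the lesson works towards can be sketched as follows. The directory and file names here are invented for illustration, not taken from the lesson's own data set:

```shell
# Make a working directory and move into it.
mkdir -p research-data && cd research-data

# Create two small plain-text files to practise on.
printf 'tolpuddle martyrs\nswing riots\n' > protests1830s.txt
printf 'chartism\nrebecca riots\n' > protests1840s.txt

# Navigate and inspect: list the files, then print one to the screen.
ls
cat protests1830s.txt

# Count: wc -l reports the number of lines in each file.
wc -l *.txt

# Mine: grep prints every line containing the string 'riots'.
grep 'riots' *.txt

# Combine and copy: concatenate both files into one, then duplicate it.
cat protests1830s.txt protests1840s.txt > protests.txt
cp protests.txt protests-backup.txt
```

Because research data organised in a clear and predictable manner tends to live in many such small files, these few commands already compose into useful pipelines, for example `grep 'riots' *.txt | wc -l` to count matching lines across a whole directory.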