Introduction to Digital Humanities
Dr Christopher Ohge (Institute of English Studies, School of Advanced Study, University of London)
Marty Steer (Digital Humanities, School of Advanced Study, University of London)
Jonathan Blaney (Institute of Historical Research, School of Advanced Study, University of London)
Kristen Schuster (Digital Humanities, King's College London)
Naomi Wells (Institute of Modern Languages Research, School of Advanced Study, University of London)
This module offers a broad historical overview of digital humanities (DH) and a selection of its methodologies. No introduction to DH could cover everything, so this module aims to provide a curated set of directions one could take. It does, however, have two underlying emphases: software architectures and fundamentals, and the ways that digital tools enhance or reshape literary and cultural studies, literary criticism, and the study of material objects in virtual spaces. As DH is a practical enterprise by nature, you will be expected to engage in hands-on projects that use digital tools to illuminate your current research or creative interests.
We will also investigate several technologies relevant to digital scholarship, including fundamental skills such as the command line, pattern recognition with regular expressions, and the version-control system Git; markup languages such as Markdown and the eXtensible Markup Language (XML), along with its associated guidelines, the Text Encoding Initiative (TEI), for encoding scholarly texts; and publishing and visualisation platforms such as Hugo, Omeka, Voyant Tools, and many others. Each week will consist of two sessions: the reading/lecture portion (pre-recorded or asynchronous materials) and the demo/discussion portion (synchronous practical exercises and discussions over Zoom).
- Software Carpentry
- Data modelling (formalising scholarly texts and research questions)
- Data representation and visualisation
- Software development and project life-cycles
- Software architectures and dependency libraries (maximal v. minimal computing)
- Forensics and archives
- Text mining, machine learning, and artificial intelligence
- Critical thinking about histories, infrastructures, and diversity in technology
Susan Schreibman, Ray Siemens, and John Unsworth (eds.), A Companion to Digital Humanities. Oxford: Blackwell, 2004.
Matthew K. Gold and Lauren F. Klein (eds.), Debates in the Digital Humanities 2019, University of Minnesota Press, 2019. Open access version: https://dhdebates.gc.cuny.edu/projects/debates-in-the-digital-humanities-2019
Safiya Umoja Noble, Algorithms of Oppression: How Search Engines Reinforce Racism, New York University Press, 2018. [PDFs of selections are linked in Session 10]
The above texts will be supplemented by online and PDF readings.
By the end of the module, students will be able to:
- articulate the main historical developments of DH
- distinguish between the uses of DH in different humanities and social science disciplines
- demonstrate basic software carpentry skills (command-line interface, grep, and basic markup)
- understand the role of data models
- demonstrate basic competence in encoding scholarly documents using TEI XML
Schedule with Linked Readings
Introduction: Philosophy and History
Inventions of Technology
This lecture covers the philosophical questions arising from human technological development.
History of Humanities Computing
Computing actually has a long tradition, one that far predates the computing machines of the 20th century. Review the timeline below to get a sense of major developments.
Software Carpentry: Command line and Version control
What is Software Carpentry?
This week we would like you to spend a few hours teaching yourselves the Unix Shell and Version Control with Git.
The Software Carpentry courses come from the Library Carpentry programme; they are open source and available online at the links below. On Friday we will review how you got on, discuss where you got stuck (and how you learnt to get unstuck!), and then discuss how these data skills will be used in the later weeks of the course.
The optional parts are more advanced. They will give you a good idea of the purpose and power of learning to use the Shell. Please attempt them if you can (or read them and bring your questions to discuss with the class on Friday).
Library Carpentry: The UNIX Shell
Introduction to Git
We will review these on Friday.
You can host your own static HTML website using Git and GitHub Pages. This will come up again during this course.
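A minimal sketch of that workflow, assuming Git is installed (the repository name, page content, and user identity below are placeholders, and the GitHub steps are commented out because they require a repository on your own account):

```shell
# Create a local repository containing a single static page.
mkdir my-site
git init my-site
echo '<h1>Hello, Digital Humanities</h1>' > my-site/index.html
git -C my-site add index.html
git -C my-site -c user.name='Student' -c user.email='student@example.com' \
    commit -m 'Add landing page'

# To publish via GitHub Pages (requires a repository on your own account):
# git -C my-site remote add origin https://github.com/<username>/my-site.git
# git -C my-site push -u origin main   # or 'master', depending on your Git version
```

Once pushed, enabling GitHub Pages in the repository settings serves `index.html` as a live website.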
Seminar in Command Line and Git
Click on the link below to review the live Zoom seminar for this session. The passcode is
Class notes of the CLI commands demonstrated in the live session.
NB: This is a markdown file (open it with a text editor)
Perspectives: History and Literature, Quantitative and Qualitative
Quantitative and Qualitative Digital Research
See the lecture page below for a description of two crucial aspects of digital research methods.
Literature, History, and DH: An Overview
Exercise 1: Bookworm
1. Go to the HathiTrust Bookworm tool: https://bookworm.htrc.illinois.edu/develop/
2. Search for some word trends that you are interested in.
3. Now go to the HathiTrust topic explorer at https://jgoodwin.net/htb/#/model/grid
4. Use the GUI to investigate the topics:
- Why does the GUI help?
- Refer to the source texts to learn more.
- Are the topics accurate in context?
Post your findings on our class forum or be prepared to discuss at the next seminar session.
Exercise 2: from Git to Voyant and back
1. Go to our GitHub repo at https://github.com/cmohge1/riga-intro-to-dh.
2. Use the command line to clone the repo.
3. Again use the command line to navigate to the directory with the text files, then run a command to combine all the text files into a single txt file.
4. Upload the combined txt into Voyant Tools (https://voyant-tools.org/).
5. Now use your own txt file, upload it to Voyant, and compare the results to the previous step.
6. Add, commit, and push your personal txt file(s) from (5) into our GitHub repo.
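One way steps 2, 3, and 6 might look on the command line. The `texts` directory name is an assumption about the repository layout, the network steps are commented out, and the sample files below are stand-ins so the sketch runs anywhere:

```shell
# Step 2 (in practice): clone the class repository.
# git clone https://github.com/cmohge1/riga-intro-to-dh.git
# cd riga-intro-to-dh

# Stand-in files so this sketch is self-contained:
mkdir -p texts
printf 'First sample text.\n'  > texts/a.txt
printf 'Second sample text.\n' > texts/b.txt

# Step 3: concatenate every plain-text file into a single file for Voyant.
cat texts/*.txt > combined.txt
wc -l combined.txt   # quick sanity check before uploading

# Step 6 (in practice): stage, commit, and push your own file.
# git add my-text.txt
# git commit -m 'Add my text for the Voyant comparison'
# git push origin main
```

The shell glob `texts/*.txt` expands to every matching file, so `cat` handles any number of texts without naming them individually.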
New Media and Materialities
What is 'New Media Studies'?
The lecture video below surveys the foundational work of Marshall McLuhan and his predecessors in the field.
Materiality and Mechanisms
The invention of the computer (by Babbage and Lovelace) and the invention of photographic media (by Daguerre) occurred around the same time in the early nineteenth century. Both technologies were built on previous models and infrastructures for calculating and storing information. Even the Universal Turing Machine presented a model that looked like a film projector.
An essential aspect of DH and media studies is understanding infrastructures and the underlying material makeup of digital media. It turns out that these infrastructures are built on established technologies that were adapted for particular purposes, and that the digital tools we use are mediated by various layers of abstraction and translation.
Manovich (1999) lays out four principles for new media.
1. Discrete representation (or 'fractal' structure) of digital media at different scales: data structures remain the same, and they have qualities independent of any programme or presentation.
2. Numerical representation of computable objects: media is formally represented (as numbers) and can therefore be subject to algorithms (i.e. computations). Digital code is subjected to computation, in other words.
3. Automation: based on (2), any media can be reproduced and modified automatically without human intervention (templates, gamification, access engines, and so on).
4. Variability: every kind of media is fluid -- it is adaptable and always subject to change.
The logic of new media corresponds to a post-industrial logic of "production on demand" and "just in time" delivery. A minimum viable product is the primary aim, and subsequent versions are then released as the data is improved. Such processes are possible by virtue of digital computers and networks in industrial systems.
Data Modelling: Markup, Metadata, Annotation
What is a data model? Modelling constitutes an injunction to think logically and structurally. The data model is the abstraction of your research; it is a visual representation of your information system, and how the bits of information are connected. Data modelling starts with markup, as markup is the way you characterise texts and metadata for your information system. The lecture below surveys the concept of markup as well as its current formats in digital media.
Markdown: A Lightweight Markup Language
Featuring Dr Kristen Schuster (King's College London)
Download the file, select 'Slide Show', then 'Play from the Start'. Clicking on the volume icon or the right arrow button will trigger the lecture recording for each slide.
Annotation with Recogito
Exercise 1: Markdown transcription
- Download the page of Alexander Pope's Dunciad here.
- Take note of some of the challenges of modelling this text, what you would like to represent, and how.
- Encode the page in Markdown in your preferred text editor. (Atom is a good option, but you could also use an online editor at https://dillinger.io/.) Hint: start with transcription and basic text structure first.
- Preview the encoded page in the text editor.
- Open the html rendering of the file in your text editor.
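As a starting point, a first-pass transcription might look something like this (the file name and the transcribed details are illustrative; work from the downloaded page itself):

```shell
# Write a first-pass Markdown transcription with a heredoc.
# A trailing backslash marks a hard line break in CommonMark,
# which keeps verse lines separate instead of merging them
# into a single paragraph.
cat > dunciad.md <<'EOF'
# The Dunciad

*By Alexander Pope*

Books and the Man I sing, the first who brings\
The Smithfield Muses to the Ear of Kings.
EOF
```

Headings (`#`, `##`), emphasis (`*...*`), and hard line breaks cover most of the basic text structure; finer features of the page (notes, catchwords, ornaments) are exactly the modelling challenges worth recording in your notes.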
Exercise 2: Annotation with Recogito
Go to Recogito: https://recogito.pelagios.org/
- Please annotate as many places and placenames as you can, and try to georesolve them with reference to an appropriate gazetteer.
- Think about some other feature you would like to annotate, free-form, using some combination of tags, keywords, or URIs in the Recogito pop-up. What information are you adding? What are you losing?
- Look at the internal map visualisation. Do you see any problems? Anything you can fix?
- Export your annotations as CSV and open the file in Excel or another spreadsheet program. What information is in this file? Is there anything you cannot identify? What is not in there?
"We use “minimal computing” to refer to computing done under some set of significant constraints of hardware, software, education, network capacity, power, or other factors." - Minimal Computing Working Group
Optionally, read through the other thought pieces from the Minimal Computing Working Group and consider how you have used (or not used) these principles in your own DH research projects.
The Summer of Puppets
Below is a four-part example project from 2017 which explains the types of questions, choices, methods, tools, data transformations, and minimal computing approaches used to make a legacy DH project, which was due to be decommissioned, more sustainable.
Next week we will circle back to software architectures and how they are used in digital web infrastructures (content management systems and databases) and digital resource publishing (workflows and processes). This week, though, we would like you to write a short blog post about your own digital humanities research project, describing how you made two or three 'minimal computing' decisions (or rather, describing the decisions you did not make but probably should have!).
The Summer of Japanese Puppets case study should give you some nice ideas, and so will the Minimal Definitions reading, but make sure you write about your own project. If you don't have your own project, write about a Digital Humanities project which you have enjoyed using - i.e. interrogate someone else's project - did they make any obvious minimal computing decisions?
Exercise 1: Write a blog post and publish it
Hopefully you've written your blog post in Markdown. If you didn't, write your blog post in Markdown. :-) Then, using the command line and version control with Git, commit and push your Markdown file to your own GitHub repository.
For those who are feeling more adventurous, instead of simply publishing your Markdown file, build it into your own personal website. You could convert the Markdown to HTML and push that to the repository, or you could learn about and install Hugo (a static site generator).
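A sketch of both routes, with placeholder file and site names. The publishing commands are commented out because they require your own GitHub repository, and the Hugo route additionally requires Hugo to be installed:

```shell
# Draft the post in Markdown.
mkdir -p posts
cat > posts/minimal-computing.md <<'EOF'
# Minimal Computing in My Project

Two decisions I made, and one I probably should have.
EOF

# Basic route: version the file and push it.
# git add posts/minimal-computing.md
# git commit -m 'Add minimal computing post'
# git push origin main

# Adventurous route: scaffold a Hugo site and build it to static HTML.
# hugo new site mysite
# mv posts/minimal-computing.md mysite/content/posts/
# hugo -s mysite   # writes the rendered site to mysite/public/
#                  # (a theme must be configured before the build produces pages)
```

Either way the source of truth stays a plain Markdown file under version control, which is itself a minimal computing decision worth mentioning in the post.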
Seminar on Minimal Computing
Software Architecture workshop
What is Software Architecture?
This short video is just to explain the three key aspects of what "software architecture" is:
- a good shared understanding of the system,
- decisions which are hard to change, and
- "the important stuff. Whatever that is."
Martin Fowler's website contains a large collection of articles on software engineering and industry practice. Explore it according to your own degree of interest: https://martinfowler.com/architecture/
An example of a database and bibliographic metadata publishing and preservation system. Pay attention to the different software systems and how the data is integrated between them.
We will go through these examples briefly at the start of the class on Friday, and the remaining time will be dedicated to helping you each get comfortable pushing your own website to github.
Build your website
Following on from the demo last week, this week we would like you to build your own website. What you write about is up to you, but you must write about your own Digital Humanities topics, and connect your writing to the ideas you have learnt during this course.
For example, maybe you will write a short methodological article about your digital project, with images of your ancient manuscripts, snippets of the command line commands you used to manipulate CSV data, and any visualisations of your data and your interpretation of those visualisations.
Or, perhaps you have already written notes on the readings from weeks 1, 3, 4 or 5 and you would like to publish this text as a website in a series of 500-1000 word blog posts.
Or perhaps you want to put your CV online.
The content you publish is up to you.
The website will be your main "digital assignment" for this course.
Hugo basics
The selected documentation pages from the Hugo docs will give you the basics you need to manage your website content.
Seminar on Software Architecture
XML and TEI for Scholarly Texts
Introducing XML and TEI
Below are two lectures covering the basics of XML and the TEI Guidelines, along with some examples of encoding projects and text structures.
Read through this slideshow to (re)familiarise yourself with basic TEI XML structure.
Use this demo to see how you can quickly transform your Markdown file from Week 5 into basic TEI.
TEI Markup Exercise
1. Use your XML file that you just converted from Markdown, and open it in your preferred text editor. (If you would like to access the Pope Markdown file, you can download it here.)
2. Complete the TEI structure
- fill in the required header metadata
- include appropriate poetry elements
- add attributes to the poetry elements (to be more precise about the parts of the poem)
- make sure the poem's notes are appropriately encoded
3. Add some additional semantic information (rhyme, people, organisations?).
TEI XML workshop
This will be an open workshop session. By Friday you will be expected to have
- Created a Hugo site for your personal research project;
- Created some Markdown files documenting your research project;
- Started to tag some documents relating to your research in TEI XML.
On Friday we will review your progress, troubleshoot any technical issues you may be having, and discuss possibilities for next steps.
In the meantime, you might also use this time to review some of the material from previous weeks.
Rethinking Data: Diversity, Postcolonialism, and Capitalism
Guest Lecture: Critical DH
Featuring Dr Naomi Wells (Institute of Modern Languages Research, University of London)
Johanna Drucker, “Humanities Approaches to Graphical Display,” Digital Humanities Quarterly 5.1 (2011). http://digitalhumanities.org/dhq/vol/5/1/000091/000091.html
Safiya Noble, Algorithms of Oppression. New York University Press, 2018. Download selections here.
Megan Ward with Adrian S. Wisnicki, ‘The Archive after Theory’ in Debates in Digital Humanities (2019) https://dhdebates.gc.cuny.edu/read/untitled-f2acf72c-a469-49d8-be35-67f9ac1e3a60/section/a8eccb81-e950-4760-ba93-38e0b1f2b9d0#ch18
(Optional) Roopika Risam, New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis, and Pedagogy. Evanston: Northwestern University Press, 2018.
Seminar on Rethinking Data