Case Study: Henry III Fine Rolls
Project Title: Henry III Fine Rolls
Type: Semantic Markup
Introduction and Definition
Project definition: semantic markup is data marked up, however lightly or heavily, in ways which reflect the semantic content of a text, rather than its structure.
The Fine Rolls are documents produced in medieval England detailing payments to the king; these payments were entered onto parchment rolls and kept as a central record. The Fine Rolls thus provide an essential insight into the relationship between the monarch and his subjects, especially royal patronage, as well as fine-grained financial insight into medieval society. The long reign of Henry III (1216-1272) offers a good opportunity to make a long sequence of the rolls available online for the first time. This is particularly valuable because there has previously been no printed edition of the Fine Rolls for this monarch, the only edition being an inadequate collection of excerpts. Until this project, the only recourse for researchers was to study the rolls at The National Archives in Kew, which raised problems of access for researchers far from London as well as questions about the preservation of the rolls.
The project has been funded by the Arts and Humanities Research Council (AHRC) in two phases. The project team consists of staff from King’s College, London (History department and Centre for Computing in the Humanities), Canterbury Christ Church University and The National Archives.
The project has four main outputs:
- High-quality, easily navigable images of every membrane of the rolls. The high quality of the images will allow users to zoom in to high magnifications to examine aspects of the scribes’ hands and other details. The images will also be linked, on a membrane-by-membrane basis, to:
- Full translations of the Latin text of the rolls. These translations are being supplemented by further information that appears in the Originalia Rolls (copies sent to the Exchequer) but not in the Fine Rolls themselves.
- Index and search facilities for the Fine Rolls, between 1216 and 1248 in the first instance. These will list all people and places in the rolls, as well as giving a subject index. This will enable complex searches to be carried out. It is hoped that additional funding will allow the completion of the index for the rest of Henry’s reign. In the meantime the translations will be searchable in the usual but basic way with a web browser’s ‘find in page’ facility.
- Print publication of the translations and indexes for 1216-1242. Again, it is hoped that additional funding for complete indexing will allow the print publication of translations of the rolls for the entire reign.
It is notable that the project outputs do not include a transcription of the Latin text, which would make the Latin lexically searchable. Other, similar projects have been able to include transcriptions as well as translations, for example the Early English Laws project (http://www.earlyenglishlaws.ac.uk/). However, this step clearly adds another round of work, and it is significant that Early English Laws has a different project model, involving a large element of crowd-sourcing and incremental publication as new editions of individual manuscripts are completed.
Additionally, the project has committed to publishing a ‘Fine of the Month’, in which an expert comments on a particular fine, illuminating its features and historical importance. This is part of the project’s attempt to make a wider impact for material which is, in itself, somewhat arcane to the layman.
Use of tool
The encoding of the textual data is in XML (Extensible Markup Language). This is a standard for text-based data; it is endorsed by the W3C and is an open format. It is therefore independent of any platform or licensing. Because XML is so well established within the humanities, and within computing generally, it has two additional advantages:
- a suite of other technologies has grown up around XML, for instance the ability to interrogate and transform XML using the XSL family (for example XSLT, which can transform XML into other formats or restructured XML documents), as well as numerous XML editing tools – for example, the Fine Rolls project used Oxygen XML editor, a widely available and very full-featured editor popular in academic projects.
- the widespread use of XML within digital humanities projects makes it very suitable for linking to other data, as well as to re-use; given the extent to which XML is used worldwide, its sustainability as a format looks assured for some decades.
The text of the translation has been marked up using the Text Encoding Initiative (TEI) guidelines (see http://www.tei-c.org/index.xml). The TEI has extensive markup specifically designed for complex manuscript work such as that involved in this project.
To take the simplest example: each fine is marked in the manuscripts with a paragraph mark (¶) where it commences. A standard way to mark up a unit of this kind within TEI is to use the div (division) element, which comes in an unnumbered form (div) and a numbered form (div1, div2 and so on) and can carry a type attribute and an n (number) attribute. So the first fine could be marked up within TEI in either of these ways:
<div type="fine" n="1">
<div1 type="fine" n="1">
Naturally, other attributes can be added to meet the project’s requirements.
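As an illustration of why this kind of markup is useful, the following is a minimal sketch of how fines marked up with div could be picked out by a program. It assumes Python's standard xml.etree.ElementTree module; the sample fragment, its placeholder text and the list_fines function are invented for this example and are not taken from the project's files. Real TEI documents also declare a namespace, which is omitted here for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the wrapper element and the text are placeholders,
# not real content from the Fine Rolls.
SAMPLE = """
<body>
  <div type="fine" n="1"><p>First fine (placeholder text).</p></div>
  <div type="fine" n="2"><p>Second fine (placeholder text).</p></div>
  <div type="note" n="3"><p>Not a fine.</p></div>
</body>
"""

def list_fines(xml_text):
    """Return (number, text) pairs for every div whose type is 'fine'."""
    root = ET.fromstring(xml_text)
    fines = []
    for div in root.iter("div"):
        if div.get("type") == "fine":
            text = "".join(div.itertext()).strip()
            fines.append((int(div.get("n")), text))
    return sorted(fines)

fines = list_fines(SAMPLE)
for number, text in fines:
    print(number, text)
```

Because the type attribute records what each division is, the program can ignore divisions that are not fines without needing to interpret the text itself.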
The marking up of names can be illustrated as a slightly more complex example of TEI, this time given as a markup example on the project website:
<persName key="mark_phillip" type="roleName">Phillip Marc, sheriff of <placeName key="nottinghamshire">Nottinghamshire</placeName></persName>
The key attribute here is being used to link the descriptive text about Phillip Marc to his entry in the authority file of individuals, as is also being done with the placename authority file for Nottinghamshire. In this way TEI markup in XML is being used to generate the indexes for the project as well as allowing more comprehensive searching by including variants in the markup.
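The way key attributes can drive an index can be sketched in a few lines. This is an illustrative sketch only, assuming Python's standard xml.etree.ElementTree module; it is not the project's actual indexing code. The first fragment reuses the markup example above, while the second fragment, with its variant spelling, and the build_index function are invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# (fine number, marked-up text). Fine 2 is a hypothetical entry added to show
# how a variant spelling is gathered under the same key.
FINES = [
    (1, '<p><persName key="mark_phillip" type="roleName">Phillip Marc, sheriff of '
        '<placeName key="nottinghamshire">Nottinghamshire</placeName></persName></p>'),
    (2, '<p><persName key="mark_phillip">Philip Mark</persName> (invented variant)</p>'),
]

def build_index(fines):
    """Map each key to the fines it occurs in and the forms it takes in the text."""
    index = defaultdict(lambda: {"fines": set(), "forms": set()})
    for fine_no, xml_text in fines:
        for el in ET.fromstring(xml_text).iter():
            key = el.get("key")
            if el.tag in ("persName", "placeName") and key:
                index[key]["fines"].add(fine_no)
                index[key]["forms"].add("".join(el.itertext()))
    return index

index = build_index(FINES)
for key in sorted(index):
    print(key, sorted(index[key]["fines"]), sorted(index[key]["forms"]))
```

Both spellings end up under the single key mark_phillip, which is what allows a search for one form of a name to return entries that use another.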
A final layer of information added to the markup is RDF (Resource Description Framework), a technology used for linked data and the semantic web (for more information on linked data, see our linked data case study on Liparm). The ontology language chosen for this project was OWL (Web Ontology Language) and editors used the popular open-source OWL editor Protégé OWL (http://protege.stanford.edu/). This RDF layer has two advantages:
- It allows this dataset to be linked to others, using the same ontology and, potentially, the same URIs. For example, if Phillip Marc, who appears above, also appeared in another project to do with medieval Nottingham, this second project could use the same URI (Uniform Resource Identifier) as that used by the Fine Rolls project, which would allow Phillip Marc to be returned in searches across both datasets.
- It allows machine reasoning over the data: relationships explicitly declared using the OWL ontology can be the basis for machines to infer further relationships. To take a very simple example, if the ontology defines ‘has wife’ and ‘has husband’ as inverse properties and there is a declaration that Phillip Marc has a wife called, let’s say, Britney Marc, then it need not be declared separately that Britney Marc has a husband called Phillip Marc, because a reasoning engine can deduce this.
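Both advantages can be illustrated with a toy sketch in plain Python. It represents RDF statements as simple (subject, property, object) tuples and is a stand-in for a real RDF store and OWL reasoner, which do far more. The URIs, the property names (hasWife, hasHusband, holdsOffice, residesIn) and the second dataset are all hypothetical and are not drawn from the project's ontology.

```python
# Hypothetical URIs: example.org is a placeholder, not the project's namespace.
MARC = "http://example.org/person/phillip_marc"
WIFE = "http://example.org/person/britney_marc"

# Two datasets that happen to use the same URI for Phillip Marc.
fine_rolls = {(MARC, "hasWife", WIFE), (MARC, "holdsOffice", "sheriff")}
other_project = {(MARC, "residesIn", "Nottingham")}

# Properties declared as inverses of each other (an assumption of this sketch).
INVERSES = {"hasWife": "hasHusband", "hasHusband": "hasWife"}

def infer_inverses(triples, inverses):
    """Add the reverse statement for every property that has a declared inverse."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        if prop in inverses:
            inferred.add((obj, inverses[prop], subject))
    return inferred

# Linking: a shared URI means a plain union already joins the two datasets.
merged = fine_rolls | other_project
# Reasoning: the husband statement is derived rather than entered by hand.
closed = infer_inverses(merged, INVERSES)

about_marc = sorted(prop for subject, prop, obj in closed if subject == MARC)
print(about_marc)
print((WIFE, "hasHusband", MARC) in closed)
```

The shared URI is what makes the union meaningful: had the second dataset used a different identifier for the same man, the two sets of statements would sit side by side without being connected.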
As noted above, despite being a very large undertaking, this project did not have the resources for a transcription of the Latin text. Nevertheless the Henry III Fine Rolls project is something of a gold standard in projects of this type, offering high-quality manuscript images, linked, TEI-encoded translations, and RDF-OWL encoding. There are numerous possible extensions of this methodology in ways which would build on the work that has already been done: other rolls series and documents for the same monarch, or the fine rolls for contiguous monarchs, for example; if these were encoded in the same way, with attention to cross-searching, then this valuable resource would become even more useful.
The Fine Rolls project is the outcome of a large grant from the AHRC and a multi-institution collaboration. It shows the possibilities of a high-end markup project. Nevertheless, leaving aside the unavoidable expense of professional manuscript photography, the rest of the project methodology can be employed on a much smaller scale and with little cost other than staff time. A committed researcher could produce a TEI- and RDF-OWL-encoded transcription of a smaller document, using the same editors (Oxygen and Protégé OWL), with almost no outlay on equipment.