An introduction to markup
Semantic data module
1. Why mark up text?
A file without markup normally holds nothing but the running text, in plain text format. To take an extremely simple example, imagine a text file that contains just this sentence:
In Wellington I saw a statue of Wellington
If we search this text file for the string Wellington, a find command will easily locate both occurrences. But a plain string search has no way of distinguishing the person from the place.
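To make the limitation concrete, here is a minimal Python sketch of such a find command. The helper name find_all is our own invention for illustration; it is not part of any particular tool.

```python
# The eight-word example from above, held as plain text with no markup.
text = "In Wellington I saw a statue of Wellington"


def find_all(haystack, needle):
    """Return the start position of every occurrence of needle in haystack."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Continue searching just past the match we have already recorded.
        start = haystack.find(needle, start + 1)
    return positions


hits = find_all(text, "Wellington")
print(hits)  # two character offsets, one per occurrence
```

The search returns two positions, but nothing in the result says which hit is the city and which is the duke. That information is simply not in the file.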
Suppose that we want to find all references to the First Duke of Wellington in a text (one that contains, rather than the eight words above, a couple of million words). We don’t want references to places called Wellington, or to the duke’s son, the Second Duke of Wellington, or any other people called Wellington. But we do want to find all of these references to the same man:
Marking up a text semantically is the most reliable way to make sure that a search returns all of the information you want, and only that information. People, places and dates are often the focus of semantic markup, but anything can be marked up – quotations, emotions, economic data or food – anything that is of interest to the researcher. In this course we will show you some ways of doing this.
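As a rough illustration of the idea, the sketch below marks up a short sample in XML and queries it with Python's standard library. Everything specific here is an assumption made for the example: the element names persName and placeName, the ref attribute, the identifiers #duke1 and #wellington_place, and the second sample sentence. The course may use different conventions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: each name is wrapped in an element that says what kind
# of thing it is, and a ref attribute says which person or place is meant.
sample = """<text>
  <s>In <placeName ref="#wellington_place">Wellington</placeName> I saw a
  statue of <persName ref="#duke1">Wellington</persName>.</s>
  <s><persName ref="#duke1">Arthur Wellesley</persName> was also known as
  <persName ref="#duke1">the Iron Duke</persName>.</s>
</text>"""

root = ET.fromstring(sample)


def references_to(tree, tag, ref):
    """Collect the text of every element with this tag and this ref value."""
    found = []
    for element in tree.iter(tag):
        if element.get("ref") == ref:
            # Join the element's text and collapse any line-break whitespace.
            found.append(" ".join("".join(element.itertext()).split()))
    return found


duke = references_to(root, "persName", "#duke1")
places = references_to(root, "placeName", "#wellington_place")
print(duke)
print(places)
```

Because the query matches on the ref value rather than on the spelling of the name, it picks up the same man under three different names and ignores the place that shares one of them.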