Organising and Designing Quantitative Data

4. The 'Next Generation': Open Source and Web Tools

Historians have been using proprietary desktop spreadsheet and database tools for quantitative research for many years. These tools are easy to use (at least relatively speaking!) and usually come with extensive documentation and technical support. Spreadsheet software in particular is popular because it is both simple and powerful. For a long time, these have been the best options for the majority of historians working with quantitative data.

However, they are often expensive (and have to be upgraded at regular intervals, at further cost). Data is stored in proprietary formats that limit sharing, are not designed for online publication, and create problems for preservation. Although they have become increasingly sophisticated and usable for complex historical sources, these programs were not created for historical research (or even for academic use).

In recent years, open source content management software, often built as browser-based applications and intended for easy online publication, has evolved and expanded rapidly. (Famous examples include WordPress and the MediaWiki software underpinning Wikipedia, both using MySQL databases.) In addition, techniques and tools like XML markup have facilitated the creation of structured datasets and databases from large text corpora, without the manual process of entering data into a spreadsheet or database.
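
As a rough illustration of the XML point above, the following Python sketch turns a small piece of marked-up text into tabular rows using only the standard library. The sample text, the element names (register, entry, date, person, place) and the attribute names (n, when) are invented for this example and do not come from any particular project or markup standard; a real corpus would follow its own schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical marked-up text. The element and attribute names are
# illustrative assumptions, not a real project's schema.
SAMPLE = """<register>
  <entry n="1">On <date when="1745-03-02">2 March 1745</date> <person>Mary Jones</person> of <place>Wrexham</place> was buried.</entry>
  <entry n="2">On <date when="1745-04-11">11 April 1745</date> <person>John Evans</person> of <place>Chester</place> was buried.</entry>
</register>"""

FIELDS = ["entry", "date", "person", "place"]


def extract_rows(xml_text):
    """Return one dictionary per <entry>, built from its tagged content."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("entry"):
        date = entry.find("date")
        rows.append({
            "entry": entry.get("n"),
            # Use the normalised date attribute rather than the printed text.
            "date": date.get("when") if date is not None else "",
            "person": entry.findtext("person", default=""),
            "place": entry.findtext("place", default=""),
        })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text, ready to open in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE)
print(to_csv(rows))
```

The point of the sketch is that the markup itself carries the structure: once names, dates and places are tagged in the text, the table can be generated (and regenerated) automatically rather than keyed in by hand.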