Here is an English-language summary of my presentation at the E-History workshop.
Will there be a ‘digital revolution’ in the field of history? What will it mean when increasingly large sets of historical data are preserved in digital formats and made accessible through the Internet?
Some historians are convinced that this digital revolution will not change much in the practice of researching and writing history. In their view the new research methods can only do the same as older methods and at best will solve the problem of the scarcity of research time.
Others, however, are convinced that this quantitative leap, eliminating the time problem, inaugurates a new qualitative phase in historical research.
Whatever one’s opinion, it is certain that historians, whether they use quantitative or qualitative methods, are being given new opportunities and tools for searching larger datasets than ever before in the humanities. This also means that they are confronted with new problems and challenges connected to the digitization of data.
To explore these possibilities and problems, WAHSP targets a massive data set: eight million pages from daily newspapers in the Netherlands. Digitization makes it possible, in principle, to analyze these data quantitatively without resorting to a sample or backbone study. Theoretically, all relevant data can be included in such an analysis, without problems of perceptual bias or dependency on indexing done by other researchers. It also makes reproducing the analysis feasible in practice.
There are, however, significant practical problems in developing a method for such an analysis. There are technical problems, for instance connected to the quality of the original data (here the newspapers), of their scans, and of the character recognition applied to them. And there are methodological problems, connected to the impossibility of actually checking all research results by physical means. How can one construct a method that does not make the historian dependent on the automated systems he or she uses, but assists the historian’s imagination and intuition?
There is, of course, software available for the text mining of large datasets that can be tailored to the needs of historians, for instance software used by intelligence services or marketing companies. At the moment this software is only commercially available and not particularly user-friendly, as it requires considerable programming expertise.
WAHSP will develop a user-friendly and interactive open-source application for the text mining of historical data, working from the perspective and problems of historical methodology: for instance, by trying to take into account linguistic changes over time. A historian, Stephen Snelders (firstname.lastname@example.org), and a computer scientist, Daan Odijk (email@example.com), collaborate closely in constructing this application. The application is being constructed and tested starting from the data set and from specific historical questions about the development of public sentiments around drugs in the Netherlands between 1900 and 1945.
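To make the idea of accounting for linguistic change a little more concrete, here is a minimal sketch in Python. It is not the WAHSP application, only an illustration of one simple approach: group spelling variants under a single concept and count how often each concept occurs per year, relative to the total number of words. The variant lists, the function name `relative_frequency_per_year`, and the `(year, text)` input format are all assumptions made for this example.

```python
import re
from collections import Counter

# Illustrative (not vetted) spelling variants grouped per concept.
# In practice a historian would supply and refine these lists.
VARIANTS = {
    "morfine": {"morfine", "morphine"},
    "opium": {"opium"},
}

TOKEN = re.compile(r"\w+")


def relative_frequency_per_year(articles, variants):
    """Count concept mentions per year, per 1000 tokens.

    articles: iterable of (year, text) pairs (a hypothetical input format).
    variants: mapping of concept -> set of lowercase spelling variants.
    Returns {year: {concept: hits per 1000 tokens}}.
    """
    # Map every spelling variant back to its concept for quick lookup.
    lookup = {form: concept
              for concept, forms in variants.items()
              for form in forms}
    hits = {}
    totals = Counter()
    for year, text in articles:
        tokens = TOKEN.findall(text.lower())
        totals[year] += len(tokens)
        year_hits = hits.setdefault(year, Counter())
        for token in tokens:
            concept = lookup.get(token)
            if concept is not None:
                year_hits[concept] += 1
    # Normalise by the number of tokens, so years with more text
    # do not automatically appear to show more interest in a topic.
    return {
        year: {c: 1000 * hits[year][c] / totals[year] for c in variants}
        for year in hits
        if totals[year]
    }


if __name__ == "__main__":
    sample = [
        (1910, "Morphine en opium"),
        (1910, "morfine morfine"),
        (1930, "niets hier"),
    ]
    for year, freqs in sorted(relative_frequency_per_year(sample, VARIANTS).items()):
        print(year, freqs)
```

Counting whole words like this ignores recognition errors in the scans and changes in meaning over time, so it is only a starting point; the historian still has to read and interpret the articles behind the numbers.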
WAHSP is a collaborative project of the Descartes Centre for the History and Philosophy of the Sciences and the Humanities at Utrecht University and the Informatics Institute of the University of Amsterdam. Also involved are the Royal Library of the Netherlands and the Huygens Institute for Dutch History in The Hague. The project leader is Professor Toine Pieters (Descartes Centre; firstname.lastname@example.org).