Alex B, Grover C, Oberlander J, Thompson T, Anderson M, Loxley J, Hinrichs U & Zhou K (2017) Palimpsest: Improving Assisted Curation of Loco-specific Literature. Digital Scholarship in the Humanities, 32 (Supplement 1), pp. 4-16. https://doi.org/10.1093/llc/fqw050
Text mining and information visualisation techniques applied to large-scale historical and literary document collections have enabled new types of humanities research. The assumption behind such efforts is often that trends will emerge from the analysis despite errors for individual data points and that noise will be dominated by the signal in the data. However, for some text analysis tasks, the technology is unable to perform as well as domain experts, perhaps because it does not have sufficient world knowledge or metadata available. Yet, the advantage of language processing technology is that it can process at scale, even if not perfectly accurately. Geo-locating literary works is one example where human expert knowledge is invaluable when it comes to distinguishing between candidate works. This was the underlying assumption in Palimpsest, an interdisciplinary digital humanities research project on mining literary Edinburgh. From the outset, the project adopted an assisted curation process whereby the automatic processing of large data collections was combined with manual checking to identify literary works set in Edinburgh. In this article, we introduce the assisted curation process and evaluate how the feedback from literary scholars helped to improve the technology, thereby highlighting the importance of placing humanities research at the core of digital humanities projects.
Digital Scholarship in the Humanities: Volume 32, Issue Supplement 1