Using the many to spot the few

At present, the OpenITI / KITAB corpus comprises 10,243 text files, representing 6,268 unique titles. Such a large and growing number of texts makes quality control challenging. At the same time, it is precisely this large number of texts that can be the basis...

Preserving Pre-Modern Terminologies

Categorising things is a fundamental human and scholarly instinct. Yet it is not without obstacles, for we soon learn that the world is more complex than we originally thought or that we are confronted with something that refuses to conform to...