All of our texts are tagged according to a uniform, mARkdown-based annotation scheme, which ensures that the files can be easily identified and accessed using various scripts. This allows us to apply digital methods at the corpus level or select a particular part of the corpus for analysis (potentially on the basis of metadata).
Annotation also allows files to be analysed or compared at the structural level, that is, at the level of specific chapters, sections or paragraphs. The structure of the corpus might look unfamiliar to those used to accessing texts through libraries, but it is essential for performing digital tasks at scale.
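As a rough illustration of structural access, the sketch below splits a tagged text into sections using a heading pattern. The pattern is an assumption made for this example (a line starting with `### ` followed by one or more `|` characters, where the number of bars gives the heading level), and the function name `split_sections` and the sample text are hypothetical; check the mARkdown documentation for the actual tags before relying on it.

```python
import re

# Assumption: headings look like "### | Title" or "### || Subtitle",
# with the number of "|" characters giving the heading level.
HEADING = re.compile(r"^### (\|+) ?(.*)$")


def split_sections(text):
    """Return a list of (level, title, body) tuples, one per heading found."""
    sections = []
    current = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            # Start a new section: [level, title, body lines]
            current = [len(match.group(1)), match.group(2).strip(), []]
            sections.append(current)
        elif current is not None:
            # Lines before the first heading are ignored in this sketch.
            current[2].append(line)
    return [(level, title, "\n".join(body)) for level, title, body in sections]


# Hypothetical sample text, not taken from the corpus.
sample = """### | Chapter one
# first paragraph
### || Section one
# second paragraph
"""

sections = split_sections(sample)
for level, title, body in sections:
    print(level, title, len(body.splitlines()))
```

Because each section is returned with its level and title, the same function could be run over many files to compare, for example, all first-level chapters across a selection of texts.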
For a detailed explanation of mARkdown and the annotation process, see the mARkdown documentation.