The KITAB team has released a new version (2021.1.4) of the OpenITI corpus on Zenodo. The release is open access and freely available. It is our fourth release and the first of 2021. You can also access the release through our GitHub repository.

The release features 10,202 books, counting all versions and editions (2,046,728,631 words), of which 6,236 are unique books written by 2,582 authors. In the release, 2,506 new ids have been added. The new ids cover both new books, which are assigned a new unique id, and changes to existing ids and their corresponding URIs (which result in new ids). In terms of structural annotation, 180 books have a new status, of which 73 have a .mARkdown extension, which means they are fully available in structural OpenITI mARkdown. This brings the total number of books in mARkdown to 247. The mARkdown files have been reviewed and vetted by the annotation team.
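
For readers who want to check the mARkdown count against their own copy of the release, the following is a minimal sketch rather than an official KITAB tool. It assumes only that fully annotated texts carry the .mARkdown extension in their file names, as described above; the function name and the example folder path are hypothetical.

```python
from pathlib import Path


def count_files_with_extension(root, extension=".mARkdown"):
    """Count files under `root` (searched recursively) whose name ends with `extension`.

    The match is case-sensitive, so '.markdown' is not counted as '.mARkdown'.
    """
    return sum(
        1
        for path in Path(root).rglob("*")
        if path.is_file() and path.name.endswith(extension)
    )


if __name__ == "__main__":
    # Hypothetical path to a locally unpacked copy of the release.
    release_dir = "path/to/unpacked/release"
    print(count_files_with_extension(release_dir))
```

If the local copy matches the release, the result should be comparable to the total of 247 reported above, although the exact figure depends on how the copy was obtained and unpacked.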

The major changes to the URIs as well as the statistics on the corpus are available in the release note at the publication link and the GitHub repository. We have also added a few new fields to the metadata.

To cite this version, please use the following reference. A bibliographic export is also available on the publication page:


Lorenz Nigst, Maxim Romanov, Sarah Bowen Savant, Masoumeh Seydi, & Peter Verkinderen. (2021). OpenITI: a Machine-Readable Corpus of Islamicate Texts (Version 2021.1.4) [Data set]. Zenodo.


