The KITAB team has released a new version (2021.1.4) of the OpenITI corpus on Zenodo. The release is open access and freely available. It is our fourth release and the first of 2021. You can also access the release through our GitHub repository.

The release features 10,202 books, counting all versions and editions (2,046,728,631 words), which represent 6,236 unique books written by 2,582 authors. The release adds 2,506 new identifiers. These reflect either new books, each of which is assigned a new unique id, or changes to existing ids and their corresponding URIs, which result in new ids. In terms of structural annotation, 180 books have a new status; 73 of these now carry the .mARkdown extension, which means they are available in full structural OpenITI mARkdown. This brings the total number of books in mARkdown to 247. The mARkdown files have been reviewed and vetted by the annotation team.
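For readers who want a rough sense of scale, the figures above can be turned into a few derived ratios. The following is only a back-of-the-envelope sketch in Python: the constant names and the `derived_stats` function are illustrative and not part of any OpenITI tooling, and the "previous mARkdown total" assumes that all 73 newly annotated books were not counted as mARkdown before this release.

```python
# Figures quoted in the release announcement (version 2021.1.4).
TEXT_VERSIONS = 10_202          # books, counting all versions and editions
TOTAL_WORDS = 2_046_728_631     # words across all versions
UNIQUE_BOOKS = 6_236
AUTHORS = 2_582
NEW_MARKDOWN = 73               # books newly given the .mARkdown extension
TOTAL_MARKDOWN = 247            # total books in mARkdown after this release


def derived_stats() -> dict:
    """Return simple ratios derived from the announced figures."""
    return {
        "words_per_version": TOTAL_WORDS / TEXT_VERSIONS,
        "versions_per_unique_book": TEXT_VERSIONS / UNIQUE_BOOKS,
        "unique_books_per_author": UNIQUE_BOOKS / AUTHORS,
        # Assumption: none of the 73 new mARkdown books were counted before.
        "previous_markdown_total": TOTAL_MARKDOWN - NEW_MARKDOWN,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:,.2f}")
```

These are averages only; individual texts in the corpus vary widely in length, and the number of versions differs from book to book.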

The major changes to the URIs, as well as statistics on the corpus, are given in the release notes on the publication page and in the GitHub repository. We have also added a few new fields to the metadata.

To cite this version, please use the following (the bibliographical export is also available on the publication page):


Lorenz Nigst, Maxim Romanov, Sarah Bowen Savant, Masoumeh Seydi and Peter Verkinderen, OpenITI: A Machine-Readable Corpus of Islamicate Texts (Version 2021.1.4) [data set] (February, 2021), Zenodo,