A new version (2020.2.3) of the OpenITI corpus is available on Zenodo, an Open Science platform that supports open access. This is the third release overall, and the second in 2020, developed by the OpenITI organisation. It is also accessible on GitHub.
The current release features 7,725 books, including all versions and editions (1,573,262,381 words), representing 4,781 unique books written by 2,074 authors. In this release, 680 new books (with their identifiers) have been added. In terms of annotation, 773 books are available in first-stage OpenITI mARkdown, and of these 79 have been reviewed and vetted by the annotation team (this being the second stage of our annotation process). The vetted texts have the extension .mARkdown.
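Because vetted texts are marked by the .mARkdown extension, the number of vetted files in a local copy of the corpus can be checked with a short script. The following is a minimal sketch rather than an official OpenITI tool: the function name and the corpus_root argument are hypothetical, and it assumes only that the release has been downloaded and unpacked into a local folder, whatever its internal layout.

```python
from pathlib import Path


def count_files_with_extension(corpus_root, extension=".mARkdown"):
    """Count files under corpus_root whose names end with `extension`.

    corpus_root is a hypothetical path to a local, unpacked copy of the
    release. The match is case-sensitive, so ".markdown" is not counted.
    """
    total = 0
    # rglob("*") walks the folder recursively, whatever its layout is.
    for path in Path(corpus_root).rglob("*"):
        if path.is_file() and path.name.endswith(extension):
            total += 1
    return total


if __name__ == "__main__":
    # Assumed location of the unpacked release; adjust to your own copy.
    print(count_files_with_extension("."))

    # Share of first-stage texts that have been vetted, using the
    # figures reported in this announcement (79 out of 773).
    print(f"{79 / 773:.1%}")
```

With the figures reported for this release, the vetted share of the annotated texts is 79 out of 773, or roughly 10%.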
The new texts, the changes to the unique resource identifiers (URIs), and statistics on the corpus are listed in the release note, which is available through the publication link above (also available here).
We are continuously adding more texts to the OpenITI corpus. If you wish to contribute books or manuscripts that are not yet in the corpus, please contact Lorenz Nigst or any of the team members.
To cite this version, please use the following (the bibliographical export is also available on the publication page):
Lorenz Nigst, Maxim Romanov, Sarah Bowen Savant, Masoumeh Seydi and Peter Verkinderen, OpenITI: A Machine-Readable Corpus of Islamicate Texts (Version 2020.2.3) [data set] (November, 2020), Zenodo, http://doi.org/10.5281/zenodo.4075046.