About the Project: A Message from the PI


KITAB provides a digital toolbox and a forum for discussions about Arabic texts. We wish to empower users to explore Arabic texts in completely new ways and to expand the frontiers of knowledge about one of the world’s largest and most complex textual traditions.

We are leading with methods that detect how authors copied from previous works. Arabic authors frequently made use of past works, cutting them into pieces and reconstituting them to address their own outlooks and concerns. We are working to discover relationships between these texts and also the profoundly intertextual circulatory systems in which they sit.

Our most recent work has involved gathering statistics on such reuse across the tradition. These statistics cover the extent and precision of reuse, and where it does and does not occur. We study this data alongside further data documenting citation practices, including transmission chains known as isnads. We are also developing new visualisations that show the relationships between authors, books and the ideas that the books contain. Equally important, we are building the corpus of texts upon which our research is based, and making use of our recent and pioneering work on Optical Character Recognition (OCR). In the coming years, we aim to increase our OCR efforts significantly.

The technology that powers KITAB is at the cutting edge of computer science. Our text reuse algorithm was created by David Smith and is now being adapted by Ryan Muther for Arabic. Ryan and Masoumeh Seydi are doing experimental work to map citations. We join our Open Islamicate Texts Initiative (OpenITI) partners in the OCR effort. For funding, we are grateful for the support that we have received from our home institution, the Aga Khan University, and also from the British Academy, the Qatar National Library, the Andrew W. Mellon Foundation, and the European Research Council under the Horizon 2020 grant scheme (KITAB, no. 772989).

To use our corpus, please start here. We are annotating and vetting works, with documentation available on GitHub.

Do read the blog, as it provides windows on team members at work. We are working hard to bring all of our data and sources into the public domain. We want scholars everywhere to be able to take full advantage of what digital technology now allows us all to see and to discover. You can find regular updates posted through my Twitter account @sarahsavant1.

Thank you for your interest in KITAB. Please do get in touch if you would like to be involved in the project.

Warm regards,
Sarah Bowen Savant
Aga Khan University International
Institute for the Study of Muslim Civilisations
Principal Investigator
Knowledge, Information Technology, and the Arabic Book