A message from the PI
From January 2026, we begin a new chapter. With KITAB-Transform (ERC, grant no. 101199672), we continue to investigate the ways that authors work, but now – thanks to recent advances in machine learning and natural language processing (NLP) – we can be more ambitious. Previously, we were able to detect near-verbatim matches; now we want to detect matches that are semantically similar but worded differently.
Our latest blogs
Introducing KITAB-Transform
Back-end Engineer (Subcontractor...
Bias in the OpenITI corpus
Read more blogs about KITAB's corpus, methods and data
Our latest blog series
Explore the OpenITI corpus and reuse data
Click the button to start searching for books in our corpus and explore our text reuse data
Follow us
Keep up to date with the KITABis. Follow us on Twitter or subscribe to our mailing list.
Get involved
Would you like to contribute to our corpus? Do you have data you would like to share? Are you working on Classical Arabic NLP? We would love to hear from you!