Blogs

2026 (11 posts)
2025 (7 posts)
2024 (5 posts)
2023 (12 posts)
2022 (12 posts)
2021 (20 posts)
2020 (18 posts)
2019 (4 posts)
2018 (5 posts)
2017 (1 post)

2026

Introducing Text Evaluation using LabelStudio

April 9, 2026 4 minute read

Data annotation and evaluation is a critical part of the KITAB-Transform workflow. Without user-friendly but highly customisable software, it will be impossi...

Leveraging the OpenITI Corpus for Text Identification: Two Examples from Geniza Documents

March 11, 2026 11 minute read

Among many other things, the steadily growing OpenITI corpus of machine-actionable texts constitutes a useful tool for identifying hitherto unidentified text...

Research Workshop on Miskawayh

February 19, 2026 less than 1 minute read

Corpus Building Workshop

February 19, 2026 less than 1 minute read

Arabic Pasts 2026 - Call for Papers

February 17, 2026 4 minute read

Note: The CfP for Arabic Pasts 2026 is now closed.

OpenITI Release, version 2025.1.9

February 12, 2026 1 minute read

Citation: please use the citation information (available to export in various formats) on the publication page.

Calling for OpenITI Curators (information session)

February 10, 2026 1 minute read

Would you like to help build and diversify the OpenITI corpus? Frustrated that you cannot find your text? Do you wish that OpenITI had more consistent metada...

Introducing KITAB-Transform

January 30, 2026 6 minute read

On 2 January, we began a new European Research Council-funded project, ‘KITAB-Transform’ (ERC, grant no. 101199672). We hope that scholars will join us on wh...

Back-end Engineer (Subcontractor) - ERC Project, KITAB-Transform

January 15, 2026 3 minute read

Back-end support engineer (subcontractor) - ERC Project KITAB Transform

Bias in the OpenITI corpus

January 9, 2026 12 minute read

Digital Lead (Subcontractor) – ERC Project KITAB-Transform

January 8, 2026 3 minute read

2025

Arabic Pasts 2025

December 10, 2025 less than 1 minute read

The Aga Khan University’s Institute for the Study of Muslim Civilisations (AKU-ISMC), in collaboration with Queen Mary University of London, hosted a two-day...

Linear to Table, Table to Linear

October 2, 2025 1 minute read

In manuscripts dedicated to dreams and divination, one occasionally finds a short narrative involving Muḥammad Khawārazmshāh. It claims that he summoned spec...

Arabic Pasts 2025: Programme

August 21, 2025 1 minute read

This annual exploratory and informal workshop offers the opportunity to reflect on methodologies, case studies, and research agendas for investigating histor...

Workshop: Classical ML to AI in Arabic and Islamic Studies

April 15, 2025 2 minute read

Workshop: From Classical ML to AI in Arabic and Islamic Studies: A Hands-On Workshop, July 1-4, 2025

Call For Papers: Arabic Pasts 2025

April 3, 2025 2 minute read

Arabic Pasts: Histories and Historiographies

Repetition That Does Not Get Boring: Expressing Devotion in Druze Religious Poetry

February 12, 2025 13 minute read

Some might know the 1945 Buddy Kaye and Ted Mossman song Till the End of Time, made famous by Perry Como. The beginning of the lyrics runs like this:

DRZCRPS: Towards a Corpus of Druze Poetry

February 12, 2025 2 minute read

Over the past few years, I have become quite interested in historical Druze religious poetry (from about the 17th–18th century CE onwards). Stumbling across ...

2024

Virtual Open House 3: Come Learn About Our Data

December 4, 2024 less than 1 minute read

Please join us for an online ‘Open House’ convened by the Centre for Digital Humanities at the Aga Khan University (International) in the United Kingdom and ...

The DNA of a Book: Reading al-Nuwayrī’s (d. 733/1333) Nihāyat al-arab fī funūn al-adab

October 17, 2024 1 minute read

We released a data set of an experiment conducted by Sarah Bowen Savant and Sohail Merchant, who wanted to understand potentially how, and from what, Shihāb ...

Arabic Pasts 2024: Programme

September 10, 2024 less than 1 minute read

We are pleased to announce the programme for this year’s Arabic Pasts workshop, running from Thursday 3rd until Friday 4th of October 2024. We have yet anoth...

Text Reuse Data Release, version 2023.1.8

June 14, 2024 1 minute read

A new version of KITAB’s text reuse data is now available to download at Zenodo, an Open Science platform that supports Open Access. The current release feat...

Call For Papers: Arabic Pasts 2024

May 28, 2024 1 minute read

Arabic Pasts: Histories and Historiographies

2023

OpenITI release 2023.1.8

October 24, 2023 1 minute read

The 8th version (2023.1.8) of the OpenITI corpus is now available at Zenodo. The release is open access and is also accessible through our GitHub repository....

Post 8: Bibliography

September 8, 2023 4 minute read

Antrim, Zayde, ‘Nostalgia for the Future: A Comparison between the Introductions to Ibn ʿAsākir’s Taʾrīkh Madīnat Dimashq and al-Khaṭīb al-Baghdādī’s Taʾrīkh...

Post 7: People, Connections and Memory

September 8, 2023 13 minute read

Imagine yourself as a learned bookseller of the twelfth century. You have just been called in to assess the estate of a wealthy, prominent scholar who has died...

Post 6: Searches for References to written materials outside of Isnads

August 29, 2023 16 minute read

As noted in the last post, we struggled to verify book citations in the TMD, both within and outside of isnāds. We believe that our struggles reflect the cha...

Post 5: Ibn ‘Asakir’s Citation of Author Names in Isnads

August 24, 2023 47 minute read

We continue our investigation of Ibn ʿAsākir’s citations to address our third question about his working methods. When author names appear within his isnāds,...

Post 4: Ibn ‘Asakir’s Transmission Terms in Isnads

August 22, 2023 31 minute read

Our previous blog post featured a deep dive into the pool of informants whom Ibn ʿAsākir cites frequently. Now we turn to the big picture of how he says he a...

Post 3: Ibn ‘Asakir’s Direct Informants

August 17, 2023 49 minute read

Ibn ʿAsākir names many persons from whom he acquired information for the TMD. What can our data tell us about them?

Post 2: Ibn ‘Asakir and His History of Damascus, the Data Set

August 11, 2023 15 minute read

Digital humanists often say they would like to read more work in progress. Our blog posts represent such work. We worked intensively over months to create an...

Post 1: Introducing Ibn ‘Asakir and His History of Damascus

August 11, 2023 32 minute read

The OpenITI corpus contains more than 11,000 works and now exceeds 2 billion words in size. Many of the corpus’s works are extraordinarily large, surpassing ...

Arabic Pasts 2023: Programme

July 31, 2023 less than 1 minute read

We are pleased to announce the programme for this year’s Arabic Pasts, running from Thursday 5th until Friday 6th of October 2023. We have yet another exciti...

Call For Papers: Arabic Pasts 2023

March 25, 2023 1 minute read

Arabic Pasts: Histories and Historiographies

OpenITI release 2022.2.7

March 13, 2023 2 minute read

The KITAB team has released a new version (2022.2.7) of the OpenITI corpus at Zenodo. The release is open access. It is our seventh release (second release i...

2022

OpenITI release 2022.1.6

November 18, 2022 1 minute read

The KITAB team has released a new version (2022.1.6) of the OpenITI corpus at Zenodo. The release is open access. It is our sixth release (first release in ...

OpenITI and the Fihrist: Analysis

November 4, 2022 18 minute read

This is the third blog in a short series of blogs on the overlap between the OpenITI corpus and Ibn al-Nadim’s Fihrist. Please refer to the first part for a ...

OpenITI and the Fihrist: Methodology

November 4, 2022 14 minute read

This is the second blog in a short series of blogs on the overlap between the OpenITI corpus and Ibn al-Nadim’s Fihrist. Please refer to the first part for a...

OpenITI and the Fihrist

November 4, 2022 3 minute read

The corpus of texts the KITAB project uses as the basis for its research is a subsection of the OpenITI corpus. It contains Arabic-language texts of the firs...

Lecture Announcement: SHARIAsource Lab Workshop: Ibn ʿAsākir and His History of Damascus: Named Entity Recognition and Text Reuse, Sarah Bowen Savant (Harvard Law School)

September 23, 2022 less than 1 minute read

On Tuesday, September 27, 2022, at 12:00-1:00 PM US EST in Lewis 214, KITAB’s Sarah Bowen Savant will lead a seminar on research in progress that uses the Open...

Arabic Pasts 2022: Programme

August 11, 2022 less than 1 minute read

We are pleased to announce the programme for this year’s Arabic Pasts. We have yet another exciting series of papers covering a range of topics and periods. ...

A Ramble Through the Cluster Data, Part 2: Quantifying and Visualising Clusters.

June 21, 2022 9 minute read

In part 1, I introduced you to the cluster data set, a second passim data set that is slightly different from the pairwise data set that the KITAB team use i...

A Close and Distant Reading of Writerly Practices: Sarah Bowen Savant’s Inaugural Lecture

May 20, 2022 1 minute read

On Thursday 5th of May 2022 Sarah Bowen Savant gave her inaugural lecture as full professor at the AKU-ISMC.

A Ramble Through the Cluster Data, Part 1: From Pairs to Clusters.

May 19, 2022 10 minute read

It should be no surprise to any reader of this blog that the KITAB project is primarily interested in studying Arabic text reuse. A large number of posts her...

Call for Papers: A Workshop on Citation (25th-26th July 2022)

April 22, 2022 1 minute read

Modeling Attribution and Acknowledgement in the Digital Humanities: Citation Practices and the Pre-Modern Arabic Book.

Call for papers: Arabic Pasts 2022

March 17, 2022 3 minute read

Arabic Pasts: Histories and Historiographies

Oh Brethren, Where Are Ye? How to search for words and phrases in the OpenITI corpus, demonstrated with the phrase ‘Ikhwan al-Safa’

February 9, 2022 14 minute read

The OpenITI corpus is designed to facilitate many different forms of computational analysis. Within the KITAB project we spend the bulk of our time fine-tuni...

2021

New KITAB visualizations

December 3, 2021 15 minute read

Much of our work at KITAB involves comparing books in order to understand their relationships. Our main tool for this is the passim software, which detects p...

Some Suggestions on Using OpenITI Corpus to Present Enhanced Digital Versions of Large Collections: The Case of al-Dhari‘a Ila Tasanif al-Shi‘a

November 22, 2021 13 minute read

Tagging the structure of the texts in the OpenITI corpus is an important step towards the ultimate goal of the KITAB project: studying the Arabic textual tradition...

Dispatches from al-Tabari 8: The Afterlife of al-Tabari in Quotations

November 1, 2021 10 minute read

In this series of blog posts, we have argued for the imperative to rethink writerly culture in ways that allow for a more meaningful exploration of al-Tabari...

Dispatches from al-Tabari 7: Text Reuse Alignments

October 25, 2021 6 minute read

Post 7: Text Reuse Alignments

OpenITI release 2021.2.5

October 20, 2021 1 minute read

The KITAB team has released a new version (2021.2.5) of the OpenITI corpus at Zenodo. The release is open access. It is our fifth release (second release in ...

Dispatches from al-Tabari 6: Sources Common to All of al-Tabari’s Works

October 19, 2021 4 minute read

The argument we are advancing in these blog posts is that when al-Tabari (d. 310/923) created his Taʾrikh al-rusul wa-l-muluk, Jamiʿ al-bayan ʿan taʾwil ay a...

Dispatches from al-Tabari 5: Reconstructing al-Tabari’s Notebooks

October 15, 2021 16 minute read

In our previous blog post, we argued that al-Tabari (d. 310/923) had to hand an extensive written collection consisting of sets of well-written notes.

Dispatches from al-Tabari 4: The Form of al-Tabari’s Sources: His Probable Notebooks

October 11, 2021 13 minute read

In the preceding posts, we showed that al-Tabari (d. 310/923) used the phrases ‘he told me’ and ‘he told us’ (haddathani/haddathana) in the Taʾrikh al-rusul ...

Dispatches from al-Tabari 3: How Many People Did al-Tabari Talk To?

October 7, 2021 12 minute read

This is the third in a series of blog posts examining al-Tabari’s (d. 310/923) citations in his Taʾrikh al-rusul wa-l-muluk, Jāmiʿ al-bayān ʿan taʾwīl āy al-...

Dispatches from al-Tabari 2: Show Me the Data!

October 5, 2021 18 minute read

You have now entered the weeds.

Dispatches from al-Tabari 1: Al-Tabari’s Direct Informants: Work on a New Data Set

October 5, 2021 18 minute read

In a series of eight blog posts, we share some results of experimental work on citations in three works by Muhammad b. Jarir al-Tabari (d. 310/923). These ar...

Arabic Pasts 2021 Programme

October 4, 2021 3 minute read

Arabic Pasts 2021 is happening from 7th - 9th of October.

Using the Many to Spot the Few

July 13, 2021 3 minute read

At present, the OpenITI/KITAB corpus comprises 10,243 text files, 6,268 of which are unique titles.

A Token Frequency Counter For OpenITI Texts

July 13, 2021 8 minute read

One of the participants in our KITAB user group asked for an easy way to find out which are the most frequently used words in a text.

From Networks to Named Entities and Back Again: Exploring Isnad Networks

May 31, 2021 9 minute read

First Five Hundred Years of the Arabic Book: The Native Origin of the Authors

April 29, 2021 10 minute read

Quantitative and macroanalytic approaches

Can Digital Humanities Be Informed by Bioinformatics? Visualising Passim Data for Multiple-Book Relationships

April 29, 2021 9 minute read

As KITAB’s research has shown, passim is an incredibly powerful tool for answering a variety of questions about book history and history in general. The algo...

Call for Papers – Arabic Pasts: Histories and Historiographies (Annual Workshop)

March 30, 2021 2 minute read

This annual exploratory and informal workshop offers the opportunity to reflect on history writing in Arabic. We encourage contributions focused on methodolo...

OpenITI release 2021.1.4

February 12, 2021 1 minute read

The KITAB team has released a new version (2021.1.4) of the OpenITI corpus at Zenodo. The release is open access and freely available. It is our fourth relea...

Diversifying the OpenITI corpus, One Text at a Time

January 21, 2021 9 minute read

The vast majority of texts in the OpenITI corpus were sourced from three major collections of digital texts originally prepared by organisations based in the...

2020

Tracing the origins of a historical fragment focused on the Samanids

December 11, 2020 2 minute read

At the Arabic Pasts conference this year, Hugh Kennedy and I presented a paper in the panel dedicated to the Invisible East programme, chaired by the program...

Al-Maktaba al-Shamila: a short history

December 3, 2020 10 minute read

(This is the first blog post in a longer series of posts about the sources of OpenITI.)

KITAB postdoc Gowaart Van Den Bossche wins BRAIS-De Gruyter dissertation prize – 2020

November 19, 2020 less than 1 minute read

The British Association for Islamic Studies (BRAIS) and De Gruyter have announced the outcome of the fifth (2020) round of the BRAIS–De Gruyter Prize in the ...

OpenITI release 2020.2.3

October 19, 2020 1 minute read

A new version (version 2020.2.3) of the OpenITI corpus is available at Zenodo, an Open Science platform that supports open access. This is the third release ...

Mapping Who’s Who in Isnads – First Steps

October 5, 2020 10 minute read

One of the major challenges for those working with historical Arabic texts lies in names, and in the variety of ways that authors might refer to the same per...

Between Manuscripts and Digital Texts: Commentaries on Hadith Raʾs al-Jalut

September 30, 2020 12 minute read

For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...

Arabic Pasts: Histories and Historiographies Research workshop (October 22-24, 2020 London)

September 29, 2020 1 minute read

This annual exploratory and informal workshop offers the opportunity to reflect on history writing in Arabic. This year the event will be held online to allo...

Adventures in Alignments: Training an Algorithm to Recognise Text Reuse

August 7, 2020 9 minute read

Text reuse is the term that we use to describe cases where one book shares verbatim material with another. Text reuse can be studied manually through the rea...

Preserving Pre-Modern Terminologies

August 5, 2020 11 minute read

To categorise things is a fundamental human and scholarly instinct and activity. And yet it is one not without obstacles, for we soon learn that the world is...

Call for Participation in KITAB (Knowledge, Information Technology, and the Arabic Book)

July 24, 2020 1 minute read

The KITAB project is seeking researchers who are interested in collaborating to advance their own, distinct research projects. The aim is to build a small gr...

OpenITI, OCR, and Textual Criticism

July 16, 2020 5 minute read

In previous posts, other members of the KITAB team have talked about building the OpenITI corpus of Arabic and Persian sources. Many members of the team are ...

Algorithmic Reading of Shiʿi Hadith Collections: Direct Borrowing and Common Sources

June 22, 2020 13 minute read

It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...

New Release of Our Open Access Arabic Corpus, OpenITI, version 2020.1.2

June 17, 2020 1 minute read

A new version of the corpus used by the KITAB team is now available to download at Zenodo, an Open Science platform that supports open access. This is the se...

Tagging the Structure of Texts in the OPENITI Corpus

June 12, 2020 5 minute read

With currently more than 7,000 titles, collected from a number of huge digital Arabic libraries (al-Jamiʿ al-Kabir, al-Maktaba al-Shamila, Shia Online, etc.)...

Contagion in the Corpus: The Black Death and Where to Find It

April 22, 2020 8 minute read

“How can I bear to pair fair words in rhyme

The New OpenITI Metadata Search

March 6, 2020 2 minute read

The OpenITI corpus was designed in a way that makes it easy for scripts to access, identify and analyse the texts in the corpus. As a human reader, it was un...

Tracking Traditions: Identifying Isnads in the OpenITI Corpus

February 3, 2020 10 minute read

Due to its size and coverage, the OpenITI corpus is useful for a wide variety of research purposes. In particular, it represents an excellent opportunity to ...

When al-Tabari is Not (Just) al-Tabari: The Challenges Posed by Composite Editions in the OpenITI Corpus

January 10, 2020 4 minute read

In the past few months the KITAB team members have been closely studying the issue of versioning and composite editions in the OpenITI corpus. The problem of...

2019

On Commentaries, Digressions, Transtextualities, and Rabbit Holes

December 3, 2019 5 minute read

Running the passim algorithm on the OpenITI corpus allows us to identify a vast number of instances of text reuse, but the quality of these results from a hi...

Judging the Differences between Arabic Text Versions Mathematically

November 14, 2019 11 minute read

Researchers working on historical Arabic texts have long known about transmission practices that resulted in considerable differences between what were osten...

A New Application that Helps You Find Texts in the OpenITI Corpus

November 4, 2019 1 minute read

The Open Islamicate Texts Initiative (OpenITI) is a multi-institutional effort to construct the first open-access machine-actionable scholarly corpus of prem...

First Open Access Release of Our Arabic Corpus

June 8, 2019 1 minute read

Scholars working in Arabic can now download the entire corpus used by the KITAB team through Zenodo, an Open Science platform that supports open access.

2017

A Tale of 3 “Versions”

September 10, 2017 11 minute read

Measuring variation in the early tradition
