Post 1: Introducing Ibn ‘Asakir and His History of Damascus
The OpenITI corpus contains more than 11,000 works and now exceeds 2 billion words in size. Many of the corpus’s works are extraordinarily large, surpassing ...
Antrim, Zayde, ‘Nostalgia for the Future: A Comparison between the Introductions to Ibn ʿAsākir’s Taʾrīkh Madīnat Dimashq and al-Khaṭīb al-Baghdādī’s Taʾrīkh...
Imagine yourself as a learned bookseller of the twelfth century. You have just been called in to assess the estate of a wealthy, prominent scholar who has died...
As noted in the last post, we struggled to verify book citations in the TMD, both within and outside of isnāds. We believe that our struggles reflect the cha...
We continue our investigation of Ibn ʿAsākir’s citations to address our third question about his working methods. When author names appear within his isnāds,...
Our previous blog post featured a deep dive into the pool of informants whom Ibn ʿAsākir cites frequently. Now we turn to the big picture of how he says he a...
Ibn ʿAsākir names many persons from whom he acquired information for the TMD. What can our data tell us about them?
Digital humanists often say they would like to read more work in progress. Our blog posts represent such work. We worked intensively over months to create an...
The OpenITI corpus contains more than 11,000 works and now exceeds 2 billion words in size. Many of the corpus’s works are extraordinarily large, surpassing ...
Quantitative and macroanalytic approaches
Text reuse is the term that we use to describe cases where one book shares verbatim material with another. Text reuse can be studied manually through the rea...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
Running the passim algorithm on the OpenITI corpus allows us to identify a vast number of instances of text reuse, but the quality of these results from a hi...
Measuring variation in the early tradition
Quantitative and macroanalytic approaches
The vast majority of texts in the OpenITI corpus were sourced from three major collections of digital texts originally prepared by organisations based in the...
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
“How can I bear to pair fair words in rhyme
Running the passim algorithm on the OpenITI corpus allows us to identify a vast number of instances of text reuse, but the quality of these results from a hi...
This is the third blog in a short series of blogs on the overlap between the OpenITI corpus and Ibn al-Nadim’s Fihrist. Please refer to the first part for a ...
This is the second blog in a short series of blogs on the overlap between the OpenITI corpus and Ibn al-Nadim’s Fihrist. Please refer to the first part for a...
The corpus of texts the KITAB project uses as the basis for its research is a subsection of the OpenITI corpus. It contains Arabic-language texts of the firs...
In part 1, I introduced you to the cluster data set, a second passim data set that is slightly different from the pairwise data set that the KITAB team use i...
It should be no surprise to any reader of this blog that the KITAB project is primarily interested in studying Arabic text reuse. A large number of posts her...
From Networks to Named Entities and Back Again: Exploring Isnad Networks
The vast majority of texts in the OpenITI corpus were sourced from three major collections of digital texts originally prepared by organisations based in the...
(This is the first blog post in a longer series of posts about the sources of OpenITI.)
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
Text reuse is the term that we use to describe cases where one book shares verbatim material with another. Text reuse can be studied manually through the rea...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
“How can I bear to pair fair words in rhyme
In the past few months the KITAB team members have been closely studying the issue of versioning and composite editions in the OpenITI corpus. The problem of...
Running the passim algorithm on the OpenITI corpus allows us to identify a vast number of instances of text reuse, but the quality of these results from a hi...
With text reuse detection, we rely on the power, speed and memory of a computer to find common passages between texts.
Measuring variation in the early tradition
Modeling Attribution and Acknowledgement in the Digital Humanities: Citation Practices and the Pre-Modern Arabic Book.
From Networks to Named Entities and Back Again: Exploring Isnad Networks
One of the major challenges for those working with historical Arabic texts lies in names, and in the variety of ways that authors might refer to the same per...
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
Due to its size and coverage, the OpenITI corpus is useful for a wide variety of research purposes. In particular, it represents an excellent opportunity to ...
The 8th version (2023.1.8) of the OpenITI corpus is now available at Zenodo. The release is open access and is also accessible through our GitHub repository....
The KITAB team has released a new version (2022.2.7) of the OpenITI corpus at Zenodo. The release is open access. It is our seventh release (second release i...
The KITAB team has released a new version (2022.1.6) of the OpenITI corpus at Zenodo. The release is open access. It is our sixth release (first release in ...
This is the third blog in a short series of blogs on the overlap between the OpenITI corpus and Ibn al-Nadim’s Fihrist. Please refer to the first part for a ...
This is the second blog in a short series of blogs on the overlap between the OpenITI corpus and Ibn al-Nadim’s Fihrist. Please refer to the first part for a...
The corpus of texts the KITAB project uses as the basis for its research is a subsection of the OpenITI corpus. It contains Arabic-language texts of the firs...
The OpenITI corpus is designed to facilitate many different forms of computational analysis. Within the KITAB project we spend the bulk of our time fine-tuni...
Tagging the structure of the texts in the OpenITI corpus is an important step towards the ultimate goal of the KITAB project. Studying the Arabic textual tradition...
The KITAB team has released a new version (2021.2.5) of the OpenITI corpus at Zenodo. The release is open access. It is our fifth release (second release in ...
At present, the OpenITI/KITAB corpus comprises 10,243 text files, 6,268 of which are unique titles.
Quantitative and macroanalytic approaches
The KITAB team has released a new version (2021.1.4) of the OpenITI corpus at Zenodo. The release is open access and freely available. It is our fourth relea...
The vast majority of texts in the OpenITI corpus were sourced from three major collections of digital texts originally prepared by organisations based in the...
At the Arabic Pasts conference this year, Hugh Kennedy and I presented a paper in the panel dedicated to the Invisible East programme, chaired by the program...
(This is the first blog post in a longer series of posts about the sources of OpenITI.)
A new version (version 2020.2.3) of the OpenITI corpus is available at Zenodo, an Open Science platform that supports open access. This is the third release ...
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
To categorise things is a fundamental human and scholarly instinct and activity. And yet it is one not without obstacles, for we soon learn that the world is...
In previous posts, other members of the KITAB team have talked about building the OpenITI corpus of Arabic and Persian sources. Many members of the team are ...
A new version of the corpus used by the KITAB team is now available to download at Zenodo, an Open Science platform that supports open access. This is the se...
With currently more than 7,000 titles, collected from a number of huge digital Arabic libraries (al-Jamiʿ al-Kabir, al-Maktaba al-Shamila, Shia Online, etc.)...
“How can I bear to pair fair words in rhyme
In the past few months the KITAB team members have been closely studying the issue of versioning and composite editions in the OpenITI corpus. The problem of...
Scholars working in Arabic can now download the entire corpus used by the KITAB team through Zenodo, an Open Science platform that supports open access.
The Open Islamicate Texts Initiative (OpenITI) is a multi-institutional effort to construct the first open-access machine-actionable scholarly corpus of prem...
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
We released a data set of an experiment conducted by Sarah Bowen Savant and Sohail Merchant, who wanted to understand potentially how, and from what, Shihāb ...
Quantitative and macroanalytic approaches
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
Text reuse is the term that we use to describe cases where one book shares verbatim material with another. Text reuse can be studied manually through the rea...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
Running the passim algorithm on the OpenITI corpus allows us to identify a vast number of instances of text reuse, but the quality of these results from a hi...
We are pleased to announce the programme for this year’s Arabic Pasts workshop, running from Thursday 3rd until Friday 4th of October 2024. We have yet anoth...
Arabic Pasts: Histories and Historiographies
We are pleased to announce the programme for this year’s Arabic Pasts, running from Thursday 5th until Friday 6th of October 2023. We have yet another exciti...
Arabic Pasts: Histories and Historiographies
On Tuesday, September 27, 2022, from 12:00 to 1:00 PM US EST in Lewis 214, KITAB’s Sarah Bowen Savant will lead a seminar on research in progress that uses the Open...
We are pleased to announce the programme for this year’s Arabic Pasts. We have yet another exciting series of papers covering a range of topics and periods. ...
On Thursday 5th of May 2022 Sarah Bowen Savant gave her inaugural lecture as full professor at the AKU-ISMC.
Modeling Attribution and Acknowledgement in the Digital Humanities: Citation Practices and the Pre-Modern Arabic Book.
Arabic Pasts: Histories and Historiographies
This annual exploratory and informal workshop offers the opportunity to reflect on history writing in Arabic. We encourage contributions focused on methodolo...
This annual exploratory and informal workshop offers the opportunity to reflect on history writing in Arabic. This year the event will be held online to allo...
The ‘Arabic Pasts: Histories and Historiography’ workshop was held in the new Aga Khan Centre in London on the 12th and 13th of October and featured papers t...
Quantitative and macroanalytic approaches
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
Quantitative and macroanalytic approaches
From Networks to Named Entities and Back Again: Exploring Isnad Networks
Text reuse is the term that we use to describe cases where one book shares verbatim material with another. Text reuse can be studied manually through the rea...
In previous posts, other members of the KITAB team have talked about building the OpenITI corpus of Arabic and Persian sources. Many members of the team are ...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
Due to its size and coverage, the OpenITI corpus is useful for a wide variety of research purposes. In particular, it represents an excellent opportunity to ...
The OpenITI corpus is designed to facilitate many different forms of computational analysis. Within the KITAB project we spend the bulk of our time fine-tuni...
Tagging the structure of the texts in the OpenITI corpus is an important step towards the ultimate goal of the KITAB project. Studying the Arabic textual tradition...
Please join us for an online ‘Open House’ convened by the Centre for Digital Humanities at the Aga Khan University (International) in the United Kingdom and ...
A new version of KITAB’s text reuse data is now available to download at Zenodo, an Open Science platform that supports Open Access. The current release feat...
Arabic Pasts: Histories and Historiographies
Arabic Pasts: Histories and Historiographies
The KITAB team has released a new version (2022.2.7) of the OpenITI corpus at Zenodo. The release is open access. It is our seventh release (second release i...
The KITAB team has released a new version (2022.1.6) of the OpenITI corpus at Zenodo. The release is open access. It is our sixth release (first release in ...
On Thursday 5th of May 2022 Sarah Bowen Savant gave her inaugural lecture as full professor at the AKU-ISMC.
Modeling Attribution and Acknowledgement in the Digital Humanities: Citation Practices and the Pre-Modern Arabic Book.
The KITAB team has released a new version (2021.2.5) of the OpenITI corpus at Zenodo. The release is open access. It is our fifth release (second release in ...
This annual exploratory and informal workshop offers the opportunity to reflect on history writing in Arabic. We encourage contributions focused on methodolo...
The KITAB team has released a new version (2021.1.4) of the OpenITI corpus at Zenodo. The release is open access and freely available. It is our fourth relea...
The British Association for Islamic Studies (BRAIS) and De Gruyter have announced the outcome of the fifth (2020) round of the BRAIS–De Gruyter Prize in the ...
This annual exploratory and informal workshop offers the opportunity to reflect on history writing in Arabic. This year the event will be held online to allo...
The KITAB project is seeking researchers who are interested in collaborating to advance their own, distinct research projects. The aim is to build a small gr...
The ‘Arabic Pasts: Histories and Historiography’ workshop was held in the new Aga Khan Centre in London on the 12th and 13th of October and featured papers t...
The European Research Council has awarded KITAB a five-year, €2 million grant that will enable us to make major progress on our research agenda.
A new version (version 2020.2.3) of the OpenITI corpus is available at Zenodo, an Open Science platform that supports open access. This is the third release ...
Scholars working in Arabic can now download the entire corpus used by the KITAB team through Zenodo, an Open Science platform that supports open access.
Much of our work at KITAB involves comparing books in order to understand their relationships. Our main tool for this is the passim software, which detects p...
Quantitative and macroanalytic approaches
Text reuse is the term that we use to describe cases where one book shares verbatim material with another. Text reuse can be studied manually through the rea...
The vast majority of texts in the OpenITI corpus were sourced from three major collections of digital texts originally prepared by organisations based in the...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
We released a data set of an experiment conducted by Sarah Bowen Savant and Sohail Merchant, who wanted to understand potentially how, and from what, Shihāb ...
The 8th version (2023.1.8) of the OpenITI corpus is now available at Zenodo. The release is open access and is also accessible through our GitHub repository....
The KITAB team has released a new version (2022.2.7) of the OpenITI corpus at Zenodo. The release is open access. It is our seventh release (second release i...
The KITAB team has released a new version (2022.1.6) of the OpenITI corpus at Zenodo. The release is open access. It is our sixth release (first release in ...
The KITAB team has released a new version (2021.2.5) of the OpenITI corpus at Zenodo. The release is open access. It is our fifth release (second release in ...
The KITAB team has released a new version (2021.1.4) of the OpenITI corpus at Zenodo. The release is open access and freely available. It is our fourth relea...
A new version (version 2020.2.3) of the OpenITI corpus is available at Zenodo, an Open Science platform that supports open access. This is the third release ...
A new version of the corpus used by the KITAB team is now available to download at Zenodo, an Open Science platform that supports open access. This is the se...
Scholars working in Arabic can now download the entire corpus used by the KITAB team through Zenodo, an Open Science platform that supports open access.
Please join us for an online ‘Open House’ convened by the Centre for Digital Humanities at the Aga Khan University (International) in the United Kingdom and ...
A new version of KITAB’s text reuse data is now available to download at Zenodo, an Open Science platform that supports Open Access. The current release feat...
Post 7: Text Reuse Alignments
Quantitative and macroanalytic approaches
One of the major challenges for those working with historical Arabic texts lies in names, and in the variety of ways that authors might refer to the same per...
The OpenITI corpus is designed to facilitate many different forms of computational analysis. Within the KITAB project we spend the bulk of our time fine-tuni...
In part 1, I introduced you to the cluster data set, a second passim data set that is slightly different from the pairwise data set that the KITAB team use i...
It should be no surprise to any reader of this blog that the KITAB project is primarily interested in studying Arabic text reuse. A large number of posts her...
As KITAB’s research has shown, passim is an incredibly powerful tool for answering a variety of questions about book history and history in general. The algo...
At the Arabic Pasts conference this year, Hugh Kennedy and I presented a paper in the panel dedicated to the Invisible East programme, chaired by the program...
For us as digital historians and corpus curators, faced with the complex history of reception and transmission as well as the distinct approach to learning a...
Text reuse is the term that we use to describe cases where one book shares verbatim material with another. Text reuse can be studied manually through the rea...
It is not accidental that a large number of books in the OpenITI corpus belong to one important genre, prophetic Hadith – the sayings of the Prophet Muhammad...
Running the passim algorithm on the OpenITI corpus allows us to identify a vast number of instances of text reuse, but the quality of these results from a hi...
The digital revolution is arriving rather late to Middle Eastern studies, but it is coming fast.
With text reuse detection, we rely on the power, speed and memory of a computer to find common passages between texts.
Measuring variation in the early tradition
Much of our work at KITAB involves comparing books in order to understand their relationships. Our main tool for this is the passim software, which detects p...
Tagging the structure of the texts in the OpenITI corpus is an important step towards the ultimate goal of the KITAB project. Studying the Arabic textual tradition...
At present, the OpenITI/KITAB corpus comprises 10,243 text files, 6,268 of which are unique titles.
One of the participants in our KITAB user group asked for an easy way to find out which are the most frequently used words in a text.
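As a rough illustration of the kind of word count that question asks for, here is a minimal Python sketch. It is not the method described in the post itself: the function name `most_frequent_words`, the regular expressions, and the sample sentence are all assumptions made for this example. It treats any run of Arabic letters as a word and strips short-vowel marks and the tatweel before counting.

```python
import re
from collections import Counter

# Assumption: a "word" is any unbroken run of Arabic letters (U+0621 to U+064A).
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")
# Diacritics (U+064B to U+0652) and the tatweel (U+0640) are removed first,
# so that vocalised and unvocalised spellings are counted as the same word.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")


def most_frequent_words(text, n=10):
    """Return the n most frequent Arabic word forms in text as (word, count) pairs."""
    cleaned = DIACRITICS.sub("", text)
    return Counter(ARABIC_WORD.findall(cleaned)).most_common(n)


# Hypothetical sample sentence, not taken from the corpus.
sample = "قال حدثنا محمد قال حدثنا أحمد قال"
top = most_frequent_words(sample, n=2)
print(top)
```

A count like this works on surface forms only: prefixed particles and inflected forms are counted separately, so anything beyond a first impression would need further normalisation.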
As KITAB’s research has shown, passim is an incredibly powerful tool for answering a variety of questions about book history and history in general. The algo...
The OpenITI corpus was designed in a way that makes it easy for scripts to access, identify and analyse the texts in the corpus. As a human reader, it was un...
Researchers working on historical Arabic texts have long known about transmission practices that resulted in considerable differences between what were osten...