What's New

Using the many to spot the few

At present, the OpenITI / KITAB corpus comprises 10,243 text files, 6,268 of which are unique titles. Such a large and growing number of texts makes quality control challenging. But at the same time, it is precisely this large number of texts that can be the basis...

read more

A Token Frequency Counter For OpenITI Texts

One of the participants in our KITAB user group asked for an easy way to find out which words are used most frequently in a text. Quite a few online tools allow you to upload a file (or provide a link to a web page) and will produce a nice table...

read more
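
For readers who want a feel for what such a counter does before opening the full post, here is a minimal sketch in Python. It is not the tool described in the post: the function name token_frequencies, the top_n parameter, and the sample sentence are all made up for illustration, and the tokenizer is deliberately naive (it treats any run of word characters as a token and does not strip OpenITI markup or normalize spelling).

```python
import re
from collections import Counter

# Assumption: a "token" is any unbroken run of word characters.
# In Python 3, \w matches Unicode letters, so Arabic-script words are covered.
TOKEN_RE = re.compile(r"\w+")


def token_frequencies(text, top_n=10):
    """Return the top_n most frequent tokens as (token, count) pairs."""
    tokens = TOKEN_RE.findall(text)
    return Counter(tokens).most_common(top_n)


# Hypothetical sample text, not taken from the corpus.
sample = "قال حدثنا قال أخبرنا قال حدثنا"

for token, count in token_frequencies(sample, top_n=3):
    print(token, count)
```

To run it on a real text file, you could read the file into a string first (for example with pathlib.Path(...).read_text(encoding="utf-8")) and pass that string to the function.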