ODIN stands for the Online Database of Interlinear Text. It is a collection of interlinear glossed text (IGT) instances extracted from linguistic documents on the Web.
As of version 2.0, ODIN is distributed in the Xigt format as well as plain text, under the Creative Commons CC-BY 4.0 license. Version 2.1 adds data enriched by the INTENT project, along with numerous improvements to the cleaning and normalization of the original data. A flowchart describing how INTENT enriches IGTs is available here.
Version | Date | Description | | |
---|---|---|---|---|
v2.1 | 2016-03-14 | IGT instances in the plain text format and in the Xigt format, as well as Xigt data enriched by INTENT. Contains 158,007 IGT instances from 2,027 documents covering 1,496 languages. | Download | Changelog, Readme |
v2.0 | 2014-07-05 | IGT instances in the plain text format and the Xigt format. Contains 158,007 IGT instances from 2,027 documents covering 1,496 languages. | Download | Changelog |
v1.0 | | First release. A GUI search interface is hosted by The LINGUIST List website. | View | |
If you make use of ODIN in your research, please cite the following papers:
The following Icelandic [isl] example is from:
Sigurðsson, Halldór Ármann. "The Icelandic Noun Phrase: Central Traits." Arkiv för nordisk filologi 121 (2006): 193-236. [pdf]
The example has been converted into the Xigt format and enriched by INTENT. Not all annotations are shown; the original XML file is here. The example is visualized with the XigtViz IGT renderer. Interlinear annotations are shown in columns, and all annotations can be seen by hovering your mouse cursor over an item. The immediate target of annotation has a blue border, while ancestors are lightly shaded.
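The relationship between an annotation, its immediate target, and its ancestors can be sketched in a few lines of Python. This is a minimal illustration, not the XigtViz implementation. The toy document below is hand-written for this purpose and is not taken from ODIN. The sketch assumes that each item refers to at most one target through an `alignment`, `segmentation`, or `content` attribute, whereas real Xigt references may name several targets. The helper names `index_items`, `target_id`, and `ancestors` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy Xigt-style document, hand-written for illustration (NOT ODIN data).
TOY_XIGT = """
<xigt-corpus>
  <igt id="i1">
    <tier id="w" type="words">
      <item id="w1">perros</item>
    </tier>
    <tier id="m" type="morphemes" segmentation="w">
      <item id="m1" segmentation="w1[0:5]"/>
      <item id="m2" segmentation="w1[5:6]"/>
    </tier>
    <tier id="g" type="glosses" alignment="m">
      <item id="g1" alignment="m1">dog</item>
      <item id="g2" alignment="m2">PL</item>
    </tier>
  </igt>
</xigt-corpus>
"""

# Attributes through which an item may point at another item.
REF_ATTRS = ("alignment", "segmentation", "content")


def index_items(root):
    """Map every item id in the document to its XML element."""
    return {item.get("id"): item for item in root.iter("item")}


def target_id(ref):
    """Drop an optional character span such as "[0:5]" from a reference."""
    return ref.split("[", 1)[0]


def ancestors(items, item_id):
    """Return the chain of targets for an item.

    The first entry is the immediate target; later entries are its
    ancestors. Assumes single-target references (a simplification).
    """
    chain = []
    current = items[item_id]
    while True:
        ref = next((current.get(a) for a in REF_ATTRS if current.get(a)), None)
        if ref is None:
            break
        tid = target_id(ref)
        chain.append(tid)
        current = items[tid]
    return chain


root = ET.fromstring(TOY_XIGT.strip())
items = index_items(root)
print(ancestors(items, "g2"))
```

In this toy document the gloss `g2` points at the morpheme `m2`, which in turn points at the word `w1`. In the visualization described above, `m2` would be the bordered immediate target and `w1` a shaded ancestor.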
Work on ODIN (and related projects) has been funded in part by the following grants: