GEDI - A Groundtruthing Environment for Document Images

Title: GEDI - A Groundtruthing Environment for Document Images
Publication Type: Conference Papers
Year of Publication: 2010
Authors: Doermann D, Zotkina E, Li H
Conference Name: Ninth IAPR International Workshop on Document Analysis Systems (DAS 2010)
Date Published: 2010

In this paper, we describe GEDI (Groundtruthing Environment for Document Images), a freely available, highly configurable document image annotation tool. Its basic structure involves two types of files: an image file and a corresponding .xml file in GEDI format. When users begin groundtruthing an image, they can configure the interface to allow the creation of different types of zones, each of which may have a custom set of “attributes”. The output is compatible with the UMDDocLib architecture [2], and the tool has been used in numerous funded and unfunded programs to create datasets in multiple languages. GEDI has been developed and released to the community as a comprehensive tool that we hope will ease the burden of document annotation and encourage additional sharing of data.
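
The pairing of an image file with an XML annotation file, in which each zone carries its own set of attributes, can be illustrated with a short sketch. This is only an illustration under stated assumptions: the element names (`Document`, `Page`, `Zone`), the geometry keys (`col`, `row`, `width`, `height`), the sample attribute names, and the file name `page001.tif` are all hypothetical and are not taken from the actual GEDI schema, which should be consulted for real data.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation file for one image; NOT the real GEDI schema.
SAMPLE_XML = """<Document image="page001.tif">
  <Page>
    <Zone type="TextLine" col="120" row="80" width="600" height="40" language="Arabic" />
    <Zone type="Signature" col="400" row="900" width="200" height="90" writer="unknown" />
  </Page>
</Document>"""

# Assumed names for the bounding-box fields of a zone.
GEOMETRY_KEYS = ("col", "row", "width", "height")


def load_zones(xml_text):
    """Return the referenced image name and a list of zone records.

    Each record separates the zone type, its bounding box, and whatever
    custom attributes remain on the element.
    """
    root = ET.fromstring(xml_text)
    zones = []
    for elem in root.iter("Zone"):
        attrs = dict(elem.attrib)
        zone_type = attrs.pop("type")
        box = tuple(int(attrs.pop(key)) for key in GEOMETRY_KEYS)
        # Anything left over is treated as a user-defined attribute.
        zones.append({"type": zone_type, "box": box, "attributes": attrs})
    return root.get("image"), zones


image_name, zones = load_zones(SAMPLE_XML)
for zone in zones:
    print(image_name, zone["type"], zone["box"], zone["attributes"])
```

Keeping the custom attributes in a plain dictionary, rather than fixed fields, mirrors the idea that each zone type may define its own attribute set.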