Title: A modality lexicon and its use in automatic tagging
Publication Type: Journal Articles
Year of Publication: 2010
Authors: Baker K, Bloodgood M, Dorr BJ, Filardo NW, Levin L, Piatko C
Journal: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC’10)
Pagination: 1402 - 1407
Date Published: 2010

This paper describes our resource-building results for an eight-week JHU Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically-Informed Machine Translation. Specifically, we describe the
construction of a modality annotation scheme, a modality lexicon, and two automated modality taggers that were built using the lexicon
and annotation scheme. Our annotation scheme is based on identifying three components of modality: a trigger, a target and a holder.
We describe how our modality lexicon was produced semi-automatically, expanding from an initial hand-selected list of modality trigger
words and phrases. The resulting expanded modality lexicon is being made publicly available. We demonstrate that one tagger, a
structure-based tagger, achieves precision of around 86% (depending on genre) when tagging a standard LDC data set. In a machine
translation application, using the structure-based tagger to annotate English modalities on an English-Urdu training corpus improved the
translation quality score for Urdu by 0.3 BLEU points in the face of sparse training data.
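
The three components of the annotation scheme (trigger, target, holder) can be illustrated with a small sketch. This is not the paper's tagger or lexicon: the lexicon entries, the labels, the names `TOY_LEXICON`, `ModalityAnnotation` and `tag_modalities`, and the adjacent-word heuristic for picking the holder and target are all assumptions made for illustration. A structure-based tagger would presumably work from syntactic structure rather than from word positions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon mapping trigger words to modality labels.
# Entries and labels are illustrative, not taken from the published lexicon.
TOY_LEXICON = {
    "must": "requirement",
    "may": "permission",
    "wants": "desire",
}


@dataclass(frozen=True)
class ModalityAnnotation:
    trigger: str           # word that signals the modality
    modality: str          # label looked up in the toy lexicon
    target: Optional[str]  # what the modality applies to
    holder: Optional[str]  # who experiences the modality


def tag_modalities(sentence: str) -> List[ModalityAnnotation]:
    """Tag triggers by lexicon lookup on whitespace-separated tokens.

    Assumption: the holder is the token before the trigger and the
    target is the token after it. Real sentences need a parse.
    """
    tokens = sentence.split()
    annotations = []
    for i, token in enumerate(tokens):
        label = TOY_LEXICON.get(token.lower())
        if label is None:
            continue
        holder = tokens[i - 1] if i > 0 else None
        target = tokens[i + 1] if i + 1 < len(tokens) else None
        annotations.append(ModalityAnnotation(token, label, target, holder))
    return annotations
```

For a sentence such as "Students must register", this sketch marks "must" as the trigger, "register" as the target, and "Students" as the holder. The positional heuristic fails as soon as words intervene between the trigger and its target, which is one reason to prefer a tagger that uses sentence structure.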