Bonnie Dorr

Bonnie J. Dorr is a professor emeritus of computer science.
She was formerly associate dean of the College of Computer, Mathematical, and Natural Sciences. With Amy Weinberg and Louiqa Raschid, who also holds an appointment in UMIACS, she co-founded the Computational Linguistics and Information Processing Laboratory, where she served as co-director for 15 years.
Dorr was also principal scientist for two years at the Johns Hopkins University Human Language Technology Center of Excellence. Her research spans several areas of broad-scale multilingual processing, including machine translation, summarization, and cross-language information retrieval. In 2011, she began service as program manager at DARPA, where she is responsible for programs in the area of human language technology, while continuing to mentor students and postdoctoral researchers on projects in computer science and UMIACS.
Dorr is a Sloan Fellow, a National Science Foundation Presidential Faculty (PECASE) Fellow, and a former President of the Association for Computational Linguistics (2008).
She holds a B.S. in computer science from Boston University, and an S.M. and a doctorate in computer science from the Massachusetts Institute of Technology.
Publications
2012
2012. Use of Modality and Negation in Semantically-Informed Syntactic MT. Computational Linguistics. :1-48.
2011
2011. Evaluating visual and statistical exploration of scientific literature networks. 2011 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). :217-224.
2011. Machine Translation Evaluation and Optimization. Handbook of Natural Language Processing and Machine Translation. :745-843.
2011. Rapid understanding of scientific paper collections: integrating statistics, text analysis, and visualization. University of Maryland, Human-Computer Interaction Lab Tech Report HCIL-2011.
2010
2010. Interlingual Annotation of Parallel Text Corpora: A New Framework for Annotation and Evaluation. Natural Language Engineering. 16(03):197-243.
2010. iOpener Workbench: Tools for rapid understanding of scientific literature. Human-Computer Interaction Lab 27th Annual Symposium, University of Maryland, College Park, MD.
2010. A modality lexicon and its use in automatic tagging. Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC’10). :1402-1407.
2010. Putting the user in the loop: interactive Maximal Marginal Relevance for query-focused summarization. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. :305-308.
2010. Generating Phrasal and Sentential Paraphrases: A Survey of Data-Driven Methods. Computational Linguistics. 36(3):341-387.
2009
2009. A cost-effective lexical acquisition process for large-scale thesaurus translation. Language resources and evaluation. 43(1):27-40.
2009. Symbolic-to-statistical hybridization: extending generation-heavy machine translation. Machine Translation. 23(1):23-63.
2009. TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate. Machine Translation. 23(2):117-127.
2009. Using citations to generate surveys of scientific paradigms. Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. :584-592.
2009. Semantically informed machine translation (SIMT). SCALE summer workshop final report, Human Language Technology Center Of Excellence.
2009. Fluency, adequacy, or HTER?: exploring different human judgments with a tunable MT metric. Proceedings of the Fourth Workshop on Statistical Machine Translation. :259-268.
2009. Cross-document coreference resolution: A key technology for learning by reading. AAAI Spring Symposium on Learning by Reading and Learning to Read.
2009. Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2. :599-608.
2009. Generating surveys of scientific paradigms. Proceedings of HLT-NAACL.
2009. Interlingual annotation of multilingual text corpora and FrameNet. Multilingual FrameNets in Computational Lexicography. 200:287-318.
2008
2008. Multiple alternative sentence compressions and word-pair antonymy for automatic text summarization and recognizing textual entailment. Proceedings of the Text Analysis Conference (TAC-2008), Gaithersburg, MD.
2008. TERp system description. MetricsMATR workshop at AMTA.
2008. The ACL Anthology reference corpus: A reference dataset for bibliographic research in computational linguistics. Proc. of the 6th International Conference on Language Resources and Evaluation (LREC’08). :1755-1759.
2008. Combining open-source with research to re-engineer a hands-on introductory NLP course. Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics. :71-79.
2008. Applying automatically generated semantic knowledge: A case study in machine translation. NSF Symposium on Semantic Knowledge Discovery, Organization and Use.
2008. Language and translation model adaptation using comparable corpora. Proceedings of the Conference on Empirical Methods in Natural Language Processing. :857-866.
2008. Single-document and multi-document summarization techniques for email threads using sentence compression. Information Processing & Management. 44(4):1600-1610.
2008. Are multiple reference translations necessary? investigating the value of paraphrased reference translations in parameter optimization. Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, October.
2008. Computing word-pair antonymy. Proceedings of the Conference on Empirical Methods in Natural Language Processing. :982-991.
2007
2007. Multi-candidate reduction: Sentence compression as a tool for document summarization tasks. Information Processing & Management. 43(6):1549-1570.
2007. Multiple alternative sentence compressions for automatic text summarization. Proceedings of DUC.
2007. Task-based evaluation of text summarization using Relevance Prediction. Information Processing & Management. 43(6):1482-1499.
2007. Exploiting aspectual features and connecting words for summarization-inspired temporal-relation extraction. Information Processing & Management. 43(6):1681-1704.
2007. Using paraphrases for parameter tuning in statistical machine translation. Proceedings of the Second Workshop on Statistical Machine Translation. :120-127.
2007. Combining outputs from multiple machine translation systems. Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference. :228-235.
2007. TREC 2007 ciQA Task: University of Maryland. Proceedings of TREC.
2007. Measuring variability in sentence ordering for news summarization. Proceedings of the Eleventh European Workshop on Natural Language Generation. :81-88.
2006
2006. Reranking for Sentence Boundary Detection in Conversational Speech. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006) Proceedings. 1.
2006. Cross-Language Access to Recorded Speech in the MALACH Project. Text, Speech and Dialogue. 2448:197-212.
2006. Challenges in building an Arabic-English GHMT system with SMT components. Proceedings of the 11th Annual Conference of the European Association for Machine Translation (EAMT-2006). :56-65.
2006. A maximum entropy approach to combining word alignments. Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics. :96-103.
2006. Parallel syntactic annotation of multiple languages. Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC2006). Genoa, Italy.
2006. Annotation compatibility working group report. Proceedings of the Workshop on Frontiers in Linguistically Annotated Corpora 2006. :38-53.
2006. PCFGs with syntactic and prosodic indicators of speech repairs. Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. :161-168.
2006. Sentence compression as a component of a multi-document summarization system. Proceedings of the 2006 Document Understanding Workshop, New York.
2006. Sentence Trimming and Selection: Mixing and Matching. DUC 06 Conference Proceedings.
2006. Machine Translation: Interlingual Methods. Encyclopedia of Language & Linguistics (Second Edition). :383-394.
2006. Automatic identification of confusable drug names. Artificial Intelligence in Medicine. 36(1):29-42.
2006. Leveraging recurrent phrase structure in large-scale ontology translation. Proceedings of the 11th Annual Conference of the European Association for Machine Translation.
2006. SParseval: Evaluation metrics for parsing speech. Proc. LREC.
2006. Opinion Analysis in Document Databases. Proc. AAAI Spring Symposium on Computational Approaches to Analyzing Weblogs, Stanford, CA.
2006. A study of translation edit rate with targeted human annotation. Proceedings of Association for Machine Translation in the Americas. :223-231.
2006. Leveraging reusability: cost-effective lexical acquisition for large-scale ontology translation. Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. :945-952.
2006. Going beyond AER: an extensive analysis of word alignments and their impact on MT. Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. :9-16.
2005
2005. NeurAlign: combining word alignments using neural networks. Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. :65-72.
2005. Alignment link projection using transformation-based learning. Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. :185-192.
2005. Iterative translation disambiguation for cross-language information retrieval. Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval. :520-527.
2005. A methodology for extrinsic evaluation of text summarization: Does ROUGE correlate. Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. :1-8.
2005. Finite-State Language resources and Language Processing. Machine Translation. 18:381-382.
2005. Johns Hopkins summer workshop final report on parsing and spoken structural event detection. Johns Hopkins University, Tech. Rep.
2005. Frame semantic enhancement of lexical-semantic resources. Proceedings of the ACL-SIGLEX Workshop on Deep Lexical Acquisition. :57-66.
2005. A sentence-trimming approach to multi-document summarization. Proceedings of DUC2005.
2005. UMD/BBN at MSE2005. Proceedings of the MSE2005 track of the Association for Computational Linguistics Workshop on Intrinsic and Extrinsic Evaluation Measures for MT and/or Summarization, Ann Arbor, MI.