“Toward Natural Language Inference Across Languages”
Location: LTS Auditorium, 8080 Greenmead Drive
Natural language processing tasks as diverse as automatically extracting information from text, answering questions, and translating or summarizing documents all require the ability to compare and contrast the meaning of words and sentences. State-of-the-art techniques rely on dense vector representations that capture the distributional properties of words in large amounts of text in a single language. We seek to improve these representations to capture not only similarity in meaning between words or sentences, but also inference relations such as entailment and contradiction, and to enable comparisons not only within, but also across languages.
In this talk, we will present novel approaches to inducing word representations from multilingual text corpora. First, we will show that translations in e.g. Chinese can be used as distant supervision to induce English word representations that can be composed into better representations of English sentences (Elgohary and Carpuat, ACL 2016). Then we will show how sparsity constraints can further improve word representations, and enable the detection of not only semantic similarity (do “cure” and “remedy” have the same meaning?), but also entailment (does “antidote” entail “cure”?) between words in different languages (Vyas and Carpuat, NAACL 2016).
Marine Carpuat is an assistant professor of computer science at UMD with an appointment in UMIACS.
Her research interests are in natural language processing, with a focus on multilinguality.
Carpuat was previously a research scientist at the National Research Council of Canada and a postdoctoral researcher at the Columbia University Center for Computational Learning Systems.
She received her doctorate in computer science from the Hong Kong University of Science & Technology (HKUST) in 2008. She also earned an MPhil in electrical engineering from HKUST and an engineering degree from the French Grande Ecole Supélec.