Research Problem:
The high-level goal of this multi-site research project (led by SRI) is to improve machine translation performance on less formal genres such as web forum data, blogs, microblogs, chat, and conversational speech. The UW team, led by Prof. Mari Ostendorf in EE, focuses on translating Mandarin Chinese to English, looking at methods for normalizing informalities and idioms as well as incorporating multi-sentence context to resolve ambiguities. Challenges for an undergraduate researcher may include: analyzing the impact of tokenization strategies on word alignment between languages, discovering error patterns between gold reference English sentences and hypothesis English sentences translated from Chinese, and unsupervised learning of style/formality differences in forum text.
Desired Skills/Background:
1. Familiarity with Linux and shell scripting
2. Familiarity with one or more of the following scripting languages: Perl, Python, and Ruby
3. Experience in automatic text processing preferred
4. Ability to read simplified Chinese is a plus
For more information or to submit a resume, please contact Bin Zhang (binz@u.washington.edu). Bin will meet with all candidates and select the most qualified to talk with Prof. Ostendorf.