Friday, May 13, 2011
3:30 - 5:00 p.m.
Paccar Hall 291
Mona Diab
CCLS, Columbia University
AUTOMATIC PROCESSING OF ARABIC(S)
Spoken by over 300 million people, Arabic is considered one of the languages of significant importance for NLP -- in particular for machine translation and multilingual processing.
In this talk, I will explore the depth and breadth of the challenges that Arabic poses to NLP, due to a dearth of resources, high variability, lack of writing standards for dialects, and insufficient understanding of dialectal phenomena.
Mona Diab is a Research Scientist at the Center for Computational Learning Systems (CCLS) and an Adjunct Associate Professor in the Computer Science Department at Columbia University.
Gina-Anne Levow
Linguistics Department, UW
PREDICTING VERBAL FEEDBACK ACROSS CULTURES IN FACE-TO-FACE CONVERSATION
Fluent conversation involves complex, multi-modal coordination among participants, although most conversants accomplish it with little effort. In cross-cultural settings, this coordination can prove more difficult to achieve. In this talk, I focus on automatic prediction of verbal feedback, one component of this process, across three language/cultural groups: American English, Mexican Spanish, and Iraqi Arabic. I identify key challenges due to language-specific differences, inter-speaker variation, and the relative sparseness and optionality of verbal feedback. Our approach addresses these challenges through a machine learning regime and exploits prosodic features, including pitch, intensity, and duration, to dramatically improve prediction of verbal feedback. Feature analysis identifies both similarities and contrasts across languages.
A reception will follow in the same location.