Rik Koncel-Kedziorski, UW post-doc. SAGE is an algorithm for learning sparse representations of text. In this paper, we propose three advancements for entity linking. First, expanding acronyms can effectively reduce the ambiguity of acronym mentions. We develop a model to identify ideological cues in political text. Hence, language is purposeful and strategic.

Jeju Island, Korea. We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes or parts of attributes. He is a research scientist at DeepMind. The Utility of Text: People write every day (articles, blogs, emails) with a purpose and an audience in mind. The system builds upon the morphological analysis capabilities of MeCab to incorporate finer details of classification such as politeness, tense, mood, voice attributes, and more. Mining user stances or viewpoints from these forums has been a popular research topic. Maarten Sap, UW Ph.

Yanchuan Sim, Noah A. However, most current work does not address an important problem: The full system achieves a competitive F-score 0.

dani yogatama thesis

We introduce a generative model positing latent topics and cross-cutting positions that gives special treatment to person mentions and opinion words. We demonstrate in experiments that our method has promising results on both micro-level and macro-level stance prediction.

CoMeT | Speaker Profile

Jeju Island, Korea. We present a joint probabilistic model of who cites whom in computational linguistics, and also of the words they use to do the citing. A basic Python script that handles compilation of LaTeX and related files. We evaluate our model and its components on two datasets collected from political blogs and sports news, finding that it outperforms a simple agglomerative clustering approach and previous work.


People write every day (articles, blogs, emails) with a purpose and an audience in mind. In our selection strategy, an informative and diverse set of instances is selected for effective disambiguation.

Gregory Diamos – Et In Arcadia Ego

Hao Peng, UW Ph. Yanchuan Sim, Bryan R. Sarah Dreier, UW post-doc. He is a research scientist at DeepMind.

Current research has treated the above as disjoint problems. Furthermore, according to our error analysis, a number of errors are caused by the use of a different Wikipedia version, which hinders our system from showing significantly better performance.

Vancouver, BC, Canada. Online debate forums are important social media for people to voice their opinions and debate with each other.

We introduce a probabilistic model of some of the important aspects of that process: The results show that Barack Obama, John McCain and Mitt Romney did indeed make substantively significant rhetorical shifts away from the ideological extremes after securing their parties’ presidential nominations. Nathan Schneider defended his Ph. The social attributes of an author influence and bias their language production, while authors are motivated to evoke responses from their audiences. Extensive past work in quantitative political science provides a framework for empirically modeling the decisions of justices and how they relate to text.


He is a research scientist at Google. To accomplish this, we infer ideological cues from a corpus of political writings annotated with known ideologies.

hi. i am yanchuan.

Lastly, topic modeling is used to model the semantic topics of the articles. He is an assistant professor at the University of California at Berkeley.

Transductive learning from a few seeds and a collection of mention tokens combines Bayesian inference and conditional estimation. Variants of our model lead to improved predictive accuracy of citations given texts and texts given authors.

Jungo Kasai, UW Ph. We apply a domain-informed Bayesian HMM to infer the proportions of ideologies each candidate uses in each campaign. Smith, Jing Jiang. David Bamman defended his Ph.

LTI PhD Thesis Defense: Dani Yogatama

The I2R-NUS team submitted two results, one with the full system and one with the partial system, for diagnostic purposes. Nikita Haduong, UW Ph. Rik Koncel-Kedziorski, UW post-doc.