SNAM 2013 Abstracts

Short Papers
Paper Nr: 2

Every Character Counts - A Character based Approach to Determine Political Sentiment on Twitter


Anastasios Dimas, Panagiotis Kokkinos and Emmanouel Varvarigos

Abstract: The rising popularity of social networking platforms has transformed them into a valuable source of information. Sentiments and opinions expressed in users' posts can be extracted and applied for various purposes. Determining the political preference (e.g., Republican or Democrat) of a user can be useful, for example, in conducting opinion polls, especially around election time. In this work, we apply two different methodologies for sentiment analysis on Twitter posts and demonstrate the superiority of character-based approaches over word-based ones in determining political sentiment.
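The contrast the abstract draws between word-based and character-based features can be illustrated with a minimal sketch. The function names and the example tweet below are hypothetical, not taken from the paper; character n-grams are one common way such a character-based representation is built, and they tend to be robust to the misspellings and hashtag mashups typical of tweets.

```python
# Hypothetical sketch of the two feature views compared in the abstract:
# word tokens vs. overlapping character n-grams extracted from a tweet.

def word_features(text):
    """Split a tweet into lowercase word tokens."""
    return text.lower().split()

def char_ngrams(text, n=3):
    """Extract overlapping character n-grams (spaces included)."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

tweet = "Vote GOP"
print(word_features(tweet))  # ['vote', 'gop']
print(char_ngrams(tweet))    # ['vot', 'ote', 'te ', 'e g', ' go', 'gop']
```

Either feature list could then be fed to a standard classifier; the paper's finding is that the character-level view works better for political sentiment.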

Paper Nr: 6

A Generic Open World Named Entity Disambiguation Approach for Tweets


Mena B. Morgan and Maurice van Keulen

Abstract: Social media is a rich source of information. To make use of this information, it is sometimes necessary to extract and disambiguate named entities. In this paper, we focus on named entity disambiguation (NED) in Twitter messages. NED in tweets is challenging in two ways. First, the limited length of a tweet makes it hard to gather enough context, on which many disambiguation techniques depend. Second, many named entities mentioned in tweets do not exist in a knowledge base (KB). We combine ideas from information retrieval (IR) and NED to address both challenges. For the first problem, we exploit the gregarious nature of tweets to gather the context needed for disambiguation. For the second, we look for an alternative home page when no Wikipedia page represents the entity. Given a mention, we obtain a list of Wikipedia candidates from the YAGO KB, along with top-ranked pages from the Google search engine. We use a Support Vector Machine (SVM) to rank the candidate pages and find the best representative entity. Experiments conducted on two data sets show better disambiguation results compared with the baselines and a competitor.
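The candidate-ranking step the abstract describes can be sketched as follows. The feature functions, weights, and example candidates here are hypothetical; the paper trains an SVM, whereas this self-contained stand-in scores candidates with a fixed linear model over the same kind of signals (mention/title overlap, KB membership).

```python
# Hypothetical sketch of ranking disambiguation candidates for a mention.
# Candidates come from a KB (e.g., YAGO/Wikipedia) or web search; the
# paper ranks them with a trained SVM, replaced here by a fixed linear
# scorer so the sketch stays self-contained.

def features(mention, candidate):
    """Toy features: word overlap between mention and candidate title,
    and whether the candidate originates from the knowledge base."""
    m_words = set(mention.lower().split())
    t_words = set(candidate["title"].lower().split())
    overlap = len(m_words & t_words) / max(len(m_words), 1)
    return [overlap, 1.0 if candidate["source"] == "KB" else 0.0]

def rank(mention, candidates, weights=(1.0, 0.5)):
    """Score each candidate linearly (SVM stand-in); return best-first."""
    scored = [(sum(w * f for w, f in zip(weights, features(mention, c))), c)
              for c in candidates]
    return [c for _, c in sorted(scored, key=lambda t: -t[0])]

candidates = [
    {"title": "Paris, Texas", "source": "web"},
    {"title": "Paris", "source": "KB"},
]
print(rank("Paris", candidates)[0]["title"])  # Paris
```

The top-ranked page is taken as the representative entity; when no KB candidate scores well, a web page can serve as the alternative home page the abstract mentions.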