
Computer Science

Faculty of Engineering, LTH


Projects 2014

EDAN70: Project in Computer Science

Projects in language technology 2014

Meris Bahtijaragic, Tim Dolck, and Julian Kroné,
Minerva Question-answering system [pdf]
Abstract: This paper describes and evaluates a question answering system comprising passage retrieval and answer extraction. The system is designed for Swedish questions and answers and uses the Swedish Wikipedia as its data source. It is trained and tested on transcribed questions from the Swedish board game Kvitt eller dubbelt. The main tools used in developing the system were Apache Lucene, Stagger, and Liblinear.
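
The passage-retrieval step can be illustrated with a minimal sketch. The project used Apache Lucene; the TF-IDF index below (built with scikit-learn) is only a stand-in for the same idea, and the passages and question are invented for illustration.

# Minimal passage-retrieval sketch: rank Wikipedia passages by TF-IDF
# cosine similarity to the question. The real system used Apache Lucene;
# the passages and question below are made-up examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

passages = [
    "Selma Lagerlöf var en svensk författare som fick Nobelpriset 1909.",
    "Stockholm är Sveriges huvudstad och största stad.",
    "August Strindberg skrev dramat Fröken Julie.",
]
question = "Vilken svensk författare fick Nobelpriset 1909?"

vectorizer = TfidfVectorizer()
passage_matrix = vectorizer.fit_transform(passages)
question_vector = vectorizer.transform([question])

# Score every passage against the question and list the best candidates.
scores = cosine_similarity(question_vector, passage_matrix)[0]
for score, passage in sorted(zip(scores, passages), reverse=True):
    print(f"{score:.3f}  {passage}")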

Firas Dib and Simon Lindholm,
Career Timelines Extracted from Wikipedia [pdf]
Abstract: We have created a system that, given a person's Wikipedia article, creates a timeline of their work experience. We do this by connecting persons to professions through analysis of part-of-speech and dependency relations. We did not achieve any worthwhile results when parsing dates, which is why dates are not taken into account when calculating recall, precision, or F-score. The process has a recall of 74% and a precision of 95%, resulting in an F-score of 83%.
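
A hedged sketch of the person-to-profession step: the abstract only states that part-of-speech and dependency relations are used, so the copula pattern below, implemented with spaCy, is an assumption about the kind of rule involved rather than the authors' actual pipeline, and the sentence is an invented example.

# Sketch: link a person to professions via dependency relations.
# Assumes spaCy with the English model en_core_web_sm installed;
# the sentence is illustrative, not project data.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Marie Curie was a physicist and chemist who conducted "
          "pioneering research on radioactivity.")

for token in doc:
    # Pattern: "<person> was a <profession>" -> the profession is the
    # attribute of the copula "be", and the person is its nominal subject.
    if token.dep_ == "attr" and token.head.lemma_ == "be":
        subjects = [c for c in token.head.children if c.dep_ == "nsubj"]
        professions = [token.text] + [c.text for c in token.conjuncts]
        for subj in subjects:
            # Use the whole subtree so multi-word names stay together.
            person = " ".join(t.text for t in subj.subtree)
            print(person, "->", professions)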

Mattias Eklund, Christian Frid, and Adam Wamai Egesa,
Reranking using Supervised Learning in a Question Answering System [pdf]
Abstract: This paper presents the results and findings of a project in Language Technology conducted by the authors. The project consisted of implementing a question answering system for Swedish, using the Swedish version of Wikipedia as the knowledge base. The system reranks results from a traditional passage retrieval of the top nouns using supervised learning. Testing showed that 69% of the questions could be answered, with reranking noticeably improving the average rank of the correct answer and improving the median rank from 43 to 34.
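
The reranking idea can be sketched as follows: candidate nouns from passage retrieval get a feature vector each, a linear classifier is trained on questions with known answers, and candidates are re-sorted by the classifier's score. The feature set and values below are invented for illustration, and scikit-learn's LogisticRegression stands in for the Liblinear-style model mentioned in the abstract.

# Sketch of supervised reranking of candidate answers.
# Features per candidate (all invented): [retrieval score,
# frequency among top passages, matches expected answer type].
import numpy as np
from sklearn.linear_model import LogisticRegression

X_train = np.array([
    [0.92, 5, 1],
    [0.40, 1, 0],
    [0.75, 3, 1],
    [0.30, 2, 0],
])
y_train = np.array([1, 0, 1, 0])  # 1 = candidate was the correct answer

reranker = LogisticRegression().fit(X_train, y_train)

# At question time: rescore retrieved candidates and sort by probability.
candidates = ["Strindberg", "Stockholm", "Lagerlöf"]
X_test = np.array([[0.80, 4, 1], [0.85, 2, 0], [0.70, 6, 1]])
probabilities = reranker.predict_proba(X_test)[:, 1]
for prob, candidate in sorted(zip(probabilities, candidates), reverse=True):
    print(f"{prob:.2f}  {candidate}")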

Magnus Norrby,
Extraction of lethal events [pdf]
Abstract: This article describes techniques for extracting information about persons from Wikipedia. The information searched for includes the person's cause of death, origin, and profession. The extraction is done using part-of-speech tagging and dependency parsing. Data is also extracted from Wikidata using a JSON parser and combined with the Wikipedia data. The resulting data is then inserted into a MySQL database for further analysis.
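
The Wikidata part of the pipeline can be sketched briefly: fetch an entity's JSON, read out item-valued claims such as cause of death (P509) and occupation (P106), and store them in a table. The entity id Q23 (George Washington) is only an example, and sqlite3 is used here as a self-contained stand-in for the MySQL database mentioned in the abstract.

# Sketch: extract cause of death and occupation from Wikidata JSON
# and insert them into a relational table.
import json
import sqlite3
import urllib.request

entity_id = "Q23"  # example entity, not project data
url = f"https://www.wikidata.org/wiki/Special:EntityData/{entity_id}.json"
with urllib.request.urlopen(url) as response:
    data = json.load(response)

claims = data["entities"][entity_id]["claims"]

def claim_item(prop):
    """Return the item id of the first claim for a property, or None."""
    try:
        return claims[prop][0]["mainsnak"]["datavalue"]["value"]["id"]
    except (KeyError, IndexError):
        return None

cause_of_death = claim_item("P509")  # cause of death
occupation = claim_item("P106")      # occupation

conn = sqlite3.connect("persons.db")  # stand-in for the MySQL database
conn.execute("CREATE TABLE IF NOT EXISTS person "
             "(id TEXT, cause_of_death TEXT, occupation TEXT)")
conn.execute("INSERT INTO person VALUES (?, ?, ?)",
             (entity_id, cause_of_death, occupation))
conn.commit()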

Alexander Wallin,
Sentiment analysis of Amazon reviews and perception of product features [pdf]
Abstract: We investigate sentiment analysis models trained on Amazon reviews and their application to reviews from other sources, using a bag-of-words model with weights calculated by logistic regression. We examine different methods for adjusting unbalanced datasets, as well as the qualitative performance of different features, such as unigrams and bigrams, when applied to reviews from different sources. We also present a method for adjusting entity weights when making quantitative presentations of the polarity of nouns. We found that bigrams used in conjunction with unigrams had higher precision when applied cross-domain than models with other combinations of features, even though the models had comparable performance in-domain. We also found that polarity deduction is an important tool for sentiment analysis, even though the method described in this paper performed poorly.
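
A minimal sketch of the described setup: unigram and bigram counts fed to logistic regression, with class weighting as one possible way to handle an unbalanced dataset. The reviews, labels, and the choice of class_weight="balanced" are illustrative assumptions, not the authors' exact configuration.

# Bag-of-words sentiment sketch: unigrams + bigrams, logistic regression.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "Great battery life, works exactly as described.",
    "Terrible build quality, broke after a week.",
    "Absolutely love it, would buy again.",
    "Not worth the money, very disappointing.",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),        # unigram and bigram features
    LogisticRegression(class_weight="balanced"),  # one way to handle imbalance
)
model.fit(reviews, labels)

# Apply the in-domain model to a review from another source (cross-domain).
print(model.predict(["The screen is sharp but the battery is awful."]))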