Lunds Tekniska Högskola

CS MSc Day January 15 Schedule!


Three MSc theses to be presented on January 15, 2016.

January 15 is the day for the 12th coordinated master thesis presentations in Computer Science at Lund University, Faculty of Engineering. Three MSc theses will be presented.

The presentations will take place in E-house, rooms E:1124, E:2405. A preliminary schedule follows.

Note to potential opponents: Register as an opponent for the presentation of your choice by sending an email to the examiner for that presentation. Do not forget to specify which presentation you are registering for! The number of opponents may be limited (often to two), so you might have to choose another presentation if you register too late. Registrations are individual, just as the oppositions are! More instructions are found on this page.

E:1124 (changed from E:2405)


PRESENTER: Anton Persson
TITLE: The Shortcut Index
EXAMINER: Flavius Gruian (substituting for Per Andersson)
SUPERVISORS: Krzysztof Kuchcinski (LTH), Johan Svensson (Neo Technology)
ABSTRACT: With a novel path index design, called the Shortcut Index, we partially solve the problem of executing traversal queries on dense neighborhoods in a graph database. We implement our design on top of the graph database Neo4j but it could be used for any graph database that uses the labeled property graph model.

By using a B+ tree, the Shortcut Index can achieve what we call neighborhood locality and range locality of paths. This means that data that belongs to the same part of the graph is located in the same space on disk. We empirically evaluate how this affects performance in terms of response time. Our experiments show that the response time of the index scales very well with neighborhood density and percent of neighborhood interest when compared to Neo4j without the index.

We conclude that the Shortcut Index improves response time at a reasonable cost, especially in dense neighborhoods. 



PRESENTERS: Lisa Stenström, Olof Wahlgren
TITLE: Discovering and Inducing Rules to Categorize Sales Personnel
EXAMINER: Jacek Malec
SUPERVISOR: Pierre Nugues
ABSTRACT: In sales, it is presumed that the behavior of sales personnel differs depending on what part of sales they are in. However, to the best of our knowledge, there are no studies about conducting a segmentation of sales personnel based on behavioral data from Salesforce, the world’s largest CRM platform. Previous research describes how to segment different customers based on their behavioral data, but no one has yet attempted to segment sales personnel.

In this thesis, we extracted Salesforce behavioral data about sales staff and clustered them into previously unknown segments. Using a mixture of supervised and unsupervised learning we created six profiles that describe how different sales personnel work in Salesforce. Our findings helped the company Brisk to improve their knowledge about sales personnel with a usefulness percentage of 73%.


PRESENTER: Adam Wamai Egesa
TITLE: Analysis of Bank Transactions using Machine Learning
EXAMINER: Jacek Malec (LTH)
SUPERVISORS: Pierre Nugues (LTH), Marcus Klang (LTH)
ABSTRACT: This thesis extends a system that computes socio-ecological impact from categorized consumption so that it also works for uncategorized transactions. That is, the extension enables users to upload and persist transactions to a document database and then compute the socio-ecological impact from those persisted transactions. The extension further includes visualizations in the system's web GUI using AngularJS and an extension of the system's RESTful NodeJS API.

To categorize transactions and thus be able to compute socio-ecological impact using the core system, a categorization service was created. The service was connected through a RabbitMQ message queue and trained supervised machine learning models using Apache Spark's machine learning library (MLlib) on a dataset containing about 1.4 million categorized transactions. This achieved a categorization accuracy of 80%.

The main focus for future work is to increase accuracy by using named-entity recognition and splitting up the categorization into two steps using multiple types of categorizers.