CS MSc Thesis Presentation Day March 11

Lecture

Time: 2021-03-11 09:15 to 16:00
Location: Online via Zoom (separate link for each presentation)
Contact: birger [dot] swahn [at] cs [dot] lth [dot] se


Five MSc theses to be presented on Thursday March 11, 2021

Thursday March 11 is a day for coordinated master's thesis presentations in Computer Science at Lund University, Faculty of Engineering. Five MSc theses will be presented.

The presentations will take place online via Zoom; see the separate link for each presentation. A preliminary schedule follows.

Note to potential opponents: Register as an opponent to the presentation of your choice by sending an email to the examiner for that presentation (firstname.lastname@cs.lth.se). Do not forget to specify the presentation you register for! Note that the number of opponents may be limited (often to two), so you might be forced to choose another presentation if you register too late. Registrations are individual, just as the oppositions are! More instructions are found on this page.


09:15-10:00

Presenters: Saam Mirghorbani, Viktor Claesson
Title: Selective GUI Testing Using Machine Learning
Examiner: Ulf Asklund
Supervisors: Emelie Engström (LTH), Linus Lindgren (Axis Communications AB), Magnus Walter (Axis Communications AB)

Regression testing is performed to minimize the risk of changes breaking existing functionality of software. Regression Test Selection (RTS) strategies aim to select only the subset of tests affected by a change. However, in projects with frequent changes it is sometimes necessary to sacrifice some coverage for an even shorter testing time, so that feedback can be received for every change. In this thesis we investigate how machine learning can be applied as an automated RTS strategy for GUI testing. We design and evaluate multiple solutions using historical data comprising 70,000 data points, including 4,000 code changes and 40 unique test cases. We evaluate how effective different factors are at predicting a test failure, and how well the solution scales with the size of the project. We found that our best solution outperforms our best heuristic and can be used to select a trade-off between coverage and time.

Link to presentation: https://lu-se.zoom.us/j/69410781891

Link to popular science summary: To be updated


11:15-12:00

Presenters: Emil Aminy, Marcus Lissner
Title: Exploring the suitability of explainable AI for binary conflation of POI data
Examiner: Jacek Malec
Supervisors: Pierre Nugues (LTH), Frank Camarra (AFRY)

A point of interest, or POI, is a geographic location which someone could find useful or interesting, such as a restaurant or tourist attraction. Information regarding such locations exists in large quantities and can be sourced from numerous data aggregators and vendors. It is, however, not guaranteed that a source provides data that is complete and entirely correct. Collecting and conflating data from multiple locations can improve the accuracy of such information. This is not a trivial task, and solving it well can require complex systems that create an uninterpretable black box of reasoning between input and output. In this thesis, we explore a solution which can conflate POI data from two different sources while still maintaining a transparent decision-making process. To accomplish this, we employ decision trees, a highly explainable machine learning model. Our data consists of POI data gathered from Eniro, Foursquare, Hitta, OpenStreetMap and Tripadvisor. The results indicate that explainable AI models such as decision trees are viable and useful tools for POI data conflation, achieving 60-90% accuracy, depending on which property is being conflated, while maintaining a high level of interpretability.

Link to presentation: https://lu-se.zoom.us/j/66042760139

Link to popular science summary: https://fileadmin.cs.lth.se/cs/Education/Examensarbete/Popsci/210311_11AminyLissner.pdf


13:15-14:00

Presenters: Henric Zethraeus, Philip Horstmann
Title: Evaluation of Active Learning Strategies for Multi-Label Text Classification
Examiner: Jacek Malec
Supervisors: Pierre Nugues (LTH), Michael Truong (Sinch AB)

With increasing data flows from cloud communication services, unlabeled data has become abundant; however, labeled data remains scarce. Active learning is an approach in machine learning where the model decides which samples from a set of unlabeled instances should be labeled. We first investigate a suitable machine learning model for the task, and then evaluate different active learning strategies. Our findings show that a logistic regression model with an active learning strategy combining minimum confidence, averaging, and F1 macro score weighting gave the best overall results, with a fast learning rate and the highest maximum F1 score. This strategy queries the samples with the lowest confidence, averaged over all class labels of each document, along with a weighting for the F1 macro scores of each label.
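
The query strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, per-label probabilities are assumed to come from independent binary classifiers, and the weighting (labels with a lower F1 score count more) is only one possible reading of "F1 macro score weighting".

```python
def label_confidence(p):
    """Confidence of one binary label decision, in [0.5, 1.0]."""
    return max(p, 1.0 - p)


def query_scores(probabilities, label_f1):
    """Weighted average uncertainty per document (higher = less confident)."""
    # Assumption: labels the model handles poorly (low F1) get more weight.
    weights = [1.0 - f1 for f1 in label_f1]
    total = sum(weights)
    if total == 0:
        # All labels have perfect F1: fall back to a plain average.
        weights = [1.0] * len(label_f1)
        total = float(len(label_f1))
    scores = []
    for doc_probs in probabilities:
        uncertainty = [1.0 - label_confidence(p) for p in doc_probs]
        scores.append(sum(w * u for w, u in zip(weights, uncertainty)) / total)
    return scores


def select_queries(probabilities, label_f1, k):
    """Indices of the k documents with the lowest weighted confidence."""
    scores = query_scores(probabilities, label_f1)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Three unlabeled documents, three labels (made-up illustrative numbers).
probabilities = [[0.95, 0.05, 0.90], [0.55, 0.45, 0.60], [0.90, 0.50, 0.10]]
label_f1 = [0.9, 0.4, 0.7]
print(select_queries(probabilities, label_f1, 2))
```

In this toy input the second document is close to the decision boundary on every label, so it is ranked first for annotation.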

Link to presentation: https://lu-se.zoom.us/j/63230190226

Link to popular science summary: To be updated


14:15-15:00 (N.B. Presentation added to schedule later)

Presenter: Mathias Kindberg
Title: Investigating the Applicability of Deep Learning to Profile Ship Risk
Examiner: Elin A. Topp
Supervisors: Pierre Nugues (LTH), Johannes Hüffmeier (RISE), Luis Sánchez-Heres (RISE)

This thesis investigates the applicability of deep learning models to static and dynamic ship data from the maritime industry. Starting with a literature review of current state-of-the-art research, we try to bring the statistical conclusions and correlations found there down to single-ship predictions using random forests and deep learning. The collected dataset comprises all Paris MoU Port State Control protocols from 2016 to 2020, ship data from Clarkson's research database and MRV (Monitoring, Reporting and Verification) EU emission data. With 72% accuracy, we can predict whether the next Port State Control will result in a detention or not. The thesis concludes that we find the same statistical signal with regard to vessel type etc., but have a hard time creating a model which is accurate enough on a per-sample basis. The report finishes by collecting lessons learned from working in this problem space and with this dataset, which can be applied to future improvements and research.

Link to presentation: https://lu-se.zoom.us/j/8715328781

Link to popular science summary: To be updated


15:15-16:00 (N.B. No more places for opposition for this presentation)

Presenter: Oscar Werneman
Title: Predicting bugs to reduce debugging time
Examiner: Per Runeson
Supervisors: Markus Borg (RISE), Daniel Hansson (Verifyter)

A major cost in software development pertains to finding and fixing bugs. Regression test suites of varying size are run as often as possible to capture bugs. With the aid of logs, revision history, test results and planning, is it possible to utilise machine learning to speed up the debugging process? Instead of treating all revisions equally, a bug prediction model could rank revisions according to risk, thereby reducing the number in focus. Results show that although risk-based verification is currently beyond reach, a bug prediction model can be used to reduce the time spent debugging.

Link to presentation: https://us04web.zoom.us/j/72816511855?pwd=bE4wejFrcjNBeHNybVV6MS83ZlM3Zz09

Link to popular science summary: To be updated