CS MSc Thesis Presentations 3 June 2024

From: 2024-06-03 14:15 to 16:00 Lecture

Two Computer Science MSc theses to be presented on 3 June

On Monday, 3 June, there will be two master's thesis presentations in Computer Science at Lund University, Faculty of Engineering.

The presentations will take place in E:2116 and E:4130 (Lucas). See information for each presentation.

Note to potential opponents: Register as an opponent to the presentation of your choice by sending an email to the examiner for that presentation (firstname.lastname@cs.lth.se). Do not forget to specify the presentation you register for! Note that the number of opponents may be limited (often to two), so you might be forced to choose another presentation if you register too late. Registrations are individual, just as the oppositions are! More instructions are found on this page.


14:15-15:00 in E:2116 and via Zoom (see link below)

Presenter: Julia Bäcklund
Title: Exploring AI-Assisted Software Development at Scania: The Role of Prompt Engineering and Regulatory Compliance
Examiner: Emma Söderberg
Supervisors: Markus Borg (LTH), Maria Erman (Scania CV AB)

Generative AI is revolutionizing industries worldwide, and no company wants to miss the innovation train. This thesis investigates the potential of AI-assisted development at Scania, focusing on enhancing code generation through prompt engineering and ensuring compliance with the EU AI Act. The study employs a methodology containing three phases: initial exploration of AI integration at Scania, testing of various prompt engineering techniques using an AI-assistant from Azure AI Studio, and an analysis of the broader implications, including regulatory compliance with the EU AI Act. The research identifies opportunities for AI to enhance efficiency, particularly through the use of generative AI in code generation. Among various prompt engineering techniques evaluated, Few-shot, Hybrid, Chain of Thought (CoT), and Least to Most Prompting emerge as the most effective in enhancing the accuracy and utility of generated code. These techniques prove important in optimizing the performance of Azure AI Studio’s AI-assistant across a series of dynamic programming problems, highlighting the potential for tailored AI implementations to meet specific organizational needs while adhering to strict security and privacy standards. Furthermore, the study explores the implications of the EU AI Act, investigating the need for companies to align AI deployments with forthcoming regulations, particularly in high-risk applications such as autonomous vehicles—relevant for Scania’s industry. The findings suggest that while the AI-assistant used for code generation falls outside the direct scope of high-risk AI systems, its implementation must still prioritize transparency, data governance, and user trust to comply with broader regulatory and ethical standards. In conclusion, the thesis demonstrates that prompt engineering can enhance the capability of AI-assistants in software development. 
Future work should continue to refine these techniques and explore their applicability in other areas of AI-assisted development.
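As an illustration of the few-shot technique named in the abstract, here is a minimal sketch of how such a prompt might be assembled. The function name, the prompt layout, and the example tasks are hypothetical; the prompts actually used in the thesis are not given in this announcement.

```python
def build_few_shot_prompt(task, examples,
                          instruction="Write a Python function for the task."):
    """Assemble a few-shot prompt: an instruction, worked examples, then the new task.

    `examples` is a list of (task_description, solution_code) pairs.
    The layout is an illustrative assumption, not the thesis's format.
    """
    parts = [instruction, ""]
    for example_task, example_solution in examples:
        parts += [f"Task: {example_task}", f"Solution:\n{example_solution}", ""]
    # The prompt ends with an open "Solution:" slot for the model to complete.
    parts += [f"Task: {task}", "Solution:"]
    return "\n".join(parts)


if __name__ == "__main__":
    demo_examples = [
        ("Return the n-th Fibonacci number.",
         "def fib(n):\n    a, b = 0, 1\n    for _ in range(n):\n        a, b = b, a + b\n    return a"),
    ]
    print(build_few_shot_prompt("Return the length of the longest common subsequence.",
                                demo_examples))
```

The resulting string would be sent to whichever assistant is being evaluated; that step is left out because it depends on the deployment.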

Link to popular science summary: To be uploaded

Zoom link to presentation: https://lu-se.zoom.us/j/69344716309


15:15-16:00 in E:4130 (Lucas)

Presenters: Lycke Fureby, Filippa Hansen
Title: Domain adaptation of retrieval systems from unlabeled corpuses in a RAG setting Comprehension
Examiner: Elin A. Topp
Supervisors: Patrik Edén (LTH/LU), Svante Sörberg (Modulai AB), Dmitrijs Kass (Modulai AB)

Retrieval-Augmented Generation (RAG) combines Large Language Models (LLMs) with an information retrieval system based on an embedding model, enhancing traceability and accuracy for domain-specific tasks. This integration allows for quick query responses, which is crucial in professional environments. Our research focused on various text-splitting techniques, synthetic data generation, LLM-based data processing including hard negative mining, and multiple fine-tuning methods. Evaluations on both synthetic and human-annotated data aimed to optimize domain adaptation, with additional analysis of the influence of training data size and diversity. The greatest performance increases were observed when fine-tuning with Multiple Negatives Ranking Loss on smaller datasets. Performance continued to improve as the training data grew and became more diverse. Peak performance was achieved using an LLM for mining hard negatives, which expanded the dataset for fine-tuning with Online Contrastive Loss. Overall, the model fine-tuned on synthetic data demonstrated statistically significant performance gains on both synthetic and human-annotated queries and outperformed a larger open-source model.
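For readers unfamiliar with Multiple Negatives Ranking Loss, the idea can be sketched as follows: within a batch of (query, passage) pairs, each query's own passage is the positive and every other passage in the batch acts as a negative. This is a simplified standard-library sketch, not the thesis's implementation; the function name, the cosine similarity, and the `scale` value are illustrative assumptions.

```python
import math


def multiple_negatives_ranking_loss(query_vecs, passage_vecs, scale=20.0):
    """Mean cross-entropy over a batch where passage i is the positive for
    query i and all other passages in the batch serve as negatives."""

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    total = 0.0
    for i, query in enumerate(query_vecs):
        # Scaled similarity of this query to every passage in the batch.
        logits = [scale * cosine(query, passage) for passage in passage_vecs]
        # Numerically stable log-sum-exp.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability assigned to the matching passage.
        total += log_sum_exp - logits[i]
    return total / len(query_vecs)


if __name__ == "__main__":
    queries = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print(multiple_negatives_ranking_loss(queries, aligned))
    print(multiple_negatives_ranking_loss(queries, swapped))
```

The loss is close to zero when each query is most similar to its own passage and grows when another passage in the batch scores higher, which is why no explicitly labeled negatives are needed.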

Link to popular science summary: To be uploaded

 



About the event
From: 2024-06-03 14:15 to 16:00

Location
E:2116 and E:4130 (Lucas)

Contact
birger [dot] swahn [at] cs [dot] lth [dot] se