lunduniversity.lu.se

Computer Science

Faculty of Engineering, LTH

EDAN20 – Language Technology

This page is provisional and is being constantly updated

The first lecture will be held on August 28, 2023, 13-15.

The course has three pages:

  1. The official course page, https://cs.lth.se/edan20/;
  2. Canvas, used mainly to hand in assignments and manage communication with the participants;
  3. GitHub, where I store the lab descriptions, https://github.com/pnugues/edan20, and the programs used in the course, https://github.com/pnugues/ilppp.

The complete schedule for August/September/October 2023 is available here: https://cloud.timeedit.net/lu/web/lth1/ri1Q5006.html. Use the course schedule or lässchema links and enter EDAN20 to view the course schedule.

Course Delivery

This year, all lectures and labs will be held in person. We will normally not use Zoom.

Course Registration

To be sure of keeping your seat, you need to attend the first lecture or email the course leader before the course starts. You should also formally register in the LADOK system, preferably before the course starts. Instructions are available here in Swedish and in English. If you fail to register, the department might register you anyway, provided that you attended the first lecture and meet all the requirements.

Lab Registration

You have to register for a laboratory group. Please do it here: https://sam.cs.lth.se/LabsSelectSession?occasionId=821

We will also use Discord for the lab sessions.

Objectives

The course introduces theories and techniques of natural language processing and language technology. It attempts to cover the whole field from speech recognition and synthesis to semantics and dialogue.

It focuses on industrial or laboratory applications, such as document retrieval on the Internet, information extraction, conversational agents, and verbal interaction in virtual worlds. Fundamental algorithms will be described using Python.

Course contents

  • An overview of language processing: presentation of language processing, applications, disciplines of linguistics, examples
  • Corpus and word processing: regular expressions, automata, an introduction to Python, concordances, tokenization, counting words, collocations
  • Morphology and part-of-speech tagging: morphology, transducers, part-of-speech tagging
  • Prolog to write phrase-structure grammars: constituents, trees, using Prolog to do natural language analysis, DCG rules, variables, getting the syntactic structure, compositional analysis to get the semantic structure.
  • Syntactic formalisms: constituency and dependency, chart parsing, statistical parsing, functions, dependency parsing.
  • Semantics: formal semantics, lambda-calculus, compositionality: nouns, verbs, determiners, words and meaning, lexical semantics, case grammars, semantic grammars
  • Discourse and dialogue: discourse and rhetoric, anaphora, structure, RST, dialogue: automata, pairs, speech acts, multimodality.
  • Overview of speech synthesis and speech recognition

Textbook

As a textbook, I will use: Language Processing with Perl and Prolog, 2nd edition, 2014, Springer. It is available from Springer Link: [html] [pdf], or in a paper version [html].

I started to write a 3rd edition with Python instead of Perl. Unfortunately, on August 15, 2016, I had a work accident at LTH: workers demolished the window of my office while I was working, without warning me. Since then, I have had a very debilitating tinnitus (ringing in the ears). This new edition will be considerably delayed (if I can ever publish it). I will nonetheless hand out a draft of the chapters I have written.

Students can also use the first edition from 2006, An Introduction to Language Processing with Perl and Prolog. The electronic version is available for free: [pdf]. You need to be logged in with a Lund University account to get a free copy. The paper version of the first edition costs 25 euros: [html].

Offered: Study period HT1 (autumn term, period 1)

Contact person: Pierre Nugues

Prerequisites: See the course syllabus.

Note: The course is given in English.

Course website: http://cs.lth.se/edan20

Facts about the course

EDAN20: Language Technology

Higher education credits: 7.5

Grading scale: TH (U, 3, 4, 5)

Level: A

Language of instruction: The course might be given in English

Course coordinator: Pierre Nugues

E-mail: Pierre.Nugues@cs.lth.se

Prerequisites: EDAA01 Programming - Second Course or EDA027 Algorithms and Data Structures

Admission specifics:

Assessment: Compulsory course items: six assignments. Optional examination.

Home page: cs.lth.se/edan20

Further information/Transitional rules: Limited number of participants