Computer Science

Faculty of Engineering, LTH

EDAN20 – Language Technology

This page is provisional and is being constantly updated.

This year, we will have no physical lecture classes or labs. We will be using Zoom for the whole course.

The schedule for August/September/October 2020 is available here: https://cloud.timeedit.net/lu/web/lth1/ri1Q5006.html. Use the course schedule or lässchema links and enter EDAN20 to view the course schedule.

Registration
To be sure of keeping your seat, you need to attend the first lecture (or email the course leader before the course starts). You should also formally register in the LADOK system, preferably before the course starts. Instructions are available here in Swedish and in English. If you fail to register, the department might still register you, provided you attended the first lecture.

The first lecture will be held on August 31, 2020, 15-17. We will use Zoom. Please install it before the lecture starts.

The course takes place in the HT1 “study period.”

You can register for a laboratory group here: https://sam.cs.lth.se/LabsSelectSession?occasionId=656

Objectives

The course introduces theories and techniques of natural language processing and language technology. It attempts to cover the whole field from speech recognition and synthesis to semantics and dialogue.

It focuses on industrial or laboratory applications, such as document retrieval on the Internet, information extraction, conversational agents, and verbal interaction in virtual worlds. Fundamental algorithms will be described using Python.

Course contents

  • An overview of language processing: presentation of language processing, applications, disciplines of linguistics, examples
  • Corpus and word processing: regular expressions, automata, an introduction to Python, concordances, tokenization, counting words, collocations
  • Morphology and part-of-speech tagging: morphology, transducers, part-of-speech tagging
  • Prolog to write phrase-structure grammars: constituents, trees, using Prolog to do natural language analysis, DCG rules, variables, getting the syntactic structure, compositional analysis to get the semantic structure.
  • Syntactic formalisms: constituency and dependency, chart parsing, statistical parsing, functions, dependency parsing.
  • Semantics: formal semantics, lambda-calculus, compositionality: nouns, verbs, determiners, words and meaning, lexical semantics, case grammars, semantic grammars
  • Discourse and dialogue: discourse and rhetoric, anaphora, structure, RST, dialogue: automata, pairs, speech acts, multimodality.
  • Overview of speech synthesis and speech recognition
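
To give a flavor of the corpus and word processing topics listed above (regular expressions, tokenization, counting words, concordances), here is a minimal Python sketch. It is only an illustration and not taken from the course material: the function names `tokenize`, `count_words`, and `concordance`, the `width` parameter, and the sample sentence are all assumptions made for this example, and the tokenizer is deliberately naive.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens using a regular expression."""
    # \w+ matches maximal runs of letters, digits, and underscores
    return re.findall(r"\w+", text.lower())


def count_words(tokens):
    """Map each distinct token to its number of occurrences."""
    return Counter(tokens)


def concordance(tokens, keyword, width=2):
    """Return each occurrence of keyword with up to `width` context tokens per side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [token] + right))
    return lines


# Toy text invented for this example (hypothetical data)
text = "The cat sat on the mat. The dog saw the cat."
tokens = tokenize(text)
frequencies = count_words(tokens)

print(frequencies.most_common(2))
for line in concordance(tokens, "cat"):
    print(line)
```

A real tokenizer would also need to handle punctuation, clitics, and abbreviations, which a single `\w+` pattern ignores.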

Textbook

As the textbook, I will use: Language Processing with Perl and Prolog, 2nd edition, 2014, Springer. It is available from SpringerLink: [html] [pdf], or in a paper version [html].

I started to write a 3rd edition with Python instead of Perl. Unfortunately, on August 15, 2016, I had a work accident at LTH: workers demolished the window of my office while I was working and without warning me. Since then, I have had a very debilitating tinnitus (ringing in the ears). This new edition will be considerably delayed (if I can ever publish it). I will nonetheless hand out a draft of the chapters I have written.

Students can also use the first edition from 2006, An Introduction to Language Processing with Perl and Prolog. The electronic version is available for free: [pdf]. You need to be logged in with a Lund University account to get a free copy. The paper version of the first edition costs 25 euros: [html].

Given: Study period HT1

Contact person: Pierre Nugues

Prerequisites: See the course syllabus.

Note: The course is given in English

Course website: http://cs.lth.se/edan20

Facts about the course

EDAN20: Language Technology

Higher education credits: 7.5

Grading scale: TH — (U, 3, 4, 5)

Level: A

Language of instruction: The course might be given in English

Course coordinator: Pierre Nugues

E-mail: Pierre.Nugues@cs.lth.se

Prerequisites: EDAA01 Programming — Second Course or EDA027 Algorithms and Data Structures

Admission specifics: The number of participants is limited to 58

Assessment: Compulsory course items: Six assignments. Optional examination.

Home page: cs.lth.se/edan20

Further information/Transitional rules: Limited number of participants