582650 Information-Theoretic Modeling (4 cr)
Department of Computer Science, University of Helsinki
An advanced course in the Algorithms and Machine Learning
sub-programme. Students in other programmes, as well as students
of mathematics and statistics, are welcome too.
  
This course provides an introduction to information theory from the
computer science perspective. Much of the course can be viewed as
exploring the consequences of Shannon's central result, the Noiseless
Source Coding Theorem. The theoretical results are illustrated with
practical data compression systems, from Huffman coding to Rissanen's
arithmetic coding. To demonstrate the wide applicability of
information-theoretic concepts in intelligent systems, we also discuss
information-theoretic principles in statistical modeling, especially
the Minimum Description Length (MDL) principle.
  
 
[Photo: Jorma Rissanen, the inventor of the MDL principle (left),
receiving a best paper award from Claude Shannon, the father of
information theory.]
Note: Since the contents of this course overlap with the
earlier
Three Concepts: Information (3CI) course, it is not
possible to include both of them in your degree. This also applies to
the project.
 
Lecturer
- Teemu Roos, A332, teemu.roos at cs.helsinki.fi

Teaching assistant
- Anupam Arohi, anupam.arohi at cs.helsinki.fi
 
Language: All course material and lectures will be in English.

Lectures: Period I: 8.9.–16.10., Tue & Fri 10–12 in C222
(except 25.9. in CK112). All lectures will be recorded on video and
made available the following day.

Exercises: 18.9.–16.10., Fri 12–14 in C222.

Final exam: Tue 20.10., 9–12, in B123.
 
The course is followed by a related project in Period II.  It is
recommended that you take both the course and the project. (More
information will be posted later.)
  
Prerequisites:
- Elementary calculus (limits and convergence, convexity)
- Basic probability (random variables, conditional and joint distributions, expectation)
- Good programming skills (no specific language required)
Contents
The schedule below is tentative and subject to change.

If you don't have a high-speed connection, you may prefer the
downloadable versions of the videos.
- Lecture 1. (Tue 8.9.) Introduction. Participation obligatory.
  • introduction to the course • overview of contents
  slides | video | exercises #1
      
  
- Lecture 2. (Fri 11.9.) Noisy Channel Coding.
  • error correcting codes • noisy channel coding theorem • Hamming codes (see the sketch below)
  slides | video
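
  A taste of the topic, as promised above: a minimal Python sketch of the
  Hamming(7,4) code, which encodes 4 data bits into 7 bits so that any
  single flipped bit can be located and corrected. This is an illustration
  only, not course material; all names are made up.

    def hamming74_encode(d):
        """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
        d1, d2, d3, d4 = d
        p1 = d1 ^ d2 ^ d4                  # parity over positions 1, 3, 5, 7
        p2 = d1 ^ d3 ^ d4                  # parity over positions 2, 3, 6, 7
        p3 = d2 ^ d3 ^ d4                  # parity over positions 4, 5, 6, 7
        return [p1, p2, d1, p3, d2, d3, d4]

    def hamming74_decode(c):
        """Correct at most one flipped bit and return the 4 data bits."""
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        syndrome = s1 + 2 * s2 + 4 * s3    # 1-based error position; 0 = no error
        if syndrome:
            c = list(c)
            c[syndrome - 1] ^= 1
        return [c[2], c[4], c[5], c[6]]

    word = [1, 0, 1, 1]
    code = hamming74_encode(word)
    code[4] ^= 1                           # flip one bit "in transit"
    assert hamming74_decode(code) == word  # the flip is corrected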
      
      
  
- Lecture 3. (Tue 15.9.) Mathematical Preliminaries.
  • calculus • probability • law of large numbers • Jensen's and Gibbs' inequalities (see the sketch below)
  slides | video | exercises #2 | files
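
  The sketch promised above: Gibbs' inequality states that the relative
  entropy D(p||q) = sum_i p_i log2(p_i/q_i) is non-negative, with equality
  if and only if p = q. A small numeric illustration in Python; the two
  distributions are made up for the demo.

    from math import log2

    def kl_divergence(p, q):
        """D(p||q) in bits; terms with p_i = 0 contribute nothing."""
        return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    p = [0.5, 0.25, 0.25]
    q = [0.4, 0.4, 0.2]
    print(kl_divergence(p, q))   # ~0.072 > 0, since p != q
    print(kl_divergence(p, p))   # exactly 0.0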
      
      
  
- Lecture 4. (Fri 18.9.) Source Coding: Theory.
  • entropy and information • asymptotic equipartition property (AEP) • noiseless source coding theorem (see the sketch below)
  slides | video
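
  The sketch promised above: the central quantity of this lecture is the
  entropy H = -sum_x p(x) log2 p(x), which the noiseless source coding
  theorem identifies as the limit on the expected number of bits per
  symbol achievable by any lossless code. A minimal Python illustration;
  the example string is arbitrary.

    from collections import Counter
    from math import log2

    def entropy(probs):
        """H = -sum p log2 p, in bits per symbol."""
        return -sum(p * log2(p) for p in probs if p > 0)

    text = "abracadabra"
    probs = [n / len(text) for n in Counter(text).values()]
    print(entropy(probs))   # ~2.04 bits; a fixed 3-bit code for these
                            # 5 symbols would waste almost a bit per symbol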
      
      
  
- Lecture 5. (Tue 22.9.) Source Coding: Practice.
  • symbol codes • Kraft inequality (see the sketch below) • entropy coding
  slides | video | exercises #3
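
  The sketch promised above: the Kraft inequality says that a binary
  prefix code with codeword lengths l_1, ..., l_m exists if and only if
  sum_i 2^(-l_i) <= 1. A quick Python check; the length lists are
  illustrative.

    def kraft_sum(lengths):
        """Left-hand side of the Kraft inequality for binary codes."""
        return sum(2.0 ** -l for l in lengths)

    print(kraft_sum([1, 2, 3, 3]))   # 1.0  -> a complete prefix code exists,
                                     #         e.g. 0, 10, 110, 111
    print(kraft_sum([1, 1, 2]))      # 1.25 -> no prefix code has these lengths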
      
      
  
- Lecture 6. (Fri 25.9.) Special Guest Lecture: Daniel Schmidt.
  • Minimum Message Length (MML) principle
  slides | video
      
      
  
- Lecture 7. (Tue 29.9.) Source Coding: Practice (contd.).
  • Shannon-Fano code • Huffman code (see the sketch below) • arithmetic coding
  slides | video | exercises #4
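
  The sketch promised above: Huffman's algorithm builds an optimal prefix
  code by repeatedly merging the two least probable nodes. A compact
  Python version for illustration; a production implementation would
  build an explicit code tree.

    import heapq

    def huffman_code(freqs):
        """freqs: dict symbol -> weight. Returns dict symbol -> codeword."""
        heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
        heapq.heapify(heap)
        tiebreak = len(heap)
        while len(heap) > 1:
            w1, _, c1 = heapq.heappop(heap)   # the two least probable
            w2, _, c2 = heapq.heappop(heap)   # nodes are merged...
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (w1 + w2, tiebreak, merged))
            tiebreak += 1                     # ...until one tree remains
        return heap[0][2]

    print(huffman_code({"a": 5, "b": 2, "r": 2, "c": 1, "d": 1}))
    # codeword lengths 1, 3, 3, 3, 3: on "abracadabra" this averages
    # 23/11 = 2.09 bits per symbol, close to the entropy of ~2.04 bits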
      
      
  
- Lecture 8. (Fri 2.10.) Universal Source Coding.
  • two-part codes • mixture codes • normalized maximum likelihood (see the sketch below)
  slides | video
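
  The sketch promised above compares two universal code lengths for a
  binary sequence of length n with k ones under the Bernoulli model
  class: a two-part code (first the maximum-likelihood parameter, one of
  n+1 possible values, then the data encoded with it) and the normalized
  maximum likelihood (NML) code. The formulas follow the standard
  definitions; the particular two-part scheme is a simplistic choice for
  illustration.

    from math import comb, log2

    def two_part_length(n, k):
        """log2(n+1) bits for the ML parameter + data cost under it."""
        ml = 0.0 if k in (0, n) else -(k * log2(k / n) + (n - k) * log2(1 - k / n))
        return log2(n + 1) + ml

    def nml_length(n, k):
        """-log2 of P(x | theta_hat(x)) / sum_y P(y | theta_hat(y))."""
        def ml_prob(j):   # ML probability of any sequence with j ones
            return (j / n) ** j * (1 - j / n) ** (n - j)
        norm = sum(comb(n, j) * ml_prob(j) for j in range(n + 1))
        return -log2(ml_prob(k)) + log2(norm)

    n, k = 20, 6
    print(two_part_length(n, k))   # ~22.0 bits
    print(nml_length(n, k))        # ~20.3 bits: NML is shorter here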
      
      
  
- Lecture 9. (Tue 6.10.) MDL Principle.
  • Occam's razor • old-style & modern MDL
  slides | more slides | video | exercises #5 | data
      
      
  
- Lecture 10. (Fri 9.10.) MDL Principle (contd.).
  • example applications
  slides | video
      
      
  
- Lecture 11. (Tue 13.10.) Further Topics.
  • Kolmogorov complexity • gambling (see the sketch below) • lossy coding • etc.
  slides | video
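
  The sketch promised above: the gambling connection is due to Kelly
  (see his 1956 paper under Additional Material below). With win
  probability p at even odds, betting the fraction f* = 2p - 1 of the
  bankroll maximizes the expected growth rate of capital,
  E[log wealth] = p log(1 + f) + (1 - p) log(1 - f). A numeric check:

    from math import log

    def growth_rate(f, p):
        """Expected log-growth per round when betting fraction f at even odds."""
        return p * log(1 + f) + (1 - p) * log(1 - f)

    p = 0.6
    f_star = 2 * p - 1                 # the Kelly fraction, here 0.2
    for f in (0.1, f_star, 0.3):
        print(f, growth_rate(f, p))    # the growth rate peaks at f = f_star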
      
      
  
- Lecture 12. (Fri 16.10.) Redundant Lecture.
  • summary of course contents • questions and answers before the final exam • introduction to the project
  video
      
      
  
  
Additional Material (by Lecture)
Here you can find recommended (but not required) additional material,
categorized according to our lectures.
- Lecture 2. Noisy Channel Coding
  - Information and Entropy
    an elementary-level MIT OpenCourseWare course • for material by topic, click here
    ⇒ material on noise and errors
       
     
   
   
- Lecture 4. Source Coding: Theory
  - Cover & Thomas, Elements of Information Theory, Chapter 2
    (see course folder, C127)
  - Claude E. Shannon, "A mathematical theory of communication", The Bell System Technical Journal, Vol. 27, pp. 379–423, 623–656, July–October 1948.
    ⇒ a reprint (with corrections) from Bell Labs
  - Joaquin Candela, Probability, Information Theory and Bayesian Inference, on videolectures.net
    • part 1: probability • parts 2 & 3: entropy and information • part 4: Bayesian inference
 
     
   
   
- Lectures 5 & 7. Source Coding: Practice
  - Witten, Neal & Cleary, "Arithmetic coding for data compression", Communications of the ACM, Vol. 30, No. 6, pp. 520–540, 1987.
  - Cover & Thomas, Elements of Information Theory, Chapter 5
    (see course folder, C127)
  - Lelewer & Hirschberg, "Data compression", ACM Computing Surveys, Vol. 19, No. 3, pp. 261–296, 1987.
  - Ken Huffman, "Huffman algorithm", Huffman Coding Blog, Dec. 24, 2008.
    a mostly informal blog entry written by David Huffman's nephew • relates Huffman codes to mobiles (no, not cell phones)
  - David Salomon, Data Compression: The Complete Reference, Springer, 2004, 926 pages
    the title says it all: a comprehensive reference; more recent editions available
 
     
   
   
- Lecture 6. Minimum Message Length Principle
  - Lloyd Allison, Minimum Message Length, web resource
    a brief introduction to the basic ideas • many useful links
 
     
   
   
- Lecture 8. Universal Source Coding
  - Peter Grünwald, The Minimum Description Length Principle, MIT Press, 2007, 504 pages
    • part 2: universal coding
 
     
   
   
- Lectures 9 & 10. MDL Principle
  - Peter Grünwald, The Minimum Description Length Principle, MIT Press, 2007, 504 pages
    • chapter 1: learning, regularity, and compression ⇒ PDF available
  - Peter Grünwald, MDL Tutorial, on videolectures.net
  - Jorma Rissanen, "Minimum Description Length Principle", Scholarpedia, 3(8):6727.
  - Kontkanen & Myllymäki, "A linear-time algorithm for computing the multinomial stochastic complexity", Information Processing Letters, Vol. 103, No. 6, pp. 227–233, 2007.
    a simple yet very useful result • quite a complex proof
 
     
    
- Lecture 11. Further Topics
  - Li & Vitányi, Kolmogorov Complexity and Its Applications, Springer, 1997.
    the book about Kolmogorov complexity
  - John L. Kelly, Jr., "A new interpretation of information rate", Bell System Technical Journal, Vol. 35, pp. 917–926, 1956.
    ⇒ PDF available
  - Video Coding Fundamentals & H.264, blog
    • image coding • video coding
       
     
  
  
News
- September 8: First lecture (C222); participation obligatory.
- September 9: Video of the first lecture and the first set of exercises available (see Contents above).
- September 18: First exercise session (C222).
- September 25: Special guest lecturer Dr. Daniel Schmidt talks about the Minimum Message Length (MML) principle. Note the changed room: CK112.
- October 20, 9–12: Final exam, B123.
- November 20, 16–20: Separate exam in case you missed or failed the first one, A111.