Course Material

Lectures

Please note that while lecture notes will be provided, the lecture itself will sometimes digress to web-based material and Java applets for demonstrations. These will usually be listed on this page, but not on the PDF slides.

Lecture 1: Introduction: Graphs and Probability

Review of some famous graphical models including circuits, fault trees, and ising lattices. Review of graphs including definitions for directed and undirected graphs, relations such as parents and ancestors, and connectivity notions. Review of probability as frequency or belief in both discrete and continuous contexts. Review of modelling as a key part of intelligent systems design. Examples of using graphs to model dependence, causality and relevance. [ Lecture 1 in PDF, in Postscript, 2-up, gzipped ]

Preparatory reading: Please review the material under graphs from Graph Theory and the Ising Model, and course notes of Three Concepts: Probability (PDF).

Lecture 2: Independence Modelling

Functional independence in optimization as a method for task simplification. Graph partitioning as a means of problem decomposition. Probabilistic independence as functional independence in probability space. Undirected graphs, relevance, and independence defined as set separation. Directed graphs, dependence, and independence defined on the underlying undirected graph. Soundness and completeness between graphical independence and functional independence forms. A catalogue of graphical forms for time-series, spatial modelling, diagnosis and fault detection and their independence properties. [Lecture 2 in PDF , in Postscript, 2-up, gzipped ]

Preparatory reading: Please review Shachter's manuscript "Examples without Numbers" (handed out in Lecture 1) and Scheines' tutorial on d-separation. You don't have to understand d-separation, because its difficult and we'll use a Lauritzen-style definition instead, but have a look at it. Part I of Bishop's tutorial (see below) is good material for this lecture.

Lecture 3: Exact Inference

Given a graphical model on discrete variables of low dimension, and a probability formula, evaluate the probability, or find the maximum likelihood assignment for the free variables. We will consider this in a general sense. We will develop an exact algorithm for the computation as nested summation on a clustered graph using variable elimination as the basic step. This turns out to create a clique tree, the general independence structure. Thus we will consider finding good clique trees, and their complexity. We will not cover implementation details for the algorithm used to speed it up in practice. [Notes in PDF (updated since lecture!!), in Postscript, 2-up, gzipped]

Variable elimination demo (not currently available).

Preparatory reading: Please review R. McEliece and S. M. Aji, 2000. The Generalized Distributive Law, IEEE Trans. Inform. Theory, vol. 46, no. 2 (March 2000), pp. 32 5--343. Read sections I, II and VI only.

Lecture 4: Introduction to Modelling and Learning

We begin by viewing B-Course applied to the "Visit to Asia" network. Learning will look at simple cases: learning proportions, and learning a regression curve. These examples are covered in detail because they form the basis of many algorithms. The proportions example will be done using the excellent Beta Coin Experiment applet from the Prob./Stats Object Library. They introduce the notion of sufficient statistics and the trade-off between complexity and predictive accuracy, and they illustrate the problems with priors. Discussion of modelling looks at the interaction between design decisions and the quality of the developed/proposed solution. [Notes in PDF (updated since lecture!!), in Postscript, 2-up, gzipped]

Preparatory reading: Please review Henry Tirri's notes on proportions. Heckerman's tutorial is a reference for this section.

Lecture 5: Learning with Graphs

Learning of graphs requires one to learn both the graphical structure and the probability parameters at each node. Probabilistic reasoning can handle both, given suitable assumptions. Learning on graphs will look at methods in the B-Course system. We will look at the maximum likelihood approach and some priors used for the Bayesian approach. The search space will be briefly discussed. The key techniques used here will be considered: parameter independence, the exponential family, the notion of "evidence" and its use, and multiple models. [Notes in PDF (updated since lecture!!), in Postscript, 2-up, gzipped

Preparatory reading: Please look at the B-Course Library. You could optionally review initial part of Friedman and Goldszmidt's tutorial, not beyond first 50 slides (first 20 is review). Have a practice on B-Course with some example data generated using Teemu's generator.

Lecture 6: Selected Applications

Applications will be presented by Tomi Silander on learning with B-Course, on new PCA models in information retrieval by Wray Buntine, and on location sensing by Petri Myllymäki.

Petri's slides: ``Graphical Models on Manhattan: A Probabilistic Approach to Mobile Device Positioning''

Preparatory reading: None for this week.

Literature, further reading

There is no need to purchase any books. We will supply copies of any necessary material prior to lectures. Material available online will sometimes be in Adobe's PDF or Postscript format so you should be able to view these formats. Moreover, some of the demonstrations require Java capable browsers. Other than the use of these long-time standard formats, we make efforts to avoid online material that requires special browsers (e.g., thus material converted from Powerpoint is avoided). Note when we say something is a reference book, we mean you should not use it to learn the material, but only to check details or find more advanced material.

Graphs

Graph Theory from Math World including graph, directed acyclic graph, forest, tree, degree, outdegree and indegree, clique, induced subgraph, and connected component. Some additional material we will touch on and you probably wont know is cut point, mincut, minimum spanning tree, random graph, and phase transition. What is missing strangely is the notion of width and parent or ancestor relations.
Ising Model (1) Ising Model (2) is an undirected graph in a grid where all points are connected to their horizontal and vertical neighbors. It works like a probabilistic version of the game of life.
“Why is Automatic Test Pattern Generation (ATPG) easy?” (PDF, 627kB) subject is nothing to do with this course, but the paper illustrates the notions of tree-width using a heuristic proxy called "cut-width" (a safe upper bound) to tree-width we define later. The hardness of problems in practice is shown as a function of this in a real-world context. Skim read this; you aren't supposed to understand the details.
- A local copy (PDF)
“A tourist guide through treewidth” (PDF, 1311kB), by Bodlaender. Accessible but much more detail than we need. Yet shows you the breadth and use of the computational/optimization side of graphical modelling. Probabilistic graphs get a mention under the old tag of “expert systems”.
- A local copy

Probability

Three Concepts: Probability Course Material, especially the first four lectures, here in PDF: concepts, modelling, probability and proportions. Further links on probability can be found at the Three Concepts: Probability site.
For reference only a variety of useful links and tutorials at Tom Minka's home page at CMU including A Statistical Learning/Pattern Recognition Glossary and the tutorials section of Statistical Learning papers.
A good review of probability from a standard distributional perspective can be got from the excellent set of applets at the Prob./Stats Object Library. For instance, our proportions example will be done using the Beta Coin Experiment applet.

Graphical Models

“Probabilistic Graphical Models Part One: Graphs and Markov Properties ” (PDF, 890kB), and “Probabilistic Graphical Models Part Two: Inference and Learning ” (PDF, 1394kB), by C.M. Bishop. Good tutorial overlapping quite a lot of the course.
- A local copy of Part I
- A local copy of Part II
Heckerman's tutorials, especially “A Tutorial on Learning with Bayesian Networks” and “Technical overview for statistician (includes undirected graphs) - slides from Valencia 2002 tutorial”. Used in the learning lectures.
- A local copy of "A Tutorial..."
- A local copy of "Technical overview..."
Friedman's AAAI-98 tutorial with Goldszmidt, covers different material to his NIPS-2001 tutorial, though the NIPS-2001 tutorial is more recent.
A Brief Introduction to Graphical Models and Bayesian Networks, Kevin Muphy's tutorial from 1998.
B-Course developed in the department. Used in the learning lectures.
Learning in Graphical Models edited by M.I. Jordan, 1999 (available in Computer Science Department library). An advanced reference book and a good source of term paper material. Individual chapters also available on-line.
M.I. Jordan, “Graphical Models” (PDF, 1239kB), Statistical Science, 2003.
- A local copy
Graphical Models by S.L. Lauritzen, 1996 (available in Computer Science Department library). An advanced reference book. Lauritzen has been instrumental in a lot of early developments.
R. McEliece and S. M. Aji, 2000. The Generalized Distributive Law, IEEE Trans. Inform. Theory, vol. 46, no. 2 (March 2000), pp. 325--343. Overview and reference for exact computation.
Graphical Models in Applied Multivariate Statistics by Whitaker, 1990 (available in Valtiotieteellisen tiedekunnan kirjasto). A good reference book from the statistical side, intended as an advanced text for graduates.
“A catalogue of graphs” (PDF, 1318kB), by Studeny. Ignore the first 8 pages of theory, the rest catalogues different types of directed and undirected graphs over 2-5 variables. In general, Studeny is an excellent source of theoretical work on independence models, probability and graphs.
- A local copy
Graphical Models Discussion Papers edited by Joel Chenu, another reference, collection of research-oriented tutorials.
Cecil Huang and Adnan Darwiche. Inference in belief networks: A procedural guide. (PDF, 451kB) International Journal of Approximate Reasoning, 15(3):225-263, October, 1996.
- A local copy

Toolboxes/Software




Graphical Models	2003