Graphical Models

Project

Please note in selecting a project task that not all tasks are equal in difficulty. Clearly, a moderate effort on a difficult task should be graded the same as a good effort on an easy task. Thus everything is relative. Moreover, if you want to suggest some extensions to one of the three set projects, then that would be a suitable "Own" project.

You should deliver an implementation that works on standard Linux workstations such as the ones in the Department so that we can test them. You can use either C, C++ or Java as programming language; if you insist we may accept others (Perl?), just ask us first. You should also deliver a short documentation explaining the ideas behind your project work. Emphasize the mathematical issues more that software design; we do not want to see JavaDoc files. Also, comment the tricky/essential parts of your code.

Additional points are given for theoretical and empirical analysis of the implementation; computational complexity (time/space); effect of factors such as number of variables, sample size, density of edges, etc. If your algorithm is probabilistic, how good is it on the average/with high probability?

Learning

Implement learning. Input data in the ".dat" form form for the project template. Output one or more network files, each in the ".bnet" form of the project template. This should conform to formats because we will test it automatically with the data generator. If several networks are output, then also give their relative probabilities in a separate output. Selection of network(s) can be done by any method as long as it is documented.

Inference A: Maximum probability configuration

Implement inference of maximum probability configuration. Assume a given network over a fixed set of discrete variables X, specifying p(X). The distributions (probability tables) for the network are simple tables as specified in the project template ".bnet" format below. Fixing some subset of variables F⊂X to values V (both F and V are program inputs), find an assignment of values for the remaining variables that maximizes p(X|F=V). The fixed variables are given as a one-row data set with free variables left empty. The probability achieved at the maximum is not needed. Documentation should also be produced explaining the method clearly but simply.

Inference B: Marginals

Implement inference of marginals. Assume a given network over a fixed set of discrete variables X, specifying p(X). The distributions (probability tables) for the network are simple tables as specified in the project template ".bnet" format below. Fixing some subset of variables F⊂X to values V (both F and V are program inputs), find marginal probabilities for the remaining variables p(x|F=V) for x∈X, x∉F. The fixed variables are given as a one-row data set with free variables left empty. Documentation should also be produced explaining the method clearly but simply.

Own Project

Suggest your own programming or programming+theory project around the topic of the course.


Download the project template. (Once downloaded open in Linux with 'tar -xzvf project-template.tgz'.) It contains a README file with some hints on where to start.

 

 Graphical Models
2003