Please note in selecting a project task that not all tasks are equal
in difficulty. Clearly, a moderate effort on a difficult task should
be graded the same as a good effort on an easy task. Thus everything
is relative. Moreover, if you want to suggest some extensions to one
of the three set projects, then that would be a suitable "Own"
project.
You should deliver an implementation that works on
standard Linux workstations such as the ones in the Department so that
we can test them. You can use either C, C++ or Java as programming
language; if you insist we may accept others (Perl?), just ask us
first. You should also deliver a short documentation
explaining the ideas behind your project work. Emphasize the
mathematical issues more that software design; we do not want to see
JavaDoc files. Also, comment the tricky/essential parts of your
code.
Additional points are given for theoretical and empirical analysis
of the implementation; computational complexity (time/space);
effect of factors such as number of variables, sample size,
density of edges, etc. If your algorithm is probabilistic, how good
is it on the average/with high probability?
Implement learning. Input data in the ".dat" form form for the project template. Output one or more network files, each in the ".bnet" form of the project template. This should conform to formats because we will test it automatically with the data generator. If several networks are output, then also give their relative probabilities in a separate output. Selection of network(s) can be done by any method as long as it is documented.
Implement inference of maximum probability configuration. Assume a given network over a fixed set of discrete variables X, specifying p(X). The distributions (probability tables) for the network are simple tables as specified in the project template ".bnet" format below. Fixing some subset of variables F⊂X to values V (both F and V are program inputs), find an assignment of values for the remaining variables that maximizes p(X|F=V). The fixed variables are given as a one-row data set with free variables left empty. The probability achieved at the maximum is not needed. Documentation should also be produced explaining the method clearly but simply.
Implement inference of marginals. Assume a given network over a fixed set of discrete variables X, specifying p(X). The distributions (probability tables) for the network are simple tables as specified in the project template ".bnet" format below. Fixing some subset of variables F⊂X to values V (both F and V are program inputs), find marginal probabilities for the remaining variables p(x|F=V) for x∈X, x∉F. The fixed variables are given as a one-row data set with free variables left empty. Documentation should also be produced explaining the method clearly but simply.
Suggest your own programming or programming+theory project around the topic of the course.
Download the project template. (Once downloaded open in Linux with 'tar -xzvf project-template.tgz'.) It contains a README file with some hints on where to start.
Graphical Models |