Inclusion of the effects of noise into the model of a time series leads to the world of statistics -- it is no longer possible to talk about exact events, only their probabilities.
The Bayesian framework offers the mathematically soundest basis for doing statistical work. In this chapter, a brief review of the most important results and tools of the field is presented.
Section 3.1 concentrates on the basic ideas of Bayesian statistics. Unfortunately, exact application of those methods is usually not possible. Therefore Section 3.2 discusses some practical approximation methods that make it possible to obtain reasonably good results with limited computational resources. The learning algorithms presented in this work are based on the approximation method called ensemble learning, which is presented in Section 3.3.
This chapter contains many formulas involving probabilities. The notation $p(x)$ is used both for the probability of a discrete event $x$ and for the value of the probability density function (pdf) of a continuous variable at $x$, depending on what $x$ is. All the theoretical results presented apply equally to both cases, at least when integration over a discrete variable is interpreted in the Lebesgue sense as summation.
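For instance, a marginalisation written as an integral should, when the variable being integrated over is discrete, be read as the corresponding sum:
\[
  p(x) = \int p(x, y) \, dy
  \qquad \text{becomes} \qquad
  p(x) = \sum_{y} p(x, y)
\]
when $y$ is discrete.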
Some authors use subscripts to separate different pdfs, but here they are omitted to simplify the notation. All pdfs are identified only by the argument of $p(\cdot)$.
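Thus, for example, $p(x)$ and $p(y)$ denote two different density functions, which in the subscripted notation would be written as $p_x(x)$ and $p_y(y)$.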
Two important probability distributions, the Gaussian or normal distribution and the Dirichlet distribution, are presented in Appendix A. The notation $x \sim N(\mu, \sigma^2)$ is used to denote that $x$ is normally distributed with mean $\mu$ and variance $\sigma^2$.
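For reference, this notation corresponds to the familiar Gaussian density
\[
  p(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\!\left( -\frac{(x - \mu)^2}{2 \sigma^2} \right),
\]
with the full definitions, including that of the Dirichlet distribution, given in Appendix A.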