Let us denote the observed data with $X$, the hidden state values with $S$,
and all the other model parameters with $\theta$. These other parameters
consist of the weights and biases of the MLP networks and
hyperparameters defining the prior distributions of other parameters.