Many models include groups of parameters that are related to one another. This relationship should be reflected in the prior chosen for them. Hierarchical models provide a useful tool for building priors for such groups: the parameters in a group are given a common prior distribution, which is in turn parameterised by new, higher-level hyperparameters [16].
Such a group typically consists of parameters that play a similar role in the model. Hierarchical models are well suited to neural network problems because such groups emerge naturally there, for example as the elements of a weight matrix.
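As an illustration of the idea, the following minimal sketch draws the elements of a weight matrix from a two-level hierarchical prior. The Gaussian forms, the unit-variance hyperpriors, the log-scale parameterisation of the standard deviation and the function name `sample_weight_matrix` are assumptions made for this example only; they are not necessarily the choices used in the model of Chapter 5.

```python
import math
import random


def sample_weight_matrix(rows, cols, rng):
    """Draw a weight matrix from a two-level hierarchical Gaussian prior.

    Illustrative sketch only: the distributions below are assumed,
    not taken from the text.
    """
    # Top level: hyperparameters shared by every element of the matrix.
    # Assumed hyperpriors: standard Gaussians on the common mean and on
    # the logarithm of the common standard deviation.
    mean = rng.gauss(0.0, 1.0)
    log_std = rng.gauss(0.0, 1.0)
    std = math.exp(log_std)  # exponentiating keeps the scale positive

    # Lower level: given the hyperparameters, the elements are
    # independent draws from the same Gaussian prior.
    weights = [[rng.gauss(mean, std) for _ in range(cols)]
               for _ in range(rows)]
    return mean, std, weights


rng = random.Random(0)  # fixed seed so the draw is reproducible
mean, std, W = sample_weight_matrix(3, 4, rng)
```

Because all elements share `mean` and `std`, they are dependent a priori even though they are conditionally independent given the hyperparameters. This is how the common prior expresses that the parameters belong together.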
The definitions of the components of the Bayesian nonlinear switching state-space model in Chapter 5 contain several examples of hierarchical priors.