10 To keep the exposition simple, I’m taking some shortcuts here. It must be emphasized that the reward loss here is computed for the total (discounted) future reward, not for any particular future time point. Thus, more precisely, threat is a prediction that the total future reward, with discounting applied if necessary, has a sufficiently large probability of being much less than its expected value; see footnote 11 below for a mathematical definition. Obviously, it is necessary to define hyperparameters that say what counts as “sufficient” and “much less” (or “large” in the definition of the main text). Alternatively, it is also possible to define threat simply on the distribution of reward loss at a single time point, which would lead to simpler computation at the risk of suboptimality.
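As a rough illustration of the verbal definition above (not the formal one referred to in footnote 11), the following Python sketch estimates threat from a set of sampled total future rewards. The function names and the parameters `margin` (standing in for “much less”) and `min_prob` (standing in for “sufficient”) are hypothetical choices made for this example only, and the expected total reward is approximated by the sample mean.

```python
def discounted_return(rewards, gamma):
    """Total discounted future reward: sum over t of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def threat_probability(returns, margin):
    """Fraction of sampled total rewards that fall more than `margin`
    below the sample mean (used here as the expected total reward)."""
    mean = sum(returns) / len(returns)
    return sum(1 for g in returns if g < mean - margin) / len(returns)


def is_threat(returns, margin, min_prob):
    """Flag a threat when the estimated probability of a large shortfall
    reaches the (hypothetical) threshold `min_prob`."""
    return threat_probability(returns, margin) >= min_prob


# Five imagined futures, each already summarized as a total reward.
sampled_returns = [10, 10, 10, 10, 0]
print(threat_probability(sampled_returns, margin=5))
print(is_threat(sampled_returns, margin=5, min_prob=0.1))
```

In this toy case the sample mean is 8, so only the single future with total reward 0 falls more than 5 below it. Whether that counts as a threat then depends entirely on the chosen `min_prob`, which is the point about hyperparameters made above.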