⁵Next, I give a more rigorous and general definition of state-values. To begin with, it must be noted that the state-value is a function of the “policy” used by the agent; the policy is what I call the action selection system in the text, i.e., the system that decides which action is taken in any given state (where the decisions can have some randomness programmed into them). Further, we have to take into account the fact that the world may have some randomness in it, so we have to consider expected reward in the sense of the mathematical expectation in probability theory. The value function at a given state is then generally defined as the expected amount of discounted reward that the agent will obtain starting from that state, when it follows that policy. (Sometimes, when speaking about state-values, it is more specifically assumed that the policy in question is the optimal policy, which gives the highest expected reward, but that is just one special case for a special policy.) This definition reduces to the definition in the main text for the case of a single goal in a deterministic world, where the state-value is a decreasing function of the distance to the goal. The connection can be seen by placing a reward at the goal and nowhere else, and using the fact that there is discounting, so that rewards in the distant future are given less weight than rewards in the near future. Then, the closer you are to the goal, the larger the expected discounted reward is, because the reward at the goal is given more weight when you are closer to the goal. (Here I define “closer” to mean that you can get to the goal more quickly than from a state that is further away, from which more time is needed.) While the standard definition in the literature, as just given, treats the reward as uncertain and talks about expected (discounted) reward, I will not usually do that in this chapter, for simplicity: I assume that both the world and the policy are deterministic.
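As a sketch of how the verbal definition above might be written out, the following uses notation that is my own assumption and does not appear in the text: $\pi$ for the policy, $\gamma$ for a discount factor between 0 and 1, $r_t$ for the reward received $t$ steps after starting from state $s$, $d(s)$ for the number of steps needed to reach the goal from $s$, and $R > 0$ for the reward at the goal. The second line also assumes the convention that the reward obtained on arrival at the goal is discounted by $\gamma^{d(s)}$; other conventions shift the exponent by one.

```latex
% General case: expected discounted reward from state s under policy \pi
% (\pi, \gamma, r_t are assumed notation, not taken from the text)
V^{\pi}(s) \;=\; \mathbb{E}_{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, r_{t} \;\middle|\; s_{0} = s \right],
\qquad 0 < \gamma < 1 .

% Special case: deterministic world and policy, a single reward R at the goal,
% reached after d(s) steps; the expectation disappears and one term survives
V(s) \;=\; \gamma^{\,d(s)}\, R .
```

Since $0 < \gamma < 1$, the quantity $\gamma^{d(s)} R$ shrinks as $d(s)$ grows, which is one way to see why the state-value is a decreasing function of the distance to the goal. For instance, with $\gamma = 0.9$ and $R = 1$, a state two steps from the goal has value $0.81$, while a state three steps away has value $0.729$.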
See Chapter 7 and its footnote 11 for a more sophisticated, probabilistic definition.