To this advantage we can add a more technical one: stochastic methods include an implicit regularization and are thus less likely to overfit the data (Bottou, 2003; Hardt et al., 2016). Overfitting is an important problem in practical AI learning, but I do not discuss it at any length in this book. Briefly, it means that when the amount of data at your disposal is very limited, learning can go wrong in a particular way: the learning may seem to work well on the data you have, achieving a good “fit”, but the predictions the neural network gives are actually useless, because the learning has “overfit” your data and does not work (or give a good fit) on any new data to which you would like to apply the system in the future.
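The phenomenon is easy to reproduce in a few lines. The sketch below (an illustration of my own, not drawn from the sources cited above, and assuming only NumPy) fits two models to a tiny noisy dataset: a degree-7 polynomial, which has enough parameters to pass through all eight training points, and a simple straight line. The flexible model achieves a near-perfect fit on the training data, yet its error on fresh data from the same process is far larger — exactly the gap between “fitting” and “overfitting” described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny training set: 8 noisy samples of an underlying linear trend.
x_train = np.linspace(0, 1, 8)
y_train = 2 * x_train + rng.normal(scale=0.2, size=8)

# Fresh data from the same process, unseen during fitting.
x_test = np.linspace(0.05, 0.95, 50)
y_test = 2 * x_test + rng.normal(scale=0.2, size=50)

# A degree-7 polynomial can interpolate all 8 training points,
# so its training fit looks excellent; the line cannot, but it
# matches the true underlying trend far better.
overfit = np.polyfit(x_train, y_train, deg=7)
simple = np.polyfit(x_train, y_train, deg=1)

def mse(coeffs, x, y):
    """Mean squared error of a polynomial model on data (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

print("degree 7: train", mse(overfit, x_train, y_train),
      " test", mse(overfit, x_test, y_test))
print("degree 1: train", mse(simple, x_train, y_train),
      " test", mse(simple, x_test, y_test))
```

Running this, the degree-7 model shows a training error near zero but a much larger test error, while the straight line behaves similarly on both sets: its good fit actually generalizes.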