18(Minut and Mahadevan, 2001; Mnih et al., 2014; Greff et al., 2016). Learning attention has recently become very fashionable in AI, especially in large language models due to the methods proposed by Vaswani et al. (2017), although their use of the term is quite liberal and has only a vague resemblance to what we are discussing here.