Comments Page - The Explore vs. Exploit Dilemma

nmca 3 hours ago
A wonderful treatise on the same topic, “Reinforcement Learning Bit by Bit”, for anyone looking for a more advanced treatment of explore/exploit.
https://arxiv.org/abs/2103.04047
matheist 3 hours ago
See also Thompson sampling[+] for a different approach to multi-armed bandits, one that doesn't explicitly separate exploration from exploitation.
[+] https://en.wikipedia.org/wiki/Thompson_sampling
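To make that concrete, here is a minimal sketch of Thompson sampling for a Bernoulli bandit, assuming a uniform Beta(1, 1) prior on each arm. The function name `thompson_sampling`, the arm probabilities, and the round count are all made up for illustration and don't come from the linked article. Each round it draws one sample from every arm's posterior and pulls the arm with the largest draw, so exploration comes from posterior uncertainty rather than from a separate explore step.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling; returns how often each arm was pulled.

    true_probs are hypothetical success rates, hidden from the policy and
    used only to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (assumed Beta(1, 1) prior, hence the +1 on each count).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=lambda i: samples[i])

        # Simulate a 0/1 reward and update that arm's counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
print(pulls)
```

With these assumed probabilities, most pulls should end up on the last arm as its posterior concentrates, while the weaker arms still get tried occasionally early on.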