Average Reward, Continuing Tasks and Discounting

Prerequisites: Intro to Linear Methods, Semi-Gradient Prediction, Semi-Gradient SARSA. What is continuous? Let’s first describe the main kind of task we will be handling: continuing tasks. Continuing tasks have no terminal state and therefore go on forever. As simple as that sounds, the issues it brings are not a piece of cake to tackle. One example is the stock market, where there is no end and you keep getting data....

March 11, 2020 · 7 min · 1321 words

Semi-Gradient Control Methods

Prerequisites: Semi-Gradient Prediction, Intro to Linear Methods. If you read the prediction part for the semi-gradient methods, it is pretty easy to extend what we know to the control case. We know that control is almost always just adding policy improvement on top of the prediction case. That’s exactly the case here for semi-gradient control methods as well. We already described and understood a formula back in the prediction part (if you read it somewhere else, that’s also fine), and now we want to extend our window a little....

March 10, 2020 · 9 min · 1733 words