Training a Deep Q-Network to Play Video Games

The full episode is only available to paid subscribers of The AiEdge Newsletter

Damien Benveniste

Feb 28, 2024

What is Reinforcement Learning
The Bellman Equation
Deep Q-Networks
The Gymnasium Package
Implementing a Deep Q-Network to play Pong
Training on AWS

What is Reinforcement Learning

Reinforcement Learning evaluates entire paths and tries to find the one that maximizes the total reward. Consider the following grid, where the points are rewards:

Supervised Learning only considers the next step, so it would follow the path on which each individual next step has the highest reward:

Reinforcement Learning, however, considers whole paths instead of just the next step:
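
As a minimal sketch of that difference, assume a hypothetical 3x3 reward grid (made up for illustration, not the grid from the post) in which an agent starts in the top-left cell and can only move right or down. The names `GRID`, `moves`, `greedy_path`, and `best_from` are all assumptions of this sketch. The greedy walker picks the largest immediate reward at each step, while the path-based search compares the totals of complete paths:

```python
from functools import lru_cache

# Hypothetical reward grid (an assumption for illustration only).
GRID = [
    [0, 5, 2],
    [4, 1, 1],
    [9, 1, 0],
]
ROWS, COLS = len(GRID), len(GRID[0])


def moves(r, c):
    """Legal next cells from (r, c): one step right or one step down."""
    nxt = []
    if c + 1 < COLS:
        nxt.append((r, c + 1))
    if r + 1 < ROWS:
        nxt.append((r + 1, c))
    return nxt


def greedy_path():
    """Always step to the neighbouring cell with the largest immediate reward."""
    pos = (0, 0)
    path, total = [pos], GRID[0][0]
    while moves(*pos):
        pos = max(moves(*pos), key=lambda p: GRID[p[0]][p[1]])
        path.append(pos)
        total += GRID[pos[0]][pos[1]]
    return total, path


@lru_cache(maxsize=None)
def best_from(r, c):
    """Best (total reward, path) over all complete paths starting at (r, c)."""
    here = GRID[r][c]
    options = moves(r, c)
    if not options:  # bottom-right corner: the path ends here
        return here, ((r, c),)
    total, path = max(best_from(*p) for p in options)
    return here + total, ((r, c),) + path


greedy_total, _ = greedy_path()
best_total, best_cells = best_from(0, 0)
print("greedy total:", greedy_total)
print("best total:", best_total, "via", best_cells)
```

On this made-up grid the greedy walker is drawn toward the 5 in the top row and never reaches the 9 in the bottom-left corner, which is exactly the kind of path a next-step-only strategy misses.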

The Bellman Equation

In an environment, we have states, actions, and rewards. From a state, we take an action, end up in a new state, and get a reward. We can assign a value to each state with a recursive formula: the value of a state S is the maximum, over the actions a available from S, of the reward r_a received for moving to the new state S’ plus the value V(S’) of that next state, that is, V(S) = max_a [r_a + V(S’)].
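
The recursion described above can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the tiny deterministic environment in `TRANSITIONS`, the state and action names, and the convention that a terminal state has value 0. It follows the undiscounted form given in the text, so it only terminates because this hypothetical environment has no cycles.

```python
from functools import lru_cache

# Hypothetical deterministic environment (an assumption for illustration):
# TRANSITIONS[state][action] = (reward r_a, next state S')
TRANSITIONS = {
    "A": {"left": (1, "B"), "right": (2, "C")},
    "B": {"go": (10, "END")},
    "C": {"go": (3, "END")},
    "END": {},  # terminal state: no actions available
}


@lru_cache(maxsize=None)
def value(state):
    """V(S) = max over actions of r_a + V(S'), with V = 0 at terminal states."""
    actions = TRANSITIONS[state]
    if not actions:
        return 0
    return max(reward + value(nxt) for reward, nxt in actions.values())


def best_action(state):
    """Action that attains the maximum in the recursion for this state."""
    actions = TRANSITIONS[state]
    return max(actions, key=lambda a: actions[a][0] + value(actions[a][1]))


for s in TRANSITIONS:
    print(s, value(s))
print("best action from A:", best_action("A"))
```

From state A, the action with the larger immediate reward is "right" (2 versus 1), but the recursion prefers "left" because the value of the state it leads to is higher.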
