TD(λ)

TD(λ): Q-learning

Perceptual aliasing