Homework 2: reinforcement learning

Due Monday, 2009-02-23

Download these ten Python modules and one data file. You will create a new module, learner.py, and make minor edits to thing.py. Submit all files that you create or edit.

How the program starts off

The program runs similarly, except for a few improvements suggested by various students and a few changes that make it possible to implement learning.

The main difference is in the Bunny class. If you use the settings in world, bunnies have two sensors, a 360-degree Touch sensor and a four-feeler Feel sensor. Each feeler feels what is at its end or any Wall that it passes through. To make the world relatively constant, kitties and veggies never die. Bunnies die when their strength goes to 0; they do not age.

There are three methods in the Critter class that you will need to understand.

Parameters

The parameter values given in world specify stable populations of kitties and veggies, a variable population of bunnies, and walls dividing the world. The reinforcement values that are relevant for this assignment are

Three parameters are relevant for Q-learning:

What you have to do

  1. Implement Q-learning in the bunnies. In addition to creating the module learner.py, you will have to do the following in thing.py:
    1. Uncomment the import statement from learner import * at the top.
    2. Uncomment the commented line in Bunny.set_learner(), which should replace the pass in that method.
    3. Replace the pass below the # Q-learning comment in step() with code that interacts with the learner module, completing the implementation of Q-learning.
    On a given time step, Q-learning works as follows in a critter.
  2. Show that Q-learning improves the performance of the bunnies. You can start with the parameters in world. If you adjust any of these, explain how and why.
  3. Explore the effects of varying each of these parameters:
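
For the Q-learning implementation in item 1, the following is a minimal sketch of a generic tabular Q-learner, not the required design for learner.py. Everything in it is an assumption made for illustration: the class name QLearner, the method names choose() and update(), the parameter names alpha, gamma, and epsilon, and the idea that a state is any hashable value built from a bunny's sensor readings. It does not use the course modules, so it shows only the update rule and epsilon-greedy action selection, not how they connect to Bunny.set_learner() or step().

```python
import random
from collections import defaultdict


class QLearner:
    """Tabular Q-learning over hashable states and a fixed list of actions.

    All names here are hypothetical; they are not taken from the course code.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (assumed name)
        self.gamma = gamma      # discount factor (assumed name)
        self.epsilon = epsilon  # probability of a random action (assumed name)
        self.rng = rng or random.Random()
        # Maps (state, action) to an estimated value; unseen pairs start at 0.0.
        self.q = defaultdict(float)

    def best_value(self, state):
        """Return the highest Q-value available in the given state."""
        return max(self.q[(state, a)] for a in self.actions)

    def choose(self, state):
        """Pick an action epsilon-greedily, breaking ties at random."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        best = self.best_value(state)
        candidates = [a for a in self.actions if self.q[(state, a)] == best]
        return self.rng.choice(candidates)

    def update(self, state, action, reward, next_state):
        """Move Q(state, action) toward reward + gamma * max Q(next_state, .)."""
        target = reward + self.gamma * self.best_value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In a setup like this, a critter would call choose() with its current state on each time step, carry out the action, and then call update() with the reinforcement it received and the state it ended up in. Keeping the table in a defaultdict avoids enumerating every possible sensor reading in advance.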
