A202 / I211 Assignment 2

Simple Statistics

Due Thursday, September 16th, 3:30 PM

In lab

We will be doing pair programming in this lab and assignment, and for much of the rest of this course. That means working in a team of two on one computer and submitting your work together. You will be required to switch partners for the first few pair assignments, and you may have the partner of your choice after that. Your partner must be in your lab section. If there is an odd number of students in your lab, one person will not have a partner. Pick your partner for this assignment now! If you haven't done so already, take the time now to read the pair programming strategy note for this course.

Log in to your computer. It doesn't matter whether you use your Network ID or your partner's. Open your CFS folder as in the last assignment and create a lab2 subdirectory for this lab exercise.

You are both urged to keep floppy-disk backups of your work. (If you have a USB memory key that works with the lab computers, that is even better than a floppy-disk backup.) You are jointly responsible for submission of team work. Be sure you verify that the submission was satisfactory. If you leave it to your partner to complete the assignment, and they mess up, you lose credit also!

  1. Open a new IDLE editor window and save the empty window right away to a file named lab2.py in your lab2 directory. Then in this window define a function sumNumbers that takes a positive integer n and returns the sum of the integers from 1 to n. Then press F5 and test it in the shell. For example
    >>> sumNumbers(4)
    10

    since 1 + 2 + 3 + 4 = 10.

  2. In the same file define a function sumCubes that takes a positive integer n and returns the sum of the cubes of the integers from 1 to n. For example
    >>> sumCubes(4)
    100

    since $1^3 + 2^3 + 3^3 + 4^3 = 100$. This and the previous exercise are versions of chapter 3 exercises 11 and 12, modified to distill the essence of the problem to a function, instead of a full-blown application that prompts for input.

  3. Now add to the file a main function that solves exercise 3.13 (exercise 13 of chapter 3), and a call at the end of the file to invoke it. Test it with F5 as well.
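
The two functions in exercises 1 and 2 share one pattern: start an accumulator at zero and add one term per loop iteration. The following is a minimal sketch of that pattern, assuming Python 3 and a positive integer n as the exercises state. It is one possible shape, not the required solution.

```python
def sumNumbers(n):
    """Return 1 + 2 + ... + n for a positive integer n."""
    total = 0                      # nothing has been added yet
    for i in range(1, n + 1):      # i takes the values 1, 2, ..., n
        total = total + i
    return total


def sumCubes(n):
    """Return 1**3 + 2**3 + ... + n**3 for a positive integer n."""
    total = 0
    for i in range(1, n + 1):
        total = total + i ** 3     # same loop, but each term is cubed
    return total
```

With these definitions, sumNumbers(4) evaluates to 10 and sumCubes(4) to 100, as in the shell examples above.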

When you have completed the last exercise above, or 15 minutes before the lab ends, whichever comes first, submit your lab2.py file as lab 2 in Vincent, by carefully following the instructions on the course web's Vincent page, linked in the contents panel on the left.

If you have finished the in-lab portion of an assignment before the end of a lab session you may leave, but you are strongly encouraged to stay and start work on the main portion of the assignment. That way you can get help right away if you need it, and you're already together with your partner for this assignment.

Assignment

Create a CFS directory named a2 for this assignment. (This won't be specified in the future, but it is a good idea to create a new directory named for each assignment you start work on.)

Read all of the course style rules and guidelines.

Write an application that prompts for the number of values to be entered, and then prompts for the given number of values. It then prints out

  1. the minimum of the values entered,
  2. the maximum of the values entered,
  3. the range of the values entered (the difference between the maximum and minimum values),
  4. the mean of the values entered, and
  5. the standard deviation of the values entered.

These statistics should all be printed as floating-point values, in the order indicated, on five separate lines, and be appropriately labeled.

The mean is the same as the average value, which may be expressed as $(\sum X) / N$, where $N$ is the number of values and $\sum X$ is the sum of the values.

The standard deviation is the most common measure of how "spread out" a set of values is. It is the square root of the variance. The variance of $N$ values may be calculated as

$\left(\sum X^2 - (\sum X)^2 / N\right) / (N - 1)$,

where $N$ and $\sum X$ are as before, and $\sum X^2$ is the sum of the squares of the values.
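
To make the arithmetic in the formulas above concrete, here is a small sketch that applies them to the five values from this assignment's example session. It assumes Python 3, and it stores the values in a list only to keep the arithmetic easy to follow. The variable names are illustrative and not part of the assignment.

```python
import math

values = [3, 0, 10.5, 4.333, 2]       # data from the example session

n = len(values)                       # N, the number of values
sum_x = sum(values)                   # sum of the values
sum_x2 = sum(x * x for x in values)   # sum of the squares of the values

mean = sum_x / n
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)         # standard deviation
```

Here mean comes out to 3.9666 and std_dev to roughly 3.9798, which agrees with the example session up to floating-point rounding.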

Assume the largest and smallest possible float values are +/- 1.7976931348623157e+308.

For example:

Enter the number of values: 5
Enter value 1: 3
Enter value 2: 0
Enter value 3: 10.5
Enter value 4: 4.333
Enter value 5: 2
Minimum = 0.0
Maximum = 10.5
Range = 10.5
Mean = 3.9666
Standard deviation = 3.97980248254

Recall that unless an assignment says you should check for bad input, you may assume input values are sensible. Assignments will often include one or more examples, but just because your program gives the right answer in the given example(s) does not necessarily mean it is correct. Think about your programs carefully and test them thoroughly.

When you are done, submit your final statistics.py file as a2 using Vincent. As always, if you cannot finish all of the assignment, be sure to submit, before the due time, a version of your file that does as much as you can get working. You may include as comments, for partial credit, any code you wrote that does not work. Be sure that whatever you submit loads without error messages, and without doing any I/O until the main call at the very end is reached, or you will get very little credit.

Hints

Unless an assignment specifically says otherwise, you're always free to use any features of Python or the standard libraries that you like. But assignments are always designed so that you do not have to use any more of the language than has been presented in class and/or assigned reading. In particular, in this assignment you may be tempted to use arrays (or rather their Python equivalent, lists). You may do so, but it isn't necessary to do so. Though it is acceptable style to do so at this point, it is not the best style, since storing all the data is potentially memory inefficient. (Imagine several million numbers being entered.) There is another way, which is about as easy to program, and potentially more efficient, which does not require storing more than one input data element at a time.

Recall in the lab part of the assignment you used a single variable to accumulate the sum of all the data, without storing all the data. At any point that a new data value was about to be entered, this accumulator contained the sum of all the data seen so far (including just before the first data was entered, at which point the sum should be zero).
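
The accumulator idea described above can be sketched as follows. The function name running_totals is hypothetical, and it is written as a Python 3 generator only so that the intermediate totals can be inspected. In your own program a plain loop with one accumulator variable is enough.

```python
def running_totals(values):
    """Yield the accumulator just before each new value is added,
    then once more after the last value has been added."""
    total = 0.0            # before any data is seen, the sum is zero
    for x in values:
        yield total        # the sum of all the data seen so far
        total = total + x  # fold the new value in; x itself is not kept
    yield total            # the sum of all the data
```

For example, list(running_totals([3, 0, 10.5])) gives [0.0, 3.0, 3.0, 13.5]: zero before the first value, then the sum so far after each value.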

The preferred way to solve this assignment is to have four "accumulator" variables, from whose values all the statistics can be computed after the data has all been entered. One of these is the sum, as in the earlier exercise, and another is the sum of squares needed to compute the variance. The other two accumulators are the minimum and maximum of the data values seen so far.
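
A rough sketch of the four-accumulator approach follows, assuming Python 3. The function summarize and the constant FLOAT_MAX are hypothetical names. The function takes its values from any iterable instead of prompting, so it is an illustration of the technique and not the application the assignment asks for. It also keeps its own count because it is not told N in advance, and it assumes at least two values, since the variance divides by N - 1.

```python
import math

FLOAT_MAX = 1.7976931348623157e+308   # largest float value given in the assignment


def summarize(values):
    """Return (minimum, maximum, range, mean, standard deviation) in one pass."""
    count = 0
    total = 0.0               # accumulator 1: sum of the values
    total_squares = 0.0       # accumulator 2: sum of the squares
    smallest = FLOAT_MAX      # accumulator 3: every input is <= this start value
    largest = -FLOAT_MAX      # accumulator 4: every input is >= this start value

    for x in values:
        x = float(x)
        count += 1
        total += x
        total_squares += x * x
        if x < smallest:
            smallest = x
        if x > largest:
            largest = x

    mean = total / count
    variance = (total_squares - total ** 2 / count) / (count - 1)
    # Rounding can leave a tiny negative variance for near-identical data,
    # so clamp at zero before taking the square root.
    std_dev = math.sqrt(max(variance, 0.0))
    return smallest, largest, largest - smallest, mean, std_dev
```

Calling summarize(iter([3, 0, 10.5, 4.333, 2])) reproduces the statistics of the example session, and because the argument is an iterator, only one data value is held at a time.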