CSCI A113
Lecture Notes Three

Fall 2001


Measures of Central Tendency and Variability

From the answers to last time's minute paper I have selected the two that I found most adequate. Several others were also correct, so if yours was not selected, that does not mean it was not good. Here are the answers I selected:

Last time we looked at some measures of central tendency.

The homework is asking you to compare them.

Here's the minute paper for today:

The deVoe Report (June 2, 1980) quoted then U.S. President Jimmy Carter as saying "half the people in this country are living below the median income -- and this is intolerable." What is disputable and what is true in this quote?
Turn your answers in on paper at the end of the lecture.

This will be a more mathematical lecture.

1. The Arithmetical Mean.

The arithmetical mean is defined as the sum of the scores divided by the number of scores.

In equation form, with X_1, ..., X_N denoting the N scores, that is

    $$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}$$

Properties of the mean:
  1. The mean is sensitive to the exact value of all the scores in the distribution.

  2. The sum of the deviations about the mean equals zero.

  3. The mean is very sensitive to extreme scores.

  4. The sum of the squared deviations of all the scores about their mean is a minimum.

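    In other words, this expression (in which zeta is an unknown)

        $$\sum_{i=1}^{N} (X_i - \zeta)^2$$

    attains its minimum when zeta equals the mean of the scores:

        $$\zeta = \frac{\sum_{i=1}^{N} X_i}{N}$$

    We need to verify that. Differentiating the expression with respect to zeta and setting the result to zero gives $-2 \sum_{i=1}^{N} (X_i - \zeta) = 0$, that is, $N \zeta = \sum_{i=1}^{N} X_i$, so zeta is the mean. The second derivative is 2N, which is positive, so this point is indeed a minimum.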

  5. Under most circumstances, of the measures used for central tendency, the mean is least subject to sampling variation. If we were repeatedly to take samples from a population on a random basis, the mean would vary from sample to sample. The same is true for the median and the mode. However, the mean varies less than these other measures of central tendency. This is very important in inferential statistics, and is a major reason why the mean is used in inferential statistics whenever possible.
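
Properties 2, 3 and 4 can be checked numerically. The following is a small Python sketch, not part of the original notes; the scores and the helper name `sum_sq` are made up for illustration, and the check of property 4 only tries a handful of candidate values rather than proving the minimum.

```python
from statistics import mean

scores = [2, 4, 6, 8, 10]  # hypothetical scores, chosen only for illustration
m = mean(scores)           # sum of the scores divided by the number of scores

# Property 2: the deviations about the mean sum to zero.
deviation_sum = sum(x - m for x in scores)

# Property 3: a single extreme score pulls the mean a long way.
with_outlier = scores + [100]
m_outlier = mean(with_outlier)

# Property 4: the sum of squared deviations about zeta is smallest
# when zeta is the mean (checked here on a few candidate values only).
def sum_sq(zeta, data=scores):
    return sum((x - zeta) ** 2 for x in data)

candidates = [m - 1, m - 0.5, m, m + 0.5, m + 1]
best = min(candidates, key=sum_sq)

print("mean:", m)
print("sum of deviations:", deviation_sum)
print("mean with outlier:", m_outlier)
print("candidate with the smallest sum of squares:", best)
```

With these scores the mean is 6, and adding the single score 100 moves it above 21, even though five of the six scores are 10 or less.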

2. The Median.

The median is defined as the scale value below which 50% of the scores fall. It is therefore the same thing as P50, the 50th percentile point.

Properties of the median:

  1. The median is less sensitive than the mean to extreme scores.

  2. Under usual circumstances, the median is more subject to sampling variability than the mean but less subject to sampling variability than the mode.
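
Both properties can be illustrated with a short simulation. This is only a sketch under stated assumptions: the scores are hypothetical, the population is simulated as roughly normal (mean 100, standard deviation 15), and the sample size of 25 and the 2,000 repetitions are arbitrary choices. The mode is left out because it is awkward to define for continuous simulated scores.

```python
import random
from statistics import mean, median, pstdev

# Extreme scores: hypothetical data in which one score becomes an outlier.
scores = [3, 5, 6, 7, 9]
with_outlier = [3, 5, 6, 7, 90]
mean_shift = mean(with_outlier) - mean(scores)
median_shift = median(with_outlier) - median(scores)

# Sampling variability: draw many random samples from one simulated
# population and see how much each statistic varies from sample to sample.
rng = random.Random(113)  # fixed seed so the run is repeatable
population = [rng.gauss(100, 15) for _ in range(10_000)]  # assumed normal

sample_means, sample_medians = [], []
for _ in range(2_000):
    sample = rng.sample(population, 25)
    sample_means.append(mean(sample))
    sample_medians.append(median(sample))

spread_of_means = pstdev(sample_means)
spread_of_medians = pstdev(sample_medians)

print("shift in mean:", mean_shift, " shift in median:", median_shift)
print("spread of sample means:  ", spread_of_means)
print("spread of sample medians:", spread_of_medians)
```

The outlier leaves the median where it was while the mean moves substantially, and across repeated samples the means cluster more tightly than the medians. For a population with a very different shape the comparison could come out differently, which is why the notes say "under usual circumstances".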

3. The Mode.

The mode is defined as the most frequent score in the distribution.

Usually distributions are unimodal. When a distribution has two modes it is bimodal.
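
As an illustration, the mode or modes of a small set of scores can be found as follows. This sketch uses `multimode` from Python's standard `statistics` module (available from Python 3.8); the helper name `modes` and the scores are made up for the example.

```python
from statistics import multimode

def modes(scores):
    """Return every most-frequent score, in order of first appearance."""
    return multimode(scores)

unimodal = [1, 2, 2, 3, 2, 4]  # 2 occurs most often
bimodal = [1, 1, 2, 3, 3, 4]   # 1 and 3 are tied for most frequent

print(modes(unimodal))  # [2]
print(modes(bimodal))   # [1, 3]
```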

MEASURES OF VARIABILITY

1. The Range.

The range is defined as the difference between the highest and lowest scores in the distribution.

2. Deviation Scores.

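A deviation score tells how far away the raw score is from the mean of its distribution. It is the raw score minus the mean, so scores above the mean have positive deviation scores and scores below the mean have negative ones.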
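
A quick sketch of the range and the deviation scores for a handful of hypothetical raw scores (the data are invented for illustration):

```python
from statistics import mean

scores = [4, 8, 6, 10, 2]  # hypothetical raw scores

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Deviation scores: each raw score minus the mean of the distribution.
m = mean(scores)
deviations = [x - m for x in scores]

print("range:", score_range)  # range: 8
print("mean:", m)
print("deviation scores:", deviations)
```

Note that the deviation scores add up to zero, as stated in the properties of the mean.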

3. The Standard Deviation.

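For a population of scores we have:

    $$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

where mu is the population mean and N is the number of scores.

For a sample we have:

    $$s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}}$$

where X-bar is the sample mean.

Alternative formula for the standard deviation, which avoids computing the deviation scores:

    $$s = \sqrt{\frac{\sum_{i=1}^{N} X_i^2 - \frac{\left(\sum_{i=1}^{N} X_i\right)^2}{N}}{N - 1}}$$

Properties of the standard deviation: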

  1. The standard deviation gives us a measure of dispersion relative to the mean. This differs from the range, which tells us directly the spread of the two most extreme scores.

  2. Like the mean, the standard deviation is sensitive to each score in the distribution. If a score is moved closer to the mean, then the standard deviation will become smaller. If a score shifts away from the mean, then the standard deviation will increase.

  3. Like the mean, the standard deviation is stable with regard to sampling fluctuations.

  4. Both the mean and the standard deviation can be manipulated algebraically. This is an important aspect, as it allows mathematics to be done with them for use in inferential statistics.
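
The definitions and property 2 can be tried out on a small example. This is a sketch under assumptions: the scores are hypothetical, the population form divides the sum of squared deviations by N and the sample form by N - 1 (the usual convention), and the "alternative" form shown is the standard computational shortcut for the sum of squares.

```python
import math
from statistics import mean, pstdev, stdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
n = len(scores)
m = mean(scores)

# Sum of squared deviations about the mean.
ss = sum((x - m) ** 2 for x in scores)

sigma = math.sqrt(ss / n)    # population form: divide by N
s = math.sqrt(ss / (n - 1))  # sample form: divide by N - 1

# Alternative (computational) form of the same sum of squares,
# using only the sum of the scores and the sum of their squares.
ss_alt = sum(x * x for x in scores) - sum(scores) ** 2 / n

# Property 2: move the score 9 closer to the mean (to 6) or farther
# from it (to 12) and watch the standard deviation change.
closer = [2, 4, 4, 4, 5, 5, 7, 6]
farther = [2, 4, 4, 4, 5, 5, 7, 12]

print("population SD:", sigma, " library value:", pstdev(scores))
print("sample SD:", s, " library value:", stdev(scores))
print("sum of squares, both ways:", ss, ss_alt)
print("SD with score moved closer:", pstdev(closer))
print("SD with score moved farther:", pstdev(farther))
```

The hand-computed values agree with `pstdev` and `stdev` from the standard library, and the two ways of computing the sum of squares give the same number.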

Lab notes for tomorrow will be posted early tomorrow morning.


Last updated: October 29, 2001 by Adrian German for A113