To: sbaker@odu.edu
From: Thomas Meyer
tmeyer@ph.vccs.edu
276 656-0283
Patrick Henry Community College
Subject: Statistics - w/ Dr. Spencer Baker - Homework Assignment #1, Ch 3 - Problem 08, page 80.
Date: February 2, 2004
filename: StatHW1Ch3Prob08page80TomMeyer.doc
Given:
Suppose the raw frequency distribution of scores on a measure of generalized
anxiety is as follows:
55,95,43,
74,46,49,93,77,64,59,82,37,48,33,75,64,50,50,
72,64,74,75,50,43,60,75,40,65,79,54,52,55,55,
47,65,92,61,44,50,69,35,40,48,8,66,6,12,33,25,
38,26,40,42,31,26,25,34,33,88,30
(note 1: there are 60 reported scores) N=60
(note 2: the variable Generalized Anxiety Score will be reported on an Interval
scale)
a. Determine the mean, median, and mode from the raw data.
Answer:
from page 29:
1. Find the highest and lowest scores:
Highest = 95;
Lowest = 6.
2. Write down all possible scores from highest to lowest in decreasing
order in a column labeled X.
3. Go through the list of individuals, and make a tally mark next to the
score each person obtained.
4. Count the number of tallies for each score value and write that number
in a column labeled f.
5. Starting at the bottom of the table, fill in the cumulative frequency.
Table for Ch 3, Problem 8 - Raw Frequency Distribution of Scores Reported
| X | Frequency of X (f) | Cumulative Frequency of X (cf) |
| 95 | 1 | 60 |
| 93 | 1 | 59 |
| 92 | 1 | 58 |
| 88 | 1 | 57 |
| 82 | 1 | 56 |
| 79 | 1 | 55 |
| 77 | 1 | 54 |
| 75 | 3 | 53 |
| 74 | 2 | 50 |
| 72 | 1 | 48 |
| 69 | 1 | 47 |
| 66 | 1 | 46 |
| 65 | 2 | 45 |
| 64 | 3 | 43 |
| 61 | 1 | 40 |
| 60 | 1 | 39 |
| 59 | 1 | 38 |
| 55 | 3 | 37 |
| 54 | 1 | 34 |
| 52 | 1 | 33 |
| 50 = mode | 4 (the highest frequency; also the value from the f column in the row containing the median) | 32 (this is the row containing the median: the 30th and 31st scores both lie in this row, and both are 50) |
| 49 | 1 | 28 (the cf value in the row below the row containing the median) |
| 48 | 2 | 27 |
| 47 | 1 | 25 |
| 46 | 1 | 24 |
| 44 | 1 | 23 |
| 43 | 2 | 22 |
| 42 | 1 | 20 |
| 40 | 3 | 19 |
| 38 | 1 | 16 |
| 37 | 1 | 15 |
| 35 | 1 | 14 |
| 34 | 1 | 13 |
| 33 | 3 | 12 |
| 31 | 1 | 9 |
| 30 | 1 | 8 |
| 26 | 2 | 7 |
| 25 | 2 | 5 |
| 12 | 1 | 3 |
| 8 | 1 | 2 |
| 6 | 1 | 1 |
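Steps 1 through 5 above can also be checked by machine. The following is a minimal sketch in Python, not part of the textbook procedure; the function name `frequency_table` is made up for illustration, and the scores are copied from the "Given" list.

```python
from collections import Counter

# The 60 reported scores, copied from the "Given" list.
scores = [
    55, 95, 43,
    74, 46, 49, 93, 77, 64, 59, 82, 37, 48, 33, 75, 64, 50, 50,
    72, 64, 74, 75, 50, 43, 60, 75, 40, 65, 79, 54, 52, 55, 55,
    47, 65, 92, 61, 44, 50, 69, 35, 40, 48, 8, 66, 6, 12, 33, 25,
    38, 26, 40, 42, 31, 26, 25, 34, 33, 88, 30,
]


def frequency_table(values):
    """Return rows (X, f, cf), ordered from the highest score to the lowest."""
    counts = Counter(values)            # tally of each score (steps 3 and 4)
    rows = []
    running = 0
    for x in sorted(counts):            # step 5: accumulate cf from the bottom up
        running += counts[x]
        rows.append((x, counts[x], running))
    return rows[::-1]                   # highest score first, as in the table


table = frequency_table(scores)
for x, f, cf in table:
    print(f"| {x} | {f} | {cf} |")
```

Only scores that actually occur get a row, which is how the table above is laid out.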
The Mode is the most frequently occurring score in a distribution (and is used most often when the data are reported on a nominal scale).
The Mode is:
1. Not representative, in that only observations at that value count.
2. Possibly not unique (although it is in this example).
3. Unstable from one group to another.
4. Not usable in further analysis.
The Mode is also:
1. Representative (in the sense that more observations occurred at that value); and
2. Easy to obtain.
| The mode in this raw frequency distribution is 50. |
The Median is:
- the point which has 50% of the distribution below it.
- the point that cuts the distribution exactly in half.
For discrete scales:
The median can be obtained by counting off cases from the frequency
distribution.
With an odd number of scores, the median is the score received by the middle
person.
With an even number of scores, the median is the average of the scores received
by the two middle persons.
| If a discrete scale were in use, the median for this example is the average of the 30th and 31st scores, or (50+50)/2 =50. |
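As a cross-check on the mode and on the discrete-scale median, here is a short sketch using Python's standard `statistics` module. It is offered only as a verification aid, not as the method from the text; `multimode` needs Python 3.8 or later.

```python
import statistics

# The 60 reported scores, copied from the "Given" list.
scores = [
    55, 95, 43,
    74, 46, 49, 93, 77, 64, 59, 82, 37, 48, 33, 75, 64, 50, 50,
    72, 64, 74, 75, 50, 43, 60, 75, 40, 65, 79, 54, 52, 55, 55,
    47, 65, 92, 61, 44, 50, 69, 35, 40, 48, 8, 66, 6, 12, 33, 25,
    38, 26, 40, 42, 31, 26, 25, 34, 33, 88, 30,
]

# multimode returns every score tied for the highest frequency,
# so a list with one entry means the mode is unique.
modes = statistics.multimode(scores)

# With an even N, median averages the two middle scores (the 30th and 31st).
discrete_median = statistics.median(scores)

print("mode(s):", modes)
print("median (discrete scale):", discrete_median)
```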
For continuous scales (such as in this problem):
The median is the value on the score scale that cuts the distribution exactly in
half.
The lower real limit of 50 is 49.5, and the upper real limit of 50 is 50.5.
The number of reported scores is N = 60.
The lowest score is 6 and the highest is 95, so the score scale runs from 5.5 to 95.5.
(Consider that rounding will include values greater than 5.5 and less than 95.5.)
Formula 3.1: The median = lower real limit of the row containing the median +
{[(1/2 number of scores N) - (the cf value in the row below the row containing the median)] /
(the value from the f column in the row containing the median)} × the interval width
(the interval width is 1 for ungrouped scores)
or in number values, the median = 49.5 + {[(1/2)(60) - 28] / 4} × 1
= 49.5 + {(30 - 28) / 4}
= 49.5 + 2/4
= 49.5 + 0.5 = 50
| For a continuous scale the median = 50 |
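The substitution into Formula 3.1 can be written as a small function. This is a sketch only; the function and parameter names are invented here, and the interval width is taken to be 1 because each raw score is its own interval.

```python
def interpolated_median(lower_real_limit, n, cf_below, f_median, width=1):
    """Formula 3.1: interpolate within the row that contains the median.

    lower_real_limit -- lower real limit of the row containing the median
    n                -- total number of scores
    cf_below         -- cf of the row just below the median row
    f_median         -- f of the median row
    width            -- interval width (1 for ungrouped scores)
    """
    return lower_real_limit + ((n / 2 - cf_below) / f_median) * width


# Values read from the raw frequency table: the median row is X = 50.
median = interpolated_median(lower_real_limit=49.5, n=60, cf_below=28, f_median=4)
print(median)
```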
The mean is a numerator divided by a denominator.
The numerator of the mean is the sum of the products of each score or
X-value in the table above multiplied by each frequency.
Thus, starting at the bottom of the table: 6×1 + 8×1 + 12×1 + 25×2 + 26×2 + 30×1 + ... = 3121.
The denominator of the mean is the number of scores, or N or 60.
| So the mean is 3121/60 = 52.02 |
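The numerator and denominator can be confirmed directly from the (X, f) columns. The sketch below assumes the pairs are copied correctly from the raw frequency table; the variable names are illustrative.

```python
# (score X, frequency f) pairs copied from the raw frequency table, top to bottom.
frequency_rows = [
    (95, 1), (93, 1), (92, 1), (88, 1), (82, 1), (79, 1), (77, 1), (75, 3),
    (74, 2), (72, 1), (69, 1), (66, 1), (65, 2), (64, 3), (61, 1), (60, 1),
    (59, 1), (55, 3), (54, 1), (52, 1), (50, 4), (49, 1), (48, 2), (47, 1),
    (46, 1), (44, 1), (43, 2), (42, 1), (40, 3), (38, 1), (37, 1), (35, 1),
    (34, 1), (33, 3), (31, 1), (30, 1), (26, 2), (25, 2), (12, 1), (8, 1),
    (6, 1),
]

n = sum(f for _, f in frequency_rows)            # denominator: number of scores
total = sum(x * f for x, f in frequency_rows)    # numerator: sum of X times f
mean = total / n

print(total, n, round(mean, 2))
```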
b. Prepare a grouped frequency distribution using about 10 intervals.
from text page 32
1. Have each interval contain an odd number of score values.
2. Use between 10 and 20 intervals.
3. Make all intervals the same width.
4. Intervals should not contain impossible values.
5. Make the interval midpoints or one of the score limits a multiple of the interval width.
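To do this, subtract the lowest x-value from the highest x-value: 95 - 6 = 89.
Divide the 89 by 10 and by 20.
89/10 = 8.9 and 89/20 = 4.45, so the interval width should fall between about 5 and 9 scores.
Use 11 intervals (it lies between 10 and 20).
89/11 = approx. 8; since each interval should contain an odd number of score values, use 9 as the interval width.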
Where to start the interval and how to determine its midpoint remain a
problem in my mind!!
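One possible way to settle the starting point and the midpoint, offered as a sketch rather than as the textbook's rule: read rule 5 as "make the lowest score limit a multiple of the interval width", round the width up to an odd number per rule 1, and place the midpoint halfway between the two score limits. The function name and its return layout are assumptions made for illustration.

```python
import math


def suggest_intervals(lowest, highest, target_count):
    """Suggest (width, start, first_midpoint, interval_count).

    Assumes rule 1 (odd width) and one reading of rule 5
    (the lowest score limit is a multiple of the width).
    """
    score_range = highest - lowest
    width = math.ceil(score_range / target_count)
    if width % 2 == 0:
        width += 1                        # rule 1: odd number of score values
    start = (lowest // width) * width     # largest multiple of width not above lowest
    first_midpoint = start + (width - 1) / 2
    interval_count = (highest - start) // width + 1
    return width, start, first_midpoint, interval_count


print(suggest_intervals(6, 95, 10))   # about 10 intervals
print(suggest_intervals(6, 95, 20))   # about 20 intervals
```

Under these assumptions the first interval for about 10 intervals starts at 0 with midpoint 4, and for about 20 intervals it starts at 5 with midpoint 7.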
| Interval | Midpoint | f | cf |
| 90-98 | 94 | 3 | 60 |
| 81-89 | 85 | 2 | 57 |
| 72-80 | 76 | 8 | 55 |
| 63-71 | 67 | 7 | 47 |
| 54-62 | 58 | 7 | 40 |
| 45-53 | 49 | 10 | 33 |
| 36-44 | 40 | 9 | 23 |
| 27-35 | 31 | 7 | 14 |
| 18-26 | 22 | 4 | 7 |
| 9-17 | 13 | 1 | 3 |
| 0-8 | 4 | 2 | 2 |
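The f and cf columns of a grouped table can be reproduced from the raw scores. This is a minimal sketch assuming intervals of width 9 that start at 0; `grouped_table` is a hypothetical helper, not something from the text.

```python
# The 60 reported scores, copied from the "Given" list.
scores = [
    55, 95, 43,
    74, 46, 49, 93, 77, 64, 59, 82, 37, 48, 33, 75, 64, 50, 50,
    72, 64, 74, 75, 50, 43, 60, 75, 40, 65, 79, 54, 52, 55, 55,
    47, 65, 92, 61, 44, 50, 69, 35, 40, 48, 8, 66, 6, 12, 33, 25,
    38, 26, 40, 42, 31, 26, 25, 34, 33, 88, 30,
]


def grouped_table(values, width, start):
    """Return rows (low, high, midpoint, f, cf), highest interval first."""
    top = max(values)
    rows = []
    running = 0
    low = start
    while low <= top:
        high = low + width - 1
        f = sum(low <= v <= high for v in values)   # scores inside this interval
        running += f                                # cf accumulates from the bottom
        rows.append((low, high, (low + high) / 2, f, running))
        low += width
    return rows[::-1]


rows = grouped_table(scores, width=9, start=0)
for low, high, midpoint, f, cf in rows:
    print(f"| {low}-{high} | {midpoint:g} | {f} | {cf} |")
```

Calling the same helper with `width=5, start=5` gives a 19-interval grouping of the kind asked for in Part d.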
c. Find the mean, median, and mode for the grouped data in Part b above.
The mode is the midpoint of the interval with the most scores, so the 45-53 row above, which contains 10 scores, contains the mode.
| The midpoint of the interval with the most scores is 49; so the mode is 49. |
Use the following data to compute the median on grouped data:
The 30th and 31st scores fall in the 45-53 row (here the same row that contains the mode); its lower real limit is 44.5.
N = 60; 1/2 N = 30
The cf in the row below the median row is 23.
The f in the median row is 10.
Every interval, such as 9-17, spans the real limits 8.5 to 17.5, so the interval width is 9.
Plugging these into formula 3.1 for the median yields:
Median = 44.5 + adjustment
where adjustment = [(1/2 N - the cf in the row below the median row) / (the f in the median row)] × the interval width
so adjustment = [(30 - 23) / 10] × 9
so adjustment = (7/10) × 9 = 63/10 = 6.3
| so the median = 44.5 + 6.3 = 50.8 |
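The row that holds the median does not have to be picked by eye. The sketch below walks up the cumulative frequencies until it reaches half of N and then applies formula 3.1. It works from the Midpoint and f columns only and recovers the lower real limit as midpoint minus half the width; `grouped_median` is an illustrative name.

```python
def grouped_median(rows, width):
    """rows: (midpoint, f) pairs ordered from the lowest interval to the highest."""
    n = sum(f for _, f in rows)
    half = n / 2
    cf_below = 0
    for midpoint, f in rows:
        # The median row is the first row whose cf reaches half of N.
        if f > 0 and cf_below + f >= half:
            lower_real_limit = midpoint - width / 2
            return lower_real_limit + (half - cf_below) / f * width
        cf_below += f
    raise ValueError("no scores in the distribution")


# (Midpoint, f) columns of the Part b table, read from the bottom row up.
part_b = [(4, 2), (13, 1), (22, 4), (31, 7), (40, 9), (49, 10),
          (58, 7), (67, 7), (76, 8), (85, 2), (94, 3)]

median_b = grouped_median(part_b, width=9)
print(round(median_b, 2))
```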
Use the following data to compute the mean on grouped data:
The mean is a numerator divided by a denominator.
The numerator is the sum of each of the frequencies (f) multiplied by its
respective midpoint in the table above.
Thus, starting at the bottom of the table: 2×4 + 1×13 + 4×22 + 7×31 + 9×40 + ... = 3111.
The denominator is the sum of the frequencies, or N or 60.
| so the mean of the grouped data is 3111/60 = 51.85 |
d. Prepare a grouped frequency distribution using about 20 intervals.
from text page 32
1. Have each interval contain an odd number of score values.
2. Use between 10 and 20 intervals.
3. Make all intervals the same width.
4. Intervals should not contain impossible values.
5. Make the interval midpoints or one of the score limits a multiple of the interval width.
To do this, subtract the lowest x-value from the highest x-value: 95 - 6 = 89.
Divide the 89 by 20.
89/20 = approximately 4 scores per interval (but we want an odd rather than an even number of scores per interval). Therefore divide 89 by 19 = approximately 5 scores per interval.
Use 19 intervals (it lies between 10 and 20).
89/19 = approx. 4.68, which rounds up to an interval width of 5.
Using a lower score limit equal to the interval width, we can start with the interval 5-9, whose real limits are 4.5 and 9.5, giving an interval width of 9.5 - 4.5 = 5.
Where to start the interval and how to determine its midpoint remain a
problem in my mind!!
| Interval | Midpoint | f | cf |
| 95-99 | 97 | 1 | 60 |
| 90-94 | 92 | 2 | 59 |
| 85-89 | 87 | 1 | 57 |
| 80-84 | 82 | 1 | 56 |
| 75-79 | 77 | 5 | 55 |
| 70-74 | 72 | 3 | 50 |
| 65-69 | 67 | 4 | 47 |
| 60-64 | 62 | 5 | 43 |
| 55-59 | 57 | 4 | 38 |
| 50-54 | 52 | 6 | 34 |
| 45-49 | 47 | 5 | 28 |
| 40-44 | 42 | 7 | 23 |
| 35-39 | 37 | 3 | 16 |
| 30-34 | 32 | 6 | 13 |
| 25-29 | 27 | 4 | 7 |
| 20-24 | 22 | 0 | 3 |
| 15-19 | 17 | 0 | 3 |
| 10-14 | 12 | 1 | 3 |
| 5-9 | 7 | 2 | 2 |
e. Find the mean, median, and mode for the grouped data in Part d above.
The mode is the midpoint of the interval with the most scores, so the 40-44 row above, which contains 7 scores, contains the mode.
| The midpoint of the interval with the most scores is 42; so the mode is 42. |
Use the following data to compute the median on grouped data:
The row containing the 30th and 31st scores is the 50-54 row, and its lower real limit is 49.5.
N = 60; 1/2 N = 30
The cf in the row below the 50-54 row is 28.
The f in the 50-54 row is 6.
Every interval, such as 5-9, spans the real limits 4.5 to 9.5, so the interval width is 5.
Plugging these into formula 3.1 for the median yields:
Median = 49.5 + adjustment
where adjustment = [(1/2 N - the cf in the row below the median row) / (the f in the median row)] × the interval width
so adjustment = [(30 - 28) / 6] × 5
so adjustment = (2/6) × 5 = 10/6 = 1.667
| so the median = 49.5 + 1.667 = 51.17 |
Use the following data to compute the mean on grouped data:
The mean is a numerator divided by a denominator.
The numerator is the sum of each of the frequencies (f) multiplied by its
respective midpoint in the table above.
Thus, starting at the bottom of the table: 2×7 + 1×12 + 0×17 + 0×22 + 4×27 + ... = 3135.
The denominator is the sum of the frequencies, or N or 60.
| so the mean of the grouped data is 3135/60 = 52.25 |
f. What differences exist in the mean, mode, and median for parts a, c, and e? Account for any differences that exist in each measure.
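With the raw data, the mean, median, and mode are computed from the actual scores, so they are not affected by interval size.
With the grouped data, the scores have been put into intervals of different sizes. Every score in an interval is then treated as if it fell at the interval midpoint (for the mean and the mode) or as if the scores were spread evenly across the interval (for the median), which is why the grouped values differ from the raw values.
The smaller the interval width, the closer the mean for grouped data should, in general, approximate the mean for raw data. In this example, however, the mean for the width-9 intervals (51.85) is slightly closer to the raw mean (52.02) than the mean for the width-5 intervals (52.25), so the tendency does not hold for every data set.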
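To lay the three sets of results side by side, the sketch below recomputes the mean, median, and mode from the raw scores for each grouping. It assumes width-9 intervals starting at 0 and width-5 intervals starting at 5, and it treats the raw data as intervals of width 1. The helper name `summarize` is invented for illustration.

```python
from collections import Counter

# The 60 reported scores, copied from the "Given" list.
scores = [
    55, 95, 43,
    74, 46, 49, 93, 77, 64, 59, 82, 37, 48, 33, 75, 64, 50, 50,
    72, 64, 74, 75, 50, 43, 60, 75, 40, 65, 79, 54, 52, 55, 55,
    47, 65, 92, 61, 44, 50, 69, 35, 40, 48, 8, 66, 6, 12, 33, 25,
    38, 26, 40, 42, 31, 26, 25, 34, 33, 88, 30,
]


def summarize(values, width, start):
    """Return (mean, median, mode) after grouping into equal-width intervals.

    width=1 makes every score its own interval, which reproduces the raw results.
    """
    n = len(values)
    # Frequency of each interval, keyed by the interval's lowest score.
    counts = Counter(start + ((v - start) // width) * width for v in values)
    lows = sorted(counts)

    def midpoint(low):
        return low + (width - 1) / 2

    mean = sum(midpoint(low) * counts[low] for low in lows) / n
    mode = midpoint(max(lows, key=lambda low: counts[low]))

    median = None
    cf_below = 0
    for low in lows:
        f = counts[low]
        if cf_below + f >= n / 2:          # first interval whose cf reaches N/2
            median = (low - 0.5) + (n / 2 - cf_below) / f * width
            break
        cf_below += f
    return mean, median, mode


for label, width, start in [("raw (a)", 1, 0), ("width 9 (c)", 9, 0), ("width 5 (e)", 5, 5)]:
    mean, median, mode = summarize(scores, width, start)
    print(f"{label}: mean={mean:.2f} median={median:.2f} mode={mode:g}")
```

If several intervals tied for the highest frequency, this sketch would report only the lowest of them as the mode; that case does not arise with these scores.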
Tom Meyer
Thomas Meyer