We will now examine an analytic variable (RDABIL) in the Literacy work file that holds the IRT reading ability score of each respondent. In the previous exercises, the dependent variable was a categorical variable indicating four levels of reading ability. This classification was based on cut points along the continuum of raw reading ability scores, that is, the scores in RDABIL. The actual raw scores provide more information than the categorical representation of reading ability. Rather than counting the number of respondents in each category, research can now focus on attributes of the raw performance scores. In particular, we are initially interested in the typical value or values along the scale of reading ability and also the spread or variance of these scores. Examining typical values and the spread of the distribution permits a summarization of this measurement of reading ability.
| Total Weighted N = | . |
| How does this weighted N compare with the weighted N in the previous exercises? | . |
| Number of Missing Cases = | . |
| Minimum Value = | . |
| Confidence interval 95% min. = | . |
| Mean = | . |
| Confidence interval 95% max. = | . |
| Maximum Value = | . |
| Range of the distribution = | . |
| Standard Deviation = | . |
These summary figures provide two numeric measures of spread, namely, the range and standard deviation, and a "typical value" for the distribution, namely, a mean (also known as the average). Notice that SPSS also reports a 95% confidence interval about the mean.
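To make these summary figures concrete, the following is a minimal sketch in Python (not SPSS) of how a weighted N, mean, range, standard deviation, and an approximate 95% confidence interval could be computed. The function name `weighted_summary`, the scores, and the weights are all hypothetical. The sketch assumes each weight counts as that many cases, and it uses the normal value 1.96 for the interval, so a statistics package that uses a t critical value or a survey design correction may report slightly different limits.

```python
import math

def weighted_summary(scores, weights):
    """Summarize scores where each weight counts as that many cases."""
    n = sum(weights)  # total weighted N
    mean = sum(w * x for x, w in zip(scores, weights)) / n
    # Sample variance with weights treated as frequencies (divisor n - 1).
    variance = sum(w * (x - mean) ** 2 for x, w in zip(scores, weights)) / (n - 1)
    sd = math.sqrt(variance)
    # Half-width of the interval, assuming a normal approximation (1.96).
    half_width = 1.96 * sd / math.sqrt(n)
    return {
        "weighted_n": n,
        "minimum": min(scores),
        "maximum": max(scores),
        "range": max(scores) - min(scores),
        "mean": mean,
        "sd": sd,
        "ci95": (mean - half_width, mean + half_width),
    }

# Hypothetical scores and weights, for illustration only.
summary = weighted_summary([200, 250, 300], [1, 2, 1])
print(summary)
```

With these made-up values the weighted N is 4, the mean is 250, and the range is 100; the confidence interval is centered on the mean.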
| 25th Percentile (1st Quartile) = | . |
| 50th Percentile (Median) = | . |
| 75th Percentile (3rd Quartile) = | . |
| Inter-quartile range = | . |
The inter-quartile range is reported in the descriptive statistics and can also be calculated quickly by subtracting the 1st quartile value from the 3rd quartile value. The inter-quartile range contains the middle 50% of the data, with another 25% above the third quartile and the remaining 25% below the first quartile. This provides another numeric summary of the spread for this distribution.
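The subtraction described above can be sketched in a few lines of Python. The scores are hypothetical and unweighted, and the quartiles come from the standard library's `statistics.quantiles` with its default method. Statistical packages use several different percentile definitions, so quartiles from SPSS may differ slightly from these on the same data.

```python
from statistics import quantiles

# Hypothetical reading ability scores, for illustration only.
scores = [210, 225, 240, 255, 270, 285, 300]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(scores, n=4)

# Inter-quartile range: 3rd quartile minus 1st quartile.
iqr = q3 - q1

print("1st quartile:", q1)
print("Median:", q2)
print("3rd quartile:", q3)
print("Inter-quartile range:", iqr)
```

For these seven values the quartiles are 225, 255, and 285, giving an inter-quartile range of 60.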
| Compare the mean and median for this distribution. Which is larger, the mean or the median? | . |
When the distribution is symmetric, the mean and median are identical. However, with asymmetric distributions, the mean is influenced by the longer tail of the distribution, that is, the mean is pulled toward the longer tail. This is more easily understood by examining a graphical representation of the distribution. The next set of commands will produce a histogram, that is, a graph, of the IRT reading ability scores.
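As a brief aside before the histogram steps, the pull of the mean toward the longer tail can be illustrated with a few made-up scores. The sketch below is in Python rather than SPSS, and both lists are hypothetical, not values from the Literacy work file.

```python
from statistics import mean, median

# Hypothetical symmetric scores: evenly spaced around 260.
symmetric = [220, 240, 260, 280, 300]

# Same scores, but the lowest value is moved far to the left,
# creating a longer lower tail.
left_tailed = [120, 240, 260, 280, 300]

symmetric_mean = mean(symmetric)
symmetric_median = median(symmetric)
tailed_mean = mean(left_tailed)
tailed_median = median(left_tailed)

print("Symmetric:   mean =", symmetric_mean, " median =", symmetric_median)
print("Left-tailed: mean =", tailed_mean, " median =", tailed_median)
```

In the symmetric list the mean and median are both 260. In the second list the median stays at 260 while the mean drops to 240, that is, the mean moves toward the longer tail and the median does not.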
The output from the histogram command needs some modification. Follow the next set of steps to improve this display.
From the menu bar, select Chart and then Inner Frame, which should turn this feature off (i.e., we do not want an inner frame displayed).
Upon completing the above steps, the histogram on your monitor should appear as the graph shown below.
| How many primary spikes or peaks exist to the right of the score 150? | . |
| What explanations might exist for these spikes? | . |
| Are there any gaps in this distribution? | . |
| Where does the largest gap occur? | . |
| Is the mean located at the top of a peak in the distribution? | . |
| Is the median located at the top of a peak in the distribution? | . |
| Both the mean and the median are used as "typical values" for a distribution. How well do you feel they represent the distribution of IRT reading ability scores? | . |
| The theoretical curve for a normal distribution with a mean of 260 and a standard deviation of 43.8 has been superimposed on this histogram. Where does the mean intersect the curve representing a normal distribution? | . |
| How well does the normal curve fit the distribution of the IRT reading ability scores? | . |
| Looking at the frequency polygon for the IRT reading ability scores, what summary statements can be made? | . |
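When judging how well the normal curve fits, it can help to know what share of scores the curve itself predicts for a given interval, and then compare that with the bars of the histogram. The following is a rough sketch in Python using the mean of 260 and standard deviation of 43.8 quoted above. The variable names and the choice of intervals are assumptions made for illustration.

```python
from statistics import NormalDist

# Theoretical normal curve with the mean and standard deviation
# quoted for the superimposed curve.
curve = NormalDist(mu=260, sigma=43.8)

# Expected share of scores within one standard deviation of the mean.
within_one_sd = curve.cdf(260 + 43.8) - curve.cdf(260 - 43.8)

# Expected share of scores below 150 under the normal curve.
below_150 = curve.cdf(150)

print("Expected share within one SD of the mean:", round(within_one_sd, 4))
print("Expected share below 150:", round(below_150, 4))
```

Under a normal curve roughly 68% of scores fall within one standard deviation of the mean. If the histogram shows noticeably more or fewer cases than the curve predicts in an interval, that is one sign the normal curve fits poorly there.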