A large study has compared the outcomes of children who've attended private schools to those who've attended public schools. A journalist summarized the report in the Washington Post. The study provides a nice example of how multivariate regression can be used to test third variable hypotheses.
When we look simply at the educational achievement of students in private schools vs. public schools, private school students have higher achievement scores. However, all such studies are correlational, because the two variables--Type of School and Level of Achievement--are measured rather than manipulated.
Therefore, such studies show covariance, because the results depict a relationship. The study may even show temporal precedence, because attending school presumably precedes the measure of achievement. However, such studies are weak on internal validity. We can think of several alternative explanations for why children in private schools are scoring higher.
One major alternative explanation is socioeconomic status. Children from wealthier families are more likely to afford private schools. And in general, children from wealthier families tend to score higher on achievement tests.
The Washington Post journalist quoted one of the study's authors, Robert Pianta, who summed up the study's results this way:
“You only need to control for family income and there’s no advantage,” Pianta said in an interview. “So when you first look, without controlling for anything, the kids who go to private schools are far and away outperforming the public school kids. And as soon as you control for family income and parents’ education level, that difference is eliminated completely.”
Questions
a) Draw little diagrams similar to those in Figure 8.15 (in the 3rd ed.) to depict the arguments being made in this study. What would A be? What about B? In the quote from Pianta, above, what would the C variable(s) be?
b) The researchers used type of school (private vs. public), which is a categorical variable. But in some analyses, the researchers also used "number of years in private school" as an alternative version of this variable. Is "number of years in private school" categorical, ordinal, interval, or ratio data?
c) Sketch a mock-up regression table with the criterion variable at the top and predictors below (Use Table 9.1 as a model). Which variable do you think the researchers selected as the criterion (dependent) variable in their analyses? Which variable(s) would have been the predictors?
d) Now that you know what the results were, think about how the beta associated with "number of years in private school" would change when parental SES is added and removed from the regression analyses.
The original journal article seems to be open-access. Give it a try!
Suggested answers
a) A and B would be Type of School and Level of Achievement. It doesn't really matter which one is called A and which one is called B. C would be Family Income and/or Parental Education.
b) Ratio data. Zero is meaningful on this scale because a child could attend zero years of private school.
c) The criterion variable would be Achievement, and the predictors would be Number of Years of Private School, Family Income, and Parental Education.
d) When Number of Years of Private School is on the table (in the analysis) by itself, its beta is likely to be positive and significant (more years of private school goes with higher achievement). When Family Income and Parental Education are added to the table, the beta for Number of Years of Private School should drop to about zero and no longer be significant. This pattern of results is consistent with the argument that Family Income and Parental Education are the alternative explanation for the original relationship.
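To see the pattern in answer (d) in action, here is a minimal sketch in Python that uses simulated data, not the study's data. Everything in it is an assumption made for illustration: the sample size, the variable names, and the made-up weights. The simulation is built so that family income and parental education influence both years of private school and achievement, while years of private school has no effect of its own on achievement.

```python
import numpy as np

def standardize(x):
    # Convert a variable to z-scores so regression weights are standardized betas.
    return (x - x.mean(axis=0)) / x.std(axis=0)

def standardized_betas(criterion, *predictors):
    # Ordinary least squares on z-scored variables; returns one beta per predictor.
    X = np.column_stack([standardize(p) for p in predictors])
    X = np.column_stack([np.ones(len(criterion)), X])  # intercept column
    coefs, *_ = np.linalg.lstsq(X, standardize(criterion), rcond=None)
    return coefs[1:]  # drop the intercept

rng = np.random.default_rng(0)
n = 20000  # hypothetical sample size

# Hypothetical third variables (C): family income and parental education.
income = rng.normal(size=n)
parent_edu = 0.6 * income + 0.8 * rng.normal(size=n)

# Years of private school (0 to 12) depends on the C variables plus noise.
years_private = np.clip(
    np.round(4 + 2 * income + 1.5 * parent_edu + 2 * rng.normal(size=n)), 0, 12
)

# Achievement depends on the C variables plus noise, NOT on years_private.
achievement = 0.5 * income + 0.4 * parent_edu + rng.normal(size=n)

# Beta for years of private school when it is the only predictor.
beta_alone = standardized_betas(achievement, years_private)[0]

# Beta for years of private school when income and education are controlled.
beta_controlled = standardized_betas(
    achievement, years_private, income, parent_edu
)[0]

print(f"beta, years in private school alone: {beta_alone:.2f}")
print(f"beta, controlling for income and education: {beta_controlled:.2f}")
```

In this simulated world, the first beta comes out clearly positive and the second sits close to zero, which is the same pattern Pianta describes. The sketch only shows that a third-variable account can produce that pattern. It does not show that the account is true of the real data.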