Commentary

Critique of CAP Report Card Fires Blanks

Report card still valuable despite list of complaints from MSU professor

The National Education Policy Center took aim at the Mackinac Center's latest Context and Performance Report Card with a review by Michigan State University Professor John Yun.

In January, the Center released the fourth edition of the high school CAP Report Card and the seventh edition overall (including three elementary and middle school report cards). The CAP reports' methods, which rely entirely on publicly available state data, are unique in that they take into consideration the socioeconomic differences of the schools they grade. Other organizations in Michigan have copied this approach and produced similar report cards, and numerous school officials have praised the product over the years.

Yun's NEPC review, which describes the report's goal as laudable and its presentation as being in "easily readable form," is nevertheless extremely critical of the report card. Yun writes that it should be “given no weight in any discussion of policy or practice” and ultimately “does a disservice.” The review highlights many limitations of the state data the report card relies on and, in the process, sets a very high standard for meaningfully measuring school performance. If researchers had to meet the standards Yun implies, it would be fair to ask whether any credible means of measuring school performance exist at all.

The CAP Report Cards rely heavily on two available state data sets: school-level average scale scores from state standardized tests and reported student eligibility for the federal school lunch program.

The NEPC review assails both data sets. Yun says the Michigan Department of Education's decision not to release the 2018 M-STEP science test, because of concerns about its ability to accurately measure achievement, creates a "red flag" for using any M-STEP test results. The state's shifting testing regime, Yun writes, also creates a problem. Because our analysis showed a strong but not perfect fit between the Michigan Merit Examination results and the M-STEP and SAT results, the review concludes that they "may not be suitable substitutes for one another." This critique could hold no matter how strong the statistical fit, so it comes down to a simple judgment call. We think the strong statistical relationship between the MME and M-STEP tests makes it acceptable to use both in ranking schools.

Similarly, Yun dismissed the report card’s comparison of a school’s change in CAP Scores over time, because the tests used to generate those scores were not identical. If we were limited to using only identical tests to assess school performance over time, we’d be left with nothing to do, seeing as state bureaucrats and politicians constantly tweak these tests. Further, it’s not clear to us why the underlying test scores would need to be identical when they are not what is actually being compared: annual CAP Scores are normalized and always relative to other schools in a given year.

The CAP Report Card does not hide the fact that it averages these different test scores together. It alerts readers to this fact and acknowledges the limitations of the approach. We do not claim that the CAP Report Card is the only correct way to measure school performance; it is but one approach. None of that, however, is good enough for the NEPC review.

The review also criticized the use of free-lunch eligibility data collected from the state to measure student poverty. We have repeatedly remarked on the limitations of these data, but the fact remains that there is a strong statistical relationship between free-lunch eligibility and student achievement at the school level. Yun does not offer an alternative way to control for socioeconomic status at the school level with publicly available data from the state of Michigan. We accept these limitations of the data and remind readers that free-lunch eligibility “is the best available proxy for student socioeconomic status.”
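To make the underlying idea concrete, the sketch below shows, in purely illustrative terms, how a single poverty proxy can adjust school scores: it regresses school-average test scores on a free-lunch eligibility rate and treats each school's normalized residual as a poverty-adjusted performance measure. The data are synthetic and the two-step residual approach is our assumption for illustration, not the Mackinac Center's actual model.

```python
import numpy as np

# Hypothetical illustration -- synthetic data, not the report card's model.
# Each school has an average test score and a free-lunch eligibility rate.
rng = np.random.default_rng(0)
n_schools = 200
free_lunch = rng.uniform(0.0, 1.0, n_schools)  # share of students eligible
# Simulate the well-documented negative relationship, plus noise.
scores = 80.0 - 25.0 * free_lunch + rng.normal(0.0, 5.0, n_schools)

# Regress scores on free-lunch eligibility (ordinary least squares).
X = np.column_stack([np.ones(n_schools), free_lunch])
beta, *_ = np.linalg.lstsq(X, scores, rcond=None)
predicted = X @ beta

# A school's residual (actual minus predicted score) reflects performance
# after adjusting for poverty; normalizing puts all schools on one scale,
# relative only to other schools in the same year.
residuals = scores - predicted
adjusted_score = (residuals - residuals.mean()) / residuals.std()
```

A school that outscores the level predicted by its poverty rate gets a positive adjusted score, regardless of its raw average; that is the sense in which the rankings are relative within a single year.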

The CAP Report Card’s statistical model relies heavily on this free-lunch eligibility variable. Yun’s review claims that this is insufficient and criticizes the methodology for not including more variables. He posits that our model would be improved if we used data about a school’s reduced-price-lunch eligibility, racial composition, English language learners, percent of special education students and more. Because the model doesn’t include these variables, Yun declares the results "questionable." He asserts that their inclusion “could easily result in a large change in this index.”

What Yun apparently did not realize is that previous editions of this report card tested these same variables (and more) and opted not to include them in the model for very good reasons that are explained in these reports. When testing the impact of adding these other variables, we found that “the improvements in predictive power [of the model] are marginal, and including the additional variables would have only increased the model’s complexity for little measurable gain.” In other words, we opted against complexity for complexity’s sake. The bottom line is that even if we had followed Yun’s advice and added these variables into the model, the results of the report card would not have been significantly different, so his concern that there’s an “omitted variable bias” in the results is not as troublesome as it is made out to be.
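The "marginal gain" argument above can be examined with a toy comparison. The sketch below fits one regression using only a free-lunch proxy and another that adds extra demographic stand-ins, then compares their R-squared values. The data, variable names and effect sizes are our assumptions for illustration, not the report card's actual regressions; the point is simply that when added variables carry little independent signal, the improvement in fit is small.

```python
import numpy as np

# Hypothetical sketch of the "marginal predictive gain" argument.
rng = np.random.default_rng(1)
n = 300
free_lunch = rng.uniform(0.0, 1.0, n)
# Stand-ins for e.g. reduced-price-lunch share, ELL share, special ed share.
extra = rng.uniform(0.0, 1.0, (n, 3))
# Outcome driven almost entirely by the poverty proxy; the extra
# variables contribute only a small amount of independent signal.
scores = 80.0 - 25.0 * free_lunch + 0.5 * extra[:, 0] + rng.normal(0.0, 5.0, n)

def r_squared(predictors, y):
    """R-squared of an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

r2_simple = r_squared(free_lunch, scores)
r2_full = r_squared(np.column_stack([free_lunch, extra]), scores)
gain = r2_full - r2_simple  # the richer model fits only marginally better
```

Adding regressors can never lower R-squared in a nested OLS comparison, so the question is always whether the gain justifies the added complexity; in this toy setup it does not, which mirrors the trade-off the report card describes.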

To be fair, the latest edition of this report card does not explicitly mention this discussion from previous editions. That’s because we don’t think it’s necessary to describe in each edition the different iterations of the model that we have tested over the years. On the other hand, had Yun reached out to us or more thoroughly reviewed previous editions of the report cards, this misunderstanding might have been avoided.

We stand by the decision not to include those variables and in a previous edition of this report card articulated why we think the analysis is still of use: “Moreover, CAP Scores do not adjust for every factor that may lie outside a school’s control. Nevertheless, by accounting for a major factor that is substantially related to student achievement and outside a school’s control, the CAP Report Card improves the quality of information publicly available to Michigan parents, taxpayers, school officials and policymakers.”

A large portion of the NEPC review dings the CAP Report Card for not discussing certain issues. In many of these instances, Yun is essentially faulting the report card for not being something it was never intended to be. Further, he does not explain why including those discussions is necessary for the purpose of the report. Here are a few examples of these types of criticisms and explanations for why they are off base:

The Mackinac Center welcomes critical reviews of its work; they help make our research better. Unfortunately, this recently published NEPC review fails to offer a substantive critique that could be used to improve the CAP Report Card. Most of its criticisms concern issues we have no control over, methodological choices we have already considered and explained in previous editions, or expectations that the report be something it was never intended to be. As it is, we’ll continue developing these report cards, which help provide a fuller picture of relative school performance in Michigan.