The whole point of the article I quoted in the OP, and the reason I found it so interesting, was that unless you remove "money spent" and account correctly for heterogeneity, the "national scorecard" interpretations by many (most?) of the major reviewers are just plain wrong.
From the OP link (emphasis mine):
This is starkly illustrated by comparing Texas and Iowa. According to U.S. News and World Report, Texas, which ranks 33rd, is far surpassed in educational quality by Iowa, which ranks eighth. When only the test scores are examined at an aggregate level, the ranks shift somewhat but their relative positions don't: Texas moves to 35th and Iowa to 17th. But when we disaggregate student performance scores by racial categories (white, black, Hispanic, and Asian), the rankings change dramatically.
By looking at test scores for students in fourth and eighth grade in math, reading, and science, and by separating students by racial category, we get 24 different possible bases of comparison. This allows us to measure how well states do for each specific student type—Asian fourth-grade math students, for instance. (We have adjusted our rankings to compensate for the fact that not all states report scores for every student group.) Giving each type equal weight, Texas comes in fifth and Iowa 31st—a remarkable reversal.
Iowa, it turns out, falls so far because it does a below-average job of educating white students (30th in the country), black students (36th), and Asian students (40th), although it is slightly above average with Hispanic students (20th). Because Iowa has a disproportionately large share of white students, who as a group score higher than blacks and Hispanics, rankings that use aggregated test scores place Iowa's education system as above average and superior to that of Texas. Yet Texas students score higher than Iowa students in all but one of the 20 possible bases of comparison between these two states.
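To make the mechanism in that quote concrete, here is a minimal sketch of my own (not from the article) of how a state can win every subgroup comparison and still lose on the aggregate score. All numbers, the names "State A" and "State B", and the group labels are invented purely for illustration; they are not real test scores or real demographics. The only thing it is meant to show is the weighting effect the article describes: an aggregate score weights each group by its share of students, while the article's approach gives each group equal weight.

```python
# Hypothetical average scores per student group. Invented numbers,
# not real test data. State A scores higher in BOTH groups.
scores = {
    "State A": {"group_1": 250.0, "group_2": 225.0},
    "State B": {"group_1": 245.0, "group_2": 220.0},
}

# Hypothetical share of each group in the state's student population.
# State B has a much larger share of the higher-scoring group.
shares = {
    "State A": {"group_1": 0.40, "group_2": 0.60},
    "State B": {"group_1": 0.90, "group_2": 0.10},
}


def aggregate_score(state):
    """Population-weighted average: what an aggregated ranking sees."""
    return sum(scores[state][g] * shares[state][g] for g in scores[state])


def equal_weight_score(state):
    """Simple mean over groups: each student type counts the same."""
    values = list(scores[state].values())
    return sum(values) / len(values)


for state in scores:
    print(state,
          "aggregate:", round(aggregate_score(state), 1),
          "equal-weight:", round(equal_weight_score(state), 1))
```

With these made-up inputs, State B comes out ahead on the aggregate score even though State A is ahead in every group, and the order flips once each group is weighted equally. That is the same kind of reversal the article reports for Texas and Iowa, although the article ranks states within each of its bases of comparison rather than averaging raw scores as this toy version does.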