How to Fix the SAT

By Ben Orlin

Earlier this year, the College Board announced sweeping changes to the SAT. New vocab, less obscure. New essay, now optional. A repeal of the “anti-guessing” policy. A retooled and refocused math section. The board stopped just short of switching to lavender-scented paper and offering 10 p.m. to 2 a.m. testing sessions for the benefit of morning-averse 11th graders.

But the College Board omitted the one reform I’d most favor. It’s a simple change that could help undercut the behemoth test prep industry and defuse some of the anxiety surrounding the test, and it wouldn’t require altering a single question. They just need to reduce the number of scores.

I don’t mean “give out lower scores.” I mean “give out fewer scores.” Common sense and hard data dictate that there is no meaningful distinction between a 710 and a 720, or between a 560 and a 580. The SAT should embrace reality and stop assigning different scores to virtually identical performances.

Decades ago, the SAT gave scores down to the point. George W. Bush allegedly scored a 1206. Bill O’Reilly landed a 1585. Eventually, the College Board decided that these pinpoint figures gave a deceptive sense of precision to their decidedly imprecise test. So they started delivering scores in increments of 10, a practice they continue today.

But these increments are still too small. By the College Board’s own numbers, a student’s section score will fluctuate by an average of 20 to 30 points between testing sessions. In fact, the board warns specifically against reading too much into small differences: “There must be a 60-point difference between reading, mathematics, or writing scores before more skill can be assumed in one area than another.” Whether we’re comparing students to one another or to their own performance in another domain, gaps smaller than 60 points are likely to be meaningless. So why report them at all?
In his book Proofiness, Charles Seife gives this problem of deceptive over-precision a name: disestimation. He explains that it “imbues a number with more precision than it deserves, dressing a measurement up as absolute fact instead of presenting it as the error-prone estimate that it really is.” If you know that your bathroom scale often errs by 5 pounds, then you shouldn’t report your weight as “150.4.” You should say, “About 150.” And if this same volatile scale claims that I weigh 151.3 pounds, that doesn’t mean I’m necessarily heavier than you. The measurement isn’t that reliable. The best we can conclude is that we weigh roughly the same amount.

On the SAT, scores for each section range from 200 to 800, in 10-point increments. That yields an incredible 61 different possible scores for each of the three sections. Thus, based on just an hour’s worth of multiple-choice questions, the College Board claims to divide high school students into more than five dozen different groups according to their mathematical, reading, and writing abilities.

Compare that to a typical Advanced Placement exam. It takes three times longer than an SAT section, and it asks essay questions, graded by expert teachers. By any fair reckoning, this should supply richer data than the SAT, allowing for more fine-grained distinctions. And how many different scores does the AP assign? Just five.

Read the full article here:
