Wednesday, December 11, 2024

It's Time to Stop Using Grades

We use grades to evaluate students and motivate them to learn. That works as long as grades remain a reasonably good measure of how well the student understands the material in a class. But even this most basic of academic measurements cannot escape Goodhart's law: "When a measure becomes a target, it ceases to be a good measure." Grades become irrelevant or, even worse, counterproductive, as chasing grades may undermine a student's ability to master the material. So perhaps it is time to retire the measure.

Grading became a weaker measure due to grade inflation and academic dishonesty. Let's do a short dive into both of these areas.

The average grade has increased about a full grade level since I went to college in the '80s, and now more than half of all grades given are A's. As college tuition increased, students started thinking of college more transactionally, expecting more from their college experience while putting less effort into classes. Administrators put more weight on student course surveys for faculty evaluation, and the easiest way to improve scores is to give higher grades. And repeat.

If everyone gets an A, no one gets an A. It just becomes harder to distinguish the strong students from the merely good.

Academic dishonesty goes back to the beginning of academics but has advanced dramatically with technology. In my fraternity, we had filing cabinets full of old homework and exams ostensibly to use as study guides. However, if a professor reused questions from year to year, one could gain an unfair advantage.

With the growth of the Internet, Chegg, and more recently large language models, those looking for an edge have never had it so good. ChatGPT-4o1 can answer nearly any undergraduate exam question in any field; it even got an easy A when I tested it with one of my undergraduate theory of computing finals.

AI becomes like steroids: those who don't use it find themselves at a disadvantage. If a pretty good student sees their peers using LLMs, they'll start using them as well, initially just as a learning aid. But there's a very fine line between using AI as a study guide and using AI to give you the answers. Many fall down a slippery slope, and this starts to undermine the mastery that comes with tackling problems on your own.

We can try to counter all this by returning to harsher grading and more heavily weighting in-person, no-tech exams, but these approaches cause other problems. Already we see companies and graduate schools devalue grades and focus on projects and research instead.

So let's acknowledge this endgame and just eliminate grades, maybe keeping only Pass, with Fail reserved for those who don't even show up. The ones who want to master the material can focus on doing so. Others can concentrate on working on projects. Still others can earn their way to a degree with little effort but also with little reward.

14 comments:

  1. If grades depend only on in-class exams, then the power of AI is irrelevant. But giving credit for homework assignments is indeed over.

  2. AGREE on all points. I'd be curious about alternatives. P/F? Some schools already don't have grades but have professors write a short paragraph about the students. A real pain for graduate admissions, but perhaps better. Then again, shy students may get downgraded unfairly. Projects may be a better alternative (writing a compiler shows mastery far better than answering a question on an exam or HW on parsing) but can AI also do projects? Probably. Have teachers ask questions that are AI-proof? That is possible for some courses now but will get harder over time.

    Replies
    1. I'm teaching four classes this semester. Two of them started with 79 students each (starting at 80 they have to pay me more). I cannot write a meaningful paragraph about each one, or indeed a tenth of them.

      This is one of those places where, because other people are unaware of or ignore the problems with remote exams, etc., the administration requires the standard of classes that are impracticably large to grade by hand. It is a different law, Gresham's Law.

  3. One year, when I was a graduate student, we graded the Calc III class just on the final exam. This was a mistake necessitated by the professor giving two one-hour exams that each had a standard deviation of 1 (one had a mean of 30, the other 90; you can guess how happy the students were after each). We drew a large histogram on the blackboard of the scores on the final exam, then marked off the letter grades at reasonable percentiles. If people aren't going to use the customary percentiles, then they should report what percentiles they are using (or explain how they know all their students are better than expected). Of course, they won't.

  4. The blog Grading for Growth is an excellent resource for those who are interested in alternative grading practices: https://gradingforgrowth.com/about
    I found many good ideas there that I’ve started using in my own courses.

  5. The grade inflation part is easy. To combat it, since the 1980s, University of Toronto transcripts have shown the course average for every undergraduate course with 12 or more students. (Grades there are numerical out of 100.)

  6. My paper, "A Triage Theory of Grading" (Teaching Philosophy 34 (2011); https://cse.buffalo.edu/~rapaport/Papers/triage.pdf), argues for a minimal grading scheme whose main purpose is to give students information about their progress. It's basically pass/fail, with an intermediate position to cover all cases that are not clearly pass or clearly fail.

  7. It is a short conceptual distance from having no grades to having no bachelor's degrees. Some universities could still exist as research institutions, I guess. You can decide for yourselves whether you consider this to be a bad outcome.

  8. At the company where I work, I sometimes interview people that we are considering hiring. After talking to them for ten minutes, I usually think I know whether they could do the work. (We've hired some of these people, so I have further evidence that my judgment was correct.) If you have a student for a whole semester, you can't figure out whether they know the material?

    Freshman year in college, I took an Honors Calculus course. The professor assumed that the weak students would not continue with the Honors sequence in the spring semester. So, he didn't give any exams in the spring semester. By the end of the spring semester, he realized that he couldn't give everyone in the class an A. (Not sure why he didn't realize it sooner, since he had us all in the fall semester.) So, he gave an oral final exam. We all met in his office, and he asked us questions for an hour. At the end, he said, "You two get A's. You two get B's. You get a C."

  9. Back when I was a TA (teaching assistant) in graduate school, some students were clearly copying homework and problem sets from each other. This isn't a new problem created by ChatGPT. We could encourage students to learn, but we couldn't force them.

    When I was TA-ing Calc II, after the first hour exam, the students were so upset that we had a meeting of the professor and all the TAs to discuss it. The students had two complaints: the exam was too long and the questions were not the same as on the homework. Of course, the reason the questions weren't the same as on the homework was that we weren't interested in how good their memories were. The reason they thought the exam was too long is more complicated, but was also related to their attempt to pass the course by memorizing the answers to all possible exam questions. Of course, to a mathematician, this seems like a very inefficient way to pass a course compared to simply understanding the material.

  10. Without grades the students who could learn if they invested the time will have no reason to. Things will get worse than they have gotten.

  11. "If everyone gets an A, no one gets an A."

    Isn't the better solution for leadership to crack down on that nonsense?

    Replies
    1. Unless, of course, you are teaching a class of geniuses who are all of exceptional calibre! The exception that defines the rule … but I guess these are rare tail events.
