Friday, May 30, 2008

Exam Design and Grading

Matt posted a partial solution to a basic conservation of energy problem that is part of a set of bogus solutions that has made the rounds over the years. My comment focused on the digression into whether the sample solution really deserved a zero, and some other things about grading an exam, but Matt's reply to another comment made me realize something more was involved than just how to assign partial credit.

Rather than run on at length there, I figured I should blog it here. I should also back-link to an earlier article of mine on assembly-line grading, since a similar approach was advocated by another comment in that thread. (A comment from someone who blogs about Mandelstam variables! Gotta add that to my blogroll right now. IMHO, "s" rules.)

The observation that made me realize more needs to be said was the following:

For this particular problem, either from memory or from a (probably large) formula sheet the student did manage to pick the two right equations and only those two. ... It is likely that my 5/10 is too generous, but either way it’s probably still a D at best.

I would be concerned that 5/15 for this problem, although an F for this particular problem in isolation, might accumulate into a score that does not reflect the physics knowledge of this student. (Of course, it is highly probable that even liberal partial credit will result in a score of 35 or 40% for the entire test for this student.)

What they brought with them:

I don't allow formula sheets or the functional equivalent (a calculator with formula sheets stored in it). If I did, there would be zero partial credit for this problem because the student did not have to pull U = mgh out of his or her head and, more importantly, because the student did not pull the ONE right equation (Ef = Ei + W) off of that sheet. Without a formula sheet, I might give one point for knowing to calculate the potential energy but not knowing what units it has.

Exam Design:

Problem grading also has to be considered in the context of how you write the exam in the first place. You can write an exam consisting of fairly easy problems that are either right or wrong (this is often done in large classes with multiple-choice exams) and get a reasonable grade distribution if the students don't study very well or don't know how to use their calculator properly. But if the students are prepared, such an exam will not have a normal distribution. [My first exam in Physics 1 is such an exam, and the predictable result can be seen in a blog article linked below.] Worse, those kinds of tests don't require any critical thinking.

This particular problem is extremely elementary. If (actually, when) I put it on an exam, it is part of the C-level part of the test. I use a straight scale with no tolerance on the 70 line for a C, so the exams are matched to that objective while taking my historical partial credit policies into account. I would expect a C student to get all 15 points on this problem, losing at most three points for leaving off units or making a computational error. Those points would have to be made up elsewhere on the test (by getting somewhere on an A or B problem) to pass the test. I'd probably also take off 3 points for leaving off the Ef = Ei + W (or just Ef = Ei) starting point on part a. Someone who does not know they need to conserve energy deserves to lose all of those 15 points.
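For reference, here is one way to write out that starting point. This is only a sketch: the kinetic energy term and the friction symbols (f_k for the magnitude of a kinetic friction force, d for the path length) are my own notation, not taken from the original problem, which is not reproduced here.

```latex
\begin{align}
E_f &= E_i + W \\
\tfrac{1}{2} m v_f^2 + m g h_f &= \tfrac{1}{2} m v_i^2 + m g h_i + W \\
W &= -f_k \, d \quad \text{(assumed: kinetic friction of magnitude } f_k \text{ over a path length } d\text{)}
\end{align}
```

With no friction, W = 0 and this reduces to Ef = Ei. With friction, the work is added as written, and it is a negative number.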

Note added for clarity:
The basic idea behind this exam design is to ensure that a student who deserves to pass can get about 70 on the test. That might mean 60 points of "C" problems, with most of the rest being "B+" problems. A C student is expected to get some partial credit from the B problems, and a B student is expected to miss some points on the B problems. I usually include one "A" problem, with significant critical thinking required, but limit it to 5% of the exam totals on average. This means that a perfectly prepared student can get an A without being able to do the sort of challenging problems they would see at Harvey Mudd, but that has proven more than good enough for the Wannabe Flagship U they mostly attend.
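The arithmetic behind that point budget can be sketched in a few lines of Python. The 60-point and 5-point figures come from the paragraph above, and the 35 points of B problems is simply what is left on a 100-point test. The fractions of credit earned by each kind of student are purely hypothetical numbers chosen for illustration, not measured data from my classes.

```python
# Point budget from the post: about 60 points of "C" problems, one "A"
# problem worth about 5 points, and the remainder as "B" problems.
POINTS = {"C": 60, "B": 35, "A": 5}

# HYPOTHETICAL fraction of credit each kind of student earns on each tier.
# These values are assumptions for illustration only.
PROFILES = {
    "C student": {"C": 0.95, "B": 0.40, "A": 0.00},
    "B student": {"C": 1.00, "B": 0.70, "A": 0.20},
    "A student": {"C": 1.00, "B": 0.95, "A": 0.80},
}


def expected_score(profile):
    """Sum the points earned on each tier for one student profile."""
    return sum(POINTS[tier] * profile[tier] for tier in POINTS)


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        print(f"{name}: {expected_score(profile):.1f}")
```

Under these assumed fractions, the C student lands just above the 70 line, which is the design goal described above. Changing the fractions shows how quickly a student who drops points on the C problems falls below it.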

Partial credit is for harder problems, where the student might figure out what kind of problem it is but make a major error in setting it up and get nowhere as a result. These are problems with a significant critical thinking component coupled to a complicated calculation. However, my own design decisions tend to decouple a problem with a complicated setup (say a vector integral for the electric field from a charge distribution) from its actual solution. The whole thing might be in the homework, but an exam question will stop with the setup of the integral.

Grade distributions on tests:

Most of my exams are bi-modal. A non-representative sample for a pair of tests was shown in an article about the relationship between doing homework and doing well on exams that I wrote some months ago.

Matt's critique of the no-partial-credit philosophy ("No partial credit! You engineer, you make mistake, bridge fall down, people die!!"), posted in the comments, was: "I understand his reasoning, but also a working engineer is also able to consult references, double check math with computers, and discuss with a team." I have three observations:

A working engineer has to pass a licensing exam where this particular problem would need to be answered correctly in about 30 seconds, to allow more than two minutes to solve a symmetric step ladder problem. That is not enough time to look up a formula for a trivial physics relationship.

The computer program can be wrong. GIGO can take several forms, including bad data tables hidden in the code.

If you rely on the team to catch your mistakes, who will catch theirs? The most recent example is the I-35 bridge collapse in Minnesota, which was the ultimate consequence of an elementary error made when the bridge was designed 40 years before. It is not yet known (and may never be known) where that error was made, but even if it was made by a draftsman there were multiple places where it should have been caught in a properly run professional organization.

I give partial credit, but that is coupled to an exam design where that credit is primarily for minor computational or algebra errors and should not allow someone to pass who has too much elementary physics left to be taught by their engineering professors. They have enough to do to teach the next level of physics and calculus encountered in actual engineering problems.

Link back:
Noticed this interesting discussion, Grading Policy, Sir! about teaching at the Navy nuclear power school in Orlando. I really like the use of a special grading shorthand described there!


Matt said...

I think I'm probably coming around to your point of view. As TA, I'm generally just responsible for the grading itself, but not for deciding how to grade, which is written up in a grading rubric by the professor. These rubrics vary by professor, but typically the bulk of the credit is for demonstrating knowledge of the solution method and of how to carry it out. My recitation quizzes are entirely written and graded by me, and I'm quite strict on those because they're designed to be easy enough to do in 3 minutes or less by someone who understands the material.

In things like English or graphic design there's plenty of room for opinion and variance. In physics and mathematics there's only one right answer in the nature of things. If you design things based on those answers, reality will object to any error in a potentially dangerous way. I wouldn't want to be the person to go too easy on sloppiness only to have nature crash a plane because of it later.

And it's unfair to the students to go later on into a more advanced class while not having mastered the prerequisites.

Doctor Pion said...

My personal problem is figuring out what the objectives are for labs. I'm going to get around to that discussion later this summer.

The key to a good quiz problem is designing it so it can be graded easily, with a focus on a particular skill. This discussion has me thinking about a problem that would give them the basic conservation of energy equation for this problem and ask them to define each term in it. (And include friction, to see if they know you add the work, which is a negative number.)

Regarding exam problems:
There may be only one right answer, but there can be many correct solutions. Defining required elements in a solution that gets full points is actually part of my teaching role, IMHO. It is something that has evolved a great deal from conversations with engineering faculty at Wannabe Flagship and my grads who have done quite well there.

For example, they can lose as much as 1/3 of the points for a problem if their free-body diagram is not drawn correctly and clearly with a straightedge!

Doctor Pion said...

To clarify: that last paragraph applies at the Wannabe Flagship engineering school, not on my exams!