Why incentive pay won’t fix education or health care
It turns out — surprise! — that it’s really hard to measure quality in complex social systems and that employing simplistic quantitative measures can backfire.
That’s the take-home message from a recent talk by UC Berkeley economist and public policy professor Jesse Rothstein, who came to SFU to present his latest research on using standardized test scores to measure teacher effectiveness in the US.
Prof. Rothstein was involved in a 3-year pilot project in Tennessee designed as an experiment to check whether offering teachers bonus pay improves students’ test score performance. Teachers were randomly assigned to two groups: one was offered bonus pay if their students did well on standardized tests (the experimental group) and the other wasn’t (the control group).
After 3 years, there were no significant differences in student achievement on standardized tests between the two groups, indicating that offering teachers bonus pay did not improve student achievement. Yet the Obama administration (for which Prof. Rothstein worked recently) is continuing to explore incentive pay as a way to improve the education system.
Come to think of it, the idea of incentive pay has become the holy grail in governments’ quest to improve the performance of complex social systems like health care and education. On the surface, there’s a certain intuitive appeal to the idea of paying more to those doing a better job. The “economic theory” behind it is that offering to pay people more for doing a good job will lead to increased work effort, as rational individuals choose to maximize their pay.
That’s what’s driving US policy makers to test incentive-pay schemes for teachers in hopes of improving school achievement. That’s what’s driving BC Health Minister Kevin Falcon to offer hospitals funding based on the number of surgeries they perform (what he calls patient-focused funding).
But when used mechanically — by tying incentives to some quantitative measure of performance, like test scores or number of surgeries done — such schemes are likely to fail.
The devil — as usual — is in the details. And the details are that before policy-makers can give somebody a bonus for doing a good job, they need to be able to measure what a good job looks like. This is where standardized tests come in, along with other quantitative measures such as the number of medical procedures performed or the length of hospital stay.
But it turns out — surprise! — that it’s really hard to measure quality in complex social systems and employing simplistic quantitative measures can backfire. In fact, Prof. Rothstein quoted an obscure scholar of methodology by the name of Donald Campbell, who coined a rather pessimistic “Campbell’s law” in the mid-1970s:
“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”
In other words, the perverse incentives that pay-for-performance schemes create in complex social systems may well outweigh any positive incentives for real improvement.
Campbell looked at education in particular (see his working paper here), and argued that
“achievement tests may well be valuable indicators of general school achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.”
Disappointing news for economists, and a reminder to acknowledge the limitations of the quantitative methods we, as a profession, so often peddle to policy-makers.
What does this tell us about BC’s new incentive-based funding model in health care?
Kudos to the SFU Centre for Education Research and Policy for organizing the public event.
Ironically, this mirrors the problems that used to be cited when I was in school (back in the middle ages) with incentives in Soviet-style planning systems. The standard example was a factory producing nails. If the quota is based on weight, the factory will produce the largest nails possible; if volume, the smallest; if there is no constraint on inputs, you get high waste rates; and so on. Incentives based on simple measures distort the process they are supposed to be facilitating.
Which still leaves open the difficult question of how you make quality judgements and practical decisions without the crutch of the all-powerful market.
This also reminded me about this TED talk about motivation – http://www.ted.com/talks/dan_pink_on_motivation.html
Below is a link to a great video on what motivates people, which seems very pertinent to incentive pay:
If we combine Iglika’s post with Marc’s video link and reflect back on Mankiw’s now infamous article in the NYT, we are left with one of three conclusions.
A) Mankiw is wrong.
B) The type of intellectual work he does is akin to basic mechanical manipulation of say moving a mountain of manure from spot X to spot Y.
C) A & B are correct if, and only if, Mankiw does not generalize from what he does for work to what real professionals do for work.
I notice those two video links are to very different-style presentations by the same guy. The first is a stage presentation in a big hall, the second a talk accompanied by whiteboard drawing. I like the second better. Either way, I’ve heard this stuff from other sources as well. The crux: the best motivation for most tasks is not reward, but autonomy, mastery, and purpose.
This suggests that the answer to Mr. Erlichman’s question is to instead ask the nail factory people to do a good job making nails, because there are people out there who will need nails. Give them autonomy, and give them purpose. They will then work to be the best nail-producers they can be, independent of quotas.
It also suggests that the incentive scheme for health care is bad not only because of measurement problems, but also because it’s a terrible motivator. So it will fail at producing the right behaviour even as it succeeds in producing the wrong behaviour.
The deeper problem is something this Dan Pink guy doesn’t, probably can’t, realize. He notes that there’s a deep mismatch between the science and business practice, and wonders briefly why. He also notes that monetary reward does work for certain narrow, routine tasks and talks about how many “20th century” tasks fit that description while “21st century” tasks do not. He seems to believe that it’s just a matter of time before thinking catches up.
But it isn’t. There’s a reason those “20th century” tasks were narrow and routine: it’s called Taylorism, and it’s all about control over workers. Those tasks were carefully arranged to be narrow, routine, controllable and measurable. And business gurus have been rediscovering the importance of employee autonomy and creativity for decades. It has never stuck and it can never stick. It is vitally important to capitalism (not markets, but capitalism in the sense of private ownership and control of firms) that production is ordered hierarchically and that there be strong reasons to give more capital to those at the top. If reality and science do not co-operate in providing claims on capital for those at the top, such as incentive bonuses, reality and science will be ignored in favour of just-so stories that will get the job done. If reality and science do not co-operate in amplifying the importance of management and control to production, again they will be ignored.
In the end, these studies of motivation amount to an argument for an essentially socialist or anarchist kind of production (again, doesn’t say anything about markets, but it does say something about organization of work). Not USSR top-down “socialism”, but socialism in the sense of people having control over the work they do. And indeed the record of worker-controlled co-ops, such as the factories taken over by workers in places like Argentina, is generally quite good.