Every attempt to manage academia makes it worse
by Mike Taylor, blog post, 2017
I’ve been on Twitter since April 2011 — nearly six years. A few weeks ago, for the first time, something I tweeted broke the thousand-retweets barrier. And I am really unhappy about it. For two reasons.
First, it’s not my own content — it’s a screen-shot of Table 1 from Edwards and Roy (2017):
Table 1: Growth of Perverse Incentives in Academia
(from Edwards & Roy 2017, Academic Research in the 21st Century: Maintaining Scientific Integrity in a Climate of Perverse Incentives and Hypercompetition. Environmental Engineering Science 34(1):51-61.)
Perverse Incentive | Why it exists | What it incentivizes | Negative consequences | References / Examples |
---|---|---|---|---|
Publish or perish | Career advancement is tied to publication records | Quantity over quality; salami slicing | Fragmentation of research; publication of trivial or redundant results | Fanelli 2010; Edwards and Roy 2017 |
Novelty bias | Journals and funders prefer novel, exciting results | Risky research, novelty over rigor | Irreproducibility; neglect of replication studies | Smaldino and McElreath 2016 |
Positive results bias | Positive results are more likely to be published | Selective reporting; p-hacking | Publication bias; overestimation of effect sizes | Ioannidis 2005; John et al. 2012 |
Impact factor obsession | Journal prestige judged by impact factor | Targeting high-impact journals; chasing citations | Gaming metrics; neglect of important but less flashy research | Casadevall and Fang 2014 |
Funding tied to past success | Funders prefer proven researchers and topics | Conservatism; risk aversion | Stifling of innovative or high-risk research | Alberts et al. 2014 |
Career progression based on metrics | Hiring and promotion based on metrics (e.g., h-index) | Focus on measurable outputs | Narrow focus; gaming metrics; discouragement of collaboration | Hicks et al. 2015 |
Peer review conservatism | Reviewers favor safe, mainstream ideas | Risk aversion in peer review | Slow innovation; bias against novel ideas | Lee et al. 2013 |
Short-term project funding | Funders prefer short-term, deliverable-focused grants | Short-term goals; neglect of long-term projects | Fragmented research; lack of continuity | Fortin and Currie 2013 |
Incentives for quantity | Publication count drives career and funding | Quantity over quality | Proliferation of low-quality or redundant publications | Edwards and Roy 2017 |
Lack of negative results publication | Journals prefer positive outcomes | Suppression of negative or null results | Publication bias; distortion of scientific record | Dwan et al. 2013 |
And second, it’s so darned depressing.
The problem is a well-known one, and indeed one we have discussed here before: as soon as you try to measure how well people are doing, they will switch to optimising for whatever you’re measuring, rather than putting their best efforts into actually doing good work.
In fact, this phenomenon is so very well known and understood that it’s been given at least three different names by different people:
- Goodhart’s Law is most succinct: “When a measure becomes a target, it ceases to be a good measure.”
- Campbell’s Law is the most explicit: “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”
- The Cobra Effect refers to the way that measures taken to improve a situation can directly make it worse.
As I say, this is well known. There’s even a term for it in social theory: reflexivity. And yet we persist in doing idiot things that can only possibly have this result:
- Assessing school-teachers on the improvement their kids show in tests between the start and end of the year (which obviously results in their doing all they can to depress the start-of-year tests).
- Assessing researchers by the number of their papers (which can only result in work being sliced into minimal publishable units).
- Assessing them — heaven help us — on the impact factors of the journals their papers appear in (which feeds the brand-name fetish that is crippling scholarly communication).
- Assessing researchers on whether their experiments are “successful”, i.e. whether they find statistically significant results (which inevitably results in p-hacking and HARKing).
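To see why that last incentive is so corrosive, here is a minimal simulation sketch; it is not from the original post or from Edwards and Roy, and the helper name run_study, the parameter choices, and the Python/NumPy/SciPy setup are mine, purely for illustration. It assumes a hypothetical researcher who measures one pure-noise outcome, tries 20 unrelated pure-noise predictors, and writes up whichever one clears p < 0.05.

```python
# Toy illustration of the p-hacking incentive: every true effect is zero,
# yet most simulated "studies" can still report a significant result,
# because only the analyses that cross the threshold get written up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def run_study(n_subjects=30, n_predictors=20, alpha=0.05):
    """Return True if any pure-noise predictor 'significantly' predicts a pure-noise outcome."""
    outcome = rng.normal(size=n_subjects)
    for _ in range(n_predictors):
        predictor = rng.normal(size=n_subjects)
        _, p_value = stats.pearsonr(predictor, outcome)
        if p_value < alpha:
            return True   # the "successful" analysis gets written up
    return False          # the honest null result stays in the file drawer

n_studies = 1_000
hacked = sum(run_study() for _ in range(n_studies))
print(f"{hacked / n_studies:.0%} of all-noise studies can report a 'significant' result")
# With 20 independent tests at alpha = 0.05, roughly 1 - 0.95**20 ≈ 64% will.
```

Under those assumptions, around two thirds of studies with no real effect at all can still be presented as a "positive" finding, which is exactly the selection pressure described in the table above.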
What’s the solution, then?
I’ve been reading the excellent blog of economist Tim Harford for a while. That arose from reading his even more excellent book The Undercover Economist (Harford 2007), which gave me a crash-course in the basics of how economies work, how markets help, how they can go wrong, and much more. I really can’t say enough good things about this book: it’s one of those that I feel everyone should read, because the issues are so important and pervasive, and Harford’s explanations are so clear.
In a recent post, Why central bankers shouldn’t have skin in the game, he makes this point:
> The basic principle for any incentive scheme is this: can you measure everything that matters? If you can’t, then high-powered financial incentives will simply produce short-sightedness, narrow-mindedness or outright fraud. If a job is complex, multifaceted and involves subtle trade-offs, the best approach is to hire good people, pay them the going rate and tell them to do the job to the best of their ability.
I think that last part is pretty much how academia used to be run a few decades ago. Now I don’t want to get all misty-eyed and rose-tinted and nostalgic — especially since I wasn’t even involved in academia back then, and don’t know from experience what it was like. But could it be … could it possibly be … that the best way to get good research and publications out of scholars is to hire good people, pay them the going rate and tell them to do the job to the best of their ability?
Original version at the Sauropod Vertebra Picture of the Week blog