What can add five years to your life without lengthening it by a single day, improve the performance of the NHS while wasting its resources, and make people grateful for unnecessary suffering?
The answer is early diagnosis.
Early diagnosis sounds like a no-brainer. The NHS has targets for speedy diagnosis and treatment of cancer, and offers screening for a wide range of conditions, in babies, children and adults. Private medicine and charities offer even more.
Much of this early diagnosis can, without doubt, be a very good thing. What’s surprising is that in the wrong hands or the wrong circumstances, it can also be a very bad one. To make matters more complicated, sometimes it can be hard to tell the good early diagnosis from the bad. The story of our attempts to put the logic of early diagnosis into practice is a hard lesson in the human costs of failing to communicate that uncertainty, and the dangers that lurk in the intuitively obvious.
To see how easily we can be misled, start with a definitional trick that makes early diagnosis appear a good thing, simply by moving the goalposts. Imagine your cancer is diagnosed at a late stage, aged 66, and you die from it four years later aged 70. Your cancer might have been sitting unrecognised for years. If only you had known.
Then, in another life, your wish is granted. This is the parallel world which, in a recent speech about how AI would soon tackle cancer, Theresa May held up as her vision of the future. “Every year, 22,000 fewer people will die within five years of their diagnosis compared to today.” So thanks to screening, you get your diagnosis earlier at, say, 62.
But now suppose that’s all it is—an early diagnosis. And suppose there’s no treatment for it or the treatment is unsuccessful (as we’ll see, this turns out to be a common scenario). You still die at 70, having lived not a day more. But fear not; because you become one of those who do not “die within five years of their diagnosis,” you can still be counted as someone who lives longer.
By moving the diagnosis forward, but leaving the date of death unchanged—hey presto!—the five-year survival after diagnosis goes up. “Lead-time bias,” as this is known, means that claiming success for cancer detection programmes with unadjusted rates of five-year survival after diagnosis is, to put it mildly, an inappropriate use of statistics. Some people—in Public Health England, for example—are aware of the problem and take pains to avoid it. Others—including some doctors and health organisations that have promoted screening—are less careful.
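To see the arithmetic starkly, here is a minimal illustrative sketch of the hypothetical patient above (the ages are invented for the example; nothing about the disease or the date of death changes):

```python
# Lead-time bias, using the hypothetical patient described above.
# The ages are illustrative only; the underlying outcome is identical.

age_at_death = 70

late_diagnosis_age = 66      # diagnosed from symptoms
early_diagnosis_age = 62     # diagnosed four years earlier via screening

def survives_five_years(diagnosis_age, death_age):
    """Does the patient live at least five years after diagnosis?"""
    return (death_age - diagnosis_age) >= 5

print(survives_five_years(late_diagnosis_age, age_at_death))   # False: 4 years
print(survives_five_years(early_diagnosis_age, age_at_death))  # True: 8 years
# Five-year survival "improves", yet the patient dies at exactly the same age.
```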
Long lead times could partly explain the belief in the US that wider use of PSA tests to spot prostate cancer early, through routine screening of men who are asymptomatic, saves lives: survival rates after diagnosis appear significantly higher there than in the UK. But PSA blood tests can also pick up slow-growing cancers which might never do harm. Detect these, and it looks as if you’re stopping people dying who weren’t going to. The largest-ever prostate cancer trial, run for 10 years by Cancer Research UK-funded scientists among 400,000 men, found that inviting those without symptoms to a one-off PSA test did not save lives, and could cause harm through treatment of those needlessly diagnosed.
One issue, then, is that we can be fooled by the data. The next is the more substantial question of the date of death itself. If early diagnosis through screening doesn’t change that, then much of its promise is false, and any claimed improvement in the statistics would be a lie. Does it have any effect? As that transatlantic comparison of prostate cancer suggests, the answer is “not necessarily.” In truth, all some people gain is more years knowing they have cancer. The earlier diagnosis and treatment have no clear effect overall on the age at which they die.
And yet, the logic of “catching it early” suggests that it must make a difference. The chance of stopping cancer, for example, is genuinely greater if it’s detected earlier and more amenable to treatments, above all surgery, which we know often become less feasible after it has spread. What’s more, many thousands of women and men affirm, with all the authority of personal experience, that they were themselves saved because screening discovered their illness. It’s hard to argue against that sort of testimony. Not so very long ago, an American leaflet advising women about breast cancer screening could say that if you didn’t go, “you need more than your breasts examined.” UK advice could be similarly bullish.
But that confidence now looks awkward, to say the least. Although there have always been critics in an often-acrimonious debate, screening programmes face growing scepticism. A recent conference, co-hosted by Cancer Research UK’s Early Detection Centre, concluded as follows: “Decades after their implementation, cancer screening programmes carry the burden of unresolved ethical issues and questionable outcomes… Expert consensus is that these programmes should be reconsidered as they are not up to date with our knowledge of the natural history of disease.”
It’s a startling position, and will seem sudden and inexplicable to some who remember the old campaigns. What does it mean to say these programmes should be “reconsidered”?
This shifting opinion has unsettling implications for how people make sense of evidence, especially the evidence of their own intuition and experience. It also suggests that for decades many hundreds of thousands of patients, and many thousands of doctors, were mis-sold screening using the shaky logic of early diagnosis. Many will have been exposed to harm of which they had no inkling, to no purpose, and then—of all cruelties—felt grateful.
There’s a vital distinction here. If you have symptoms—a lump, say, or a bleed—you don’t need screening, you need a proper examination and diagnosis, and soon. That’s a type of early diagnosis that makes abundant sense. Similarly, if you have a strong family history of a disease, it pays to watch out.
Screening, on the other hand, attempts early diagnosis of people who are symptom-free. Unsurprisingly, it is not perfect. Interpreting white marks on a mammogram, for example, can be an uncertain art. How those white marks will behave in future is in some cases unknowable. All this means that early diagnosis also entails early misdiagnosis—both by telling people they are well when they’re not, and by telling them their condition is more serious than it will really turn out to be. With the latter comes the possibility of unnecessary anxiety, treatment—and harm.
The diagram overleaf shows the outcomes for women who are and are not screened for breast cancer. These are now reported in the latest NHS breast screening leaflet—a big improvement on the past. Only by comparing the outcomes like this do we see how both benefits and harms play out for everyone. The bottom-line benefit of screening seems clear. In the screened group, three in 200 die of breast cancer; in the unscreened group, four.
But the harm is less obvious. To see it, focus on the unscreened group on the right, where three women are shown as “unaffected.” These are women who have cancer that will never develop and never hurt them. Often, it’s carcinoma in situ—the “in situ” meaning that it stays put in the ducts where it begins and never affects the rest of the body. If they are not screened, they never know, never worry, never have treatment—and they’ll be fine. But if these women are screened then, since we don’t know if it’s carcinoma in situ, or a cancer that could kill, they’re likely to be treated—including with mastectomy surgery. In some cases, then—and there’s no knowing which—early diagnosis will be harmful.
One way of summarising these group-level statistics is to say that screening saves one from breast cancer but treats three unnecessarily. Is it worth it? Many will still say yes, unequivocally. Better the chance of living, even with the risks of over-diagnosis.
But it is not clear how we should weigh up potential benefit and harm, especially as the benefit—one in 200 saved—also needs qualifying: one in 200 is saved from dying of breast cancer. When comparing mortality from all causes, however, even that benefit is questionable. It’s not clear if either group—screened or unscreened—lives longer on average. Why that should be is uncertain, but it seems that screening itself carries a risk of causing other kinds of harm. We can speculate that treatment that would otherwise have been avoided, or even the screening itself, is taking a toll on other aspects of health.
A sceptical account
All of which leaves a dilemma: if breast cancer screening saves some, while harming others—the net effect on all-cause mortality more or less cancelling out—what do we do?
David Spiegelhalter, a statistician at Cambridge, was part of the team that changed the breast cancer leaflet to make it more balanced. Last year, he set up the Winton Centre for Risk and Evidence Communication, partly inspired by that experience and a sense of the need for good evidence without persuasion.
“Earlier diagnosis of a serious disease might be beneficial for patients,” he told me, “but this should not be a foregone conclusion. There is a need to consider the overall impact on everyone affected, including those that would normally not have been detected, and those for whom earlier diagnosis simply means more time knowing they have the disease. Unfortunately, none of these people can be directly identified.”
At the end, a sceptical account of the balance in this case of screening might read: a life saved, a life lost, a great many needless treatments, a whole programme at great cost, the dedication of many professionals for many decades, the anxieties of many women… for what: a shuffling of fates, a lottery?
Faced with such uncertainty, we need to understand why some of this evidence tends to be heard, and some isn’t.
To see why, go back to the chart (above) and look at the group on the left—the 200 screened women—and the 12 who are treated and survive: we instinctively attribute their survival to the screening—and so do they. I had cancer; I was screened; they found it; I was treated; I lived.
Then look at the unscreened group. Eight are also treated and survive, perhaps because they find a lump themselves, while another three have cancer which is never found, but are unaffected, as we heard.
So, 11 of the unscreened group with cancer (8+3) come through fine, while 12 of the screened group do (three of them having unnecessary treatment for cancers that wouldn’t have affected them).
But what do these people say about their experience? All 12 of the screened group might say screening saved them. If they did, all but one would be mistaken, while three might think they were saved when in truth they were harmed unnecessarily.
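Gathered in one place, the per-200 figures described above look like this (a rough illustrative tally: the number of women in each group untouched by cancer is inferred as a remainder, since the leaflet summary quoted here doesn’t spell it out):

```python
# The per-200-women figures described above, collected in one place.
# Counts come from the leaflet summary quoted in the text; the women
# never affected by cancer are inferred as a remainder.

GROUP_SIZE = 200

screened = {
    "die of breast cancer": 3,
    "treated and survive": 12,        # includes 3 treated unnecessarily
}
screened["never affected, never treated"] = GROUP_SIZE - sum(screened.values())

unscreened = {
    "die of breast cancer": 4,
    "treated and survive": 8,         # e.g. after finding a lump themselves
    "cancer never found, never harms them": 3,
}
unscreened["never affected, never treated"] = GROUP_SIZE - sum(unscreened.values())

lives_saved_per_200 = unscreened["die of breast cancer"] - screened["die of breast cancer"]
overtreated_per_200 = 3               # screened women whose cancer would never have harmed them

print(lives_saved_per_200, overtreated_per_200)   # 1 life saved, 3 treated unnecessarily
```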
By looking at the chart and asking what story each person might tell, we see how every story about early diagnosis will seem to confirm its worth, how no stories are told to doubt it, how even bad outcomes are interpreted favourably. It’s known as the popularity paradox: “The greater the harm through overdiagnosis and overtreatment from screening, the more people there are who believe they owe their health, or even their life, to the programme.”
Will any of those unnecessarily treated ever ask: “Would I have been fine if left alone? Would I have avoided the needless pain and anxiety of this whole procedure?” This will be the real story for some, but never heard. No one knows if it is theirs.
This is the recurring pattern, everything in favour of one side: experience, intuition, real individuals—celebrities perhaps—with names and captivating narratives, tragic or inspiring, full of conviction, but often wrong. On the other side, group-level statistics, anonymity and silence. Is it any surprise which dominated?
"False positives are rife"
Is breast cancer a special case? Not necessarily. It’s a problem characteristic of treatment for all manner of conditions: the intuitive but not always reliable habit, in those who get better and in those who treat them, of thanking the treatment.
Underlying all these problems is another: the simple fact of uncertainty. As we become more aware of screening’s untold stories, the uncertainty about which story might be yours only grows. Well, I say uncertainty is a simple fact, but in a letter to the London Review of Books in 2015, a professor and director of a university dementia services centre had no time for it. “The sooner someone sues a GP for failure to diagnose [dementia] as early as possible, the better,” she said. She wrote after it emerged that rates of dementia diagnosis varied widely across different parts of the country. It was outrageous, she said, that publicly-funded professionals could withhold such “information” from their patients. Amen to that, you might say.
But in reply, Margaret McCartney, a Glasgow GP who writes about evidence-based medicine, called the professor’s view disturbing. The UK National Screening Committee recommended that screening should not be offered for dementia, she said. Why not? “Because false positive rates”—when an uncertain test tells you that you have an illness when you don’t —“are rife.”
Imagine 100 65-year-olds. Typically, you might expect to see six cases of dementia. “If the background rate of dementia for 65-year-olds is 6 per cent, screening would find four of the six,” she says. That is, being imperfect, it wouldn’t catch everyone.
But it would also catch some healthy people by mistake: “23 other people would be told they have dementia when they do not [a false positive]. This is not just bad medicine, but harmful medicine…” she said.
If we add the four accurate positives to the 23 false positives—and remember we’ve no idea which is which—that’s 27 out of 100 people whom a legally nervous doctor might feel obliged to warn of possible dementia; 23 of these 27 would be misinformed. The anxiety for them and their families, the changes they might make to their lives—do they continue driving? Do they feel safe alone?—are awful to contemplate.
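The arithmetic behind those figures runs roughly as follows (the counts are McCartney’s; the share of positives that turn out to be false is simply derived from them, not taken from any named study):

```python
# Back-of-envelope version of the dementia screening figures quoted above.
# The counts (4 of 6 true cases found; 23 false alarms among the 94 without
# dementia) are those given in the text; everything else follows from them.

people = 100
prevalence = 0.06                                 # 6 per cent of 65-year-olds

with_dementia = round(people * prevalence)        # 6
without_dementia = people - with_dementia         # 94

true_positives = 4                                # screening finds 4 of the 6
false_positives = 23                              # healthy people wrongly flagged

total_flagged = true_positives + false_positives  # 27
share_misinformed = false_positives / total_flagged

print(total_flagged)                 # 27 people warned of possible dementia
print(round(share_misinformed, 2))   # 0.85 -> roughly 23 of the 27 misinformed
```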
Some professors, it would seem, just like their patients, might think in terms of hard information and wonder what could be wrong with demanding it. They might imagine compelling stories. The power of these narratives, the degree of conviction they inspire—“sue the GPs!”—is in one sense impressive, but equally it is alarming: a one-eyed story that sees only benefit, but fails to see potential harm, or the uncertainty in both.
Not all screening is as hit and miss as for dementia. Different types have different potential harms, and different balances of harm and benefit. The repeated point is that the intuitive benefit of “catching it early” does not apply equally in all circumstances.
Is there also another bias lurking here: that in response to a problem we must do something? Wherever this something seems to have an intuitively plausible rationale, arguments quickly fall into place, evidence is selectively advanced, stories that confirm it predominate; stories that don’t are often untold.
In early May, it was announced that 450,000 women had not received letters inviting them to breast cancer screening. Jeremy Hunt, the then health secretary, said that up to 270 could die early as a result. He may have been premature: his own estimate has reportedly since fallen from up to 270 deaths to up to 74. (An unusual example of a minister rushing to highlight bad news before checking whether it stacked up.) No matter. Charities called it “distressing,” “devastating,” an “appalling error.”
But amid understandable public alarm, there was, perhaps, one good to emerge: the full balance of the potential benefits and harms—the group-level statistics—broke through into public argument.
“I believe that the furore over the NHS Breast Screening Programme error has resulted in unnecessary worry for those women who may not have received a breast screening invitation,” said Paul Pharoah, an epidemiologist at Cambridge.
“Breast cancer screening has both harms and benefits and so, if some people have not been invited for screening they will have avoided the harms as well as missing out on any benefits.”
For the particular women concerned, aged around 70, he added that we have little good evidence of the overall effect of missing one final check-up. This is the measure of our communication failure: that a great many people feel justifiably frightened and some feel outraged because we led them to believe that an inviolable practice had been violated—but it is a practice whose effects we never understood.
Should I go or not?
My partner, Katey, is in her early 50s. “So, what am I supposed to do?” she said, as we discussed this article. “Should I go or not?”
Alas, there is no “should.” I’m in no position to make the decision for anyone else—and the data doesn’t tell us.
“I know… that’s what I don’t like,” she said.
The early campaigns, said Spiegelhalter, “were deliberately aimed to increase the number of people going for screening, and were designed at a time when screening was seen as an unalloyed ‘good thing.’”
Now that over-diagnosis and over-treatment have become recognised problems, he says, the ethos has changed—at least in places—towards encouraging an informed choice.
“Technically, we might say that the decision problem has entered a ‘preference-sensitive zone’—meaning that two reasonable people, faced with the same information, could come to different decisions about going for screening.” Hunt’s successor, Matthew Hancock, could take note.
We tell stories to make sense of life, or maybe to pretend that life makes sense. But we’re often tempted to make clearer sense of the data than the data contains, as we’ve done with the numbers on screening. Facing up to the limitations of our knowledge leaves us feeling vulnerable.
But we always were; we just weren’t told. The choice—for it is a choice—is not black and white. The intuitively-obvious—“go or you need your head examined”—has been a damaging delusion through which medicine has again proved capable of hurting and misleading people on a huge scale.
This article was amended on 16th September 2021 to remove a statement that UK prostate cancer mortality rates were the same as those in the US, contradicting claims by former New York mayor Rudy Giuliani.