II - Mammography
15. Mammography risks: Overdiagnosis and overtreatment
For nearly two decades after the introduction of screening mammography, these two words - overdiagnosis and overtreatment - were nearly unheard of in relation to it. If mentioned at all, it was only as a couple more unimportant items among the rest of its "negligible" negatives.
Some major players in this arena maintain that position to this day. However, this "negligible negative" has every chance of becoming the straw that breaks the camel's back, ending the era of screening mammography.
By definition, overdiagnosis is the rate of excess breast cancer (BC) diagnoses over the number of actual BC cases; in other words, it is the rate of abnormal growths diagnosed as breast cancer that would never become symptomatic during the woman's lifetime. It is usually given as a percentage of the actual BC rate, taken as 100%. So if the overdiagnosis rate is, say, 25%, it is with respect to that 100% actual BC figure, implying that 1 in 5 diagnosed women has pseudo-disease (by a simple formula, 1 in 1+100/OR, OR being the overdiagnosis rate in percent). Such apparently malignant growths - both those diagnosed as in situ and as invasive - wouldn't have progressed, or would regress or vanish.
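The arithmetic behind that formula can be sketched in a couple of lines. This is only an illustration; the function name is mine, not from any of the cited sources. It converts an overdiagnosis rate given in percent into the fraction of all diagnosed women carrying pseudo-disease:

```python
def pseudo_disease_fraction(overdiagnosis_rate_pct):
    """Fraction of all diagnosed women with pseudo-disease,
    for an overdiagnosis rate given in percent of actual BC.
    Equivalent to the "1 in (1 + 100/OR)" formula in the text."""
    return 1 / (1 + 100 / overdiagnosis_rate_pct)

print(pseudo_disease_fraction(25))  # 0.2, i.e. 1 in 5 diagnosed women
```

For the higher estimates cited further below (e.g. a 52% rate), the same formula gives roughly 0.34, i.e. about 1 in 3 diagnosed women.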
Overtreatment, on the other hand, is simply any unnecessary treatment, regardless of the presence or nature of an abnormal growth. Thus overtreatment is not necessarily related to overdiagnosis, i.e. erroneous diagnosis of breast cancer. It can also result from unnecessarily extensive or invasive treatment of an actual BC or, not infrequently, from preventive, just-in-case treatment of merely suspicious growths. But most of it does result from erroneously diagnosed BC.
What is the estimated magnitude of overdiagnosis due to screening mammography? The USPSTF maintains to this day its "research-based" figure of overdiagnosis due to screening being probably in the 1-10% range (Screening for Breast Cancer: Systematic Evidence Review Update for the U.S. Preventive Services Task Force, Nelson et al. 2009). Is the USPSTF a bit selective of the evidence in its "evidence-based" review? Let's see.
As usual in the mammography screening context, most studies belong to one of two opposing groups: one that finds the effect significant, and the other that finds it negligible. When the effect is a negative, as in this case, the majority of studies finding it negligible are authored by well-known pro-screening researchers.
Here are those from the USPSTF reference list that find overdiagnosis rate negligibly low:
⇩2003 - 5% overdiagnosis rate for carcinoma in-situ (CIS) diagnoses at the incident (i.e. first) screen (Quantifying the potential problem of overdiagnosis of ductal carcinoma in situ in breast cancer screening, Yen, Tabar, Smith, Duffy et al.; an estimate based on data taken from the Two-County trial, the UK, the Netherlands, Australia and the USA)
⇩2005 - 1% overdiagnosis rate (Overdiagnosis and overtreatment of breast cancer: estimates of overdiagnosis from two trials of mammographic screening for breast cancer, Duffy, Tabar et al., for the Two-County and Göteborg trials)
⇩2006 - 4.6% overdiagnosis rate (Estimate of overdiagnosis of breast cancer due to mammography after adjustment for lead time, Paci et al., based on data from screening programs in Northern and Central Italy, 36% overdiagnosis rate before adjustment for lead time)
⇩2006 - 5% overdiagnosis rate (Overdiagnosis, sojourn time, and sensitivity in the Copenhagen mammography screening program, Olsen AH, Duffy et al.)
⇩2006 - 3% overdiagnosis rate (Overdiagnosis and overtreatment of breast cancer: microsimulation modelling estimates based on observed screen and clinical data, deKoning et al., based on Dutch screening program data)
These are the USPSTF reference studies with more substantial rates of overdiagnosis:
⇧2004 - without screening, 1/3 of all invasive breast cancers in the age group 50-69 years would not have been detected in the patients' lifetime (Incidence of breast cancer in Norway and Sweden during introduction of nationwide screening: prospective cohort study, Zahl et al., observational study based on data on 1.4 million Norwegian and 2.9 million Swedish women)
⇧2005 - 3%-31% overdiagnosis rate (Overdiagnosis and overtreatment of breast cancer: overdiagnosis in randomised controlled trials of breast cancer screening, Moss)
⇧2006 - analysis of data from the 15-year follow-up of the Malmö randomised controlled trial (RCT) comes up with a 10% overdiagnosis rate (Rate of over-diagnosis of breast cancer 15 years after end of Malmö mammographic screening trial: follow-up study, Zackrisson et al.); that same year, Welch showed that correcting the estimate for dilution (systematic screening within the control group) raises it to 18% (How much overdiagnosis?), and Gøtzsche and Jørgensen showed that further correcting for opportunistic screening within the control group raises the rate to 25% (Estimate of harm/benefit ratio of mammography screening was five times too optimistic)
These were plainly omitted from the report:
⇧2005 - 54% and 21% overdiagnosis rate for 50-59y and 60-69y age group, respectively, adjusted for lead time (Increased incidence of invasive breast cancer after the introduction of service screening with mammography in Sweden, Jonsson et al.)
⇧2006 - 36% differential between the early-stages-detected BC increase (1.52 ratio) and later-stages-detected BC (1.16) after introduction of screening, indicating ~36% overdiagnosis rate (Assessing the impact of screening mammography: Breast cancer incidence and mortality rates in Connecticut (1943-2002), Anderson et al.)
⇧2006 - systematic analysis of seven BC randomized controlled trials (RCT), estimated 30% overdiagnosis/overtreatment (Screening for breast cancer with mammography. Cochrane Database Syst Rev, Gøtzsche and Nielsen, Nordic Cochrane Center)
⇧2009 - updated systematic analysis of eight RCT, estimated 30% overdiagnosis/overtreatment (Gøtzsche and Nielsen, Nordic Cochrane Center, published 6 months prior to publication of the USPSTF report)
⇧2009 - averaged 52% overdiagnosis rate for screened vs. not screened populations in the United Kingdom; Manitoba, Canada; New South Wales, Australia; Sweden; and parts of Norway (Overdiagnosis in publicly organised mammography screening programmes: systematic review of incidence trends, Jørgensen and Gøtzsche, for a 7-year period before and after introduction of screening, assuming a 10% carcinoma in situ share for the four regions - other than Manitoba - that did not include in situ diagnoses; published four months before the USPSTF report).
And these came after the USPSTF report:
⇧2009 - 31-51% (depending on the statistical method) overdiagnosis rate for invasive breast cancer (Estimates of overdiagnosis of invasive breast cancer associated with screening mammography, Morrell et al., based on data for women in New South Wales, Australia)
⇧2010 - up to 47% for invasive BC (Breast cancer incidence and overdiagnosis in Catalonia, Martinez-Alonso et al.)
⇧2011 - updated systematic analysis of eight RCT, estimated 30% overdiagnosis/overtreatment (Gøtzsche and Nielsen, Nordic Cochrane Center)
⇧2012 - 15-25% overdiagnosis rate for invasive BC (Overdiagnosis of Invasive Breast Cancer Due to Mammography Screening: Results From the Norwegian Screening Program, Kalager et al.)
The 2009 USPSTF report, of course, couldn't have used these last four studies (Morrell et al. was published the same month); they are included to show the most recent developments. It could, however, have used the two 2009 Nordic Cochrane Center studies, one published six months earlier (which was merely a confirmation of the rate from their 2006 systematic review), and the other four months prior to its publication (the USPSTF report almost certainly got expedited review and publication by the Annals of Internal Medicine, in which case the publication lag time is typically within a month).
Some of the ignored studies point specifically to the probable origins of overdiagnosis:
"The findings were further confirmed by the observation of a disappearance in the screened population of the notch in the increasing trend of age-specific breast cancer incidence for the ages after menopause. This notch could indicate hormone-related retardation in tumour growth around menopause. It appears that many of these clinically insignificant, retarded tumours are detected with screening mammography." (Jonsson et al.)
"However, the disparity between the dramatic rise in early-stage tumors compared to the more modest declines in late-stage disease and mortality suggests that many mammography-derived early-stage lesions may never progress to late-stage cancers and pose a threat to life." (Anderson et al.)
Of course, the USPSTF could have included some more of the studies with negligible overdiagnosis estimates as well. Between 2003 and 2006, Duffy alone co-authored five papers, all finding the overdiagnosis rate negligibly low. Ann-Helen Olsen was similarly productive, with one of her studies denying that overdiagnosis even exists (Breast cancer incidence after the start of mammography screening in Denmark, 2003). Perhaps crowding the USPSTF report with studies of this type would have been counterproductive (too obvious), so it used just enough of the "studies" produced to deny the significance of overdiagnosis in mammography to construct the desired view of its low-to-negligible risk.
What could have made the USPSTF report authors tell us that the overdiagnosis rate is, in most sources, between 1% and 10% - and, indeed, two out of three times put it plainly as 1% to 10% - despite their own reference studies clearly indicating that it could be up to several times higher than their highest estimate?
It is enlightening, with respect to the environment in which all this is taking place, to mention that the Nordic Cochrane Center had found years before their 2006 report - back in 2001 - that the available RCT data do indicate a 30% overdiagnosis rate. The Danes were very surprised that this significant piece of information had not been reported.
In a little while, they weren't so surprised anymore. The usually liberal editors of the Cochrane Breast Cancer Editors Group did not allow them to publish this factual finding - so powerful was the opposition of the mighty global pro-mammography-screening camp to making this big negative of mammography screening publicly known. It took another five years of wrestling with the opposition - involving, according to the authors, the Cochrane ombudsman - for this information to break out of the censorship (It is time for a new paradigm for overdiagnosis with screening mammography, Jørgensen and Gøtzsche, 2009).
Is such a censorship wall, and various forms of pressure, the reason why the USPSTF - and other professional and governmental institutions that should relay correct information about mammography screening - failed to actually deliver it? Very likely. With this in mind, it is less surprising that the 2010 American Cancer Society (one of the first and staunchest promoters of public screening) guidelines don't even mention overdiagnosis as a screening negative. That, of course, makes it easier to "solve" the related overtreatment problem as well.
How is it possible that such a wide discrepancy in estimating overdiagnosis rate exists, in the first place? What is the main difference between the group of studies that came up with negligible overdiagnosis rate, and those others?
It is that all the studies finding the overdiagnosis rate negligible do not focus on actual data; rather, they use the data to construct statistical models - most often based on unverified and unverifiable assumptions, such as arbitrary values for "expected" rates outside the time period covered by the data, lead time, screening parameters, and others - in order to obtain the desired figures. Or they use non-transparent methods, not allowing the reader to see where the results they present come from, or how they were obtained.
For instance, the 5% overdiagnosis rate for carcinoma in-situ (CIS) in Yen, Tabar, Smith, Duffy et al. is absurd, since it is common knowledge that fewer than 1 in 3 diagnosed carcinomas of this type progress to invasive BC. And since about 25% of all screen-detected BC, according to the USPSTF (somewhat more for the 40-49y age group, somewhat less for the 70+y group), are CIS, this figure does matter in the total overdiagnosis rate.
But the "Markov process model" used in this study, fed with appropriate assumptions, did produce the desired figure, also giving to it a fake "scientific" aura. And so did "Markov Chain Monte Carlo methods" in Duffy, Tabar et al., "microsimulation modelling" in deKoning et al., and similar statistical manipulations in literally all of the studies from this group of "researchers".
Or, let's look at how Paci et al. "correct" the 36% excess BC diagnosis rate in the screened population down to 4.6% by adjusting it for lead time. They follow populations six years before and five years after screening, and argue that the rate of excess diagnoses should be corrected because earlier diagnosis due to screening effectively shifts screen-detected BCs ahead of clinically detected ones in time and, within any limited time frame, packs in more BC diagnoses with screening than without. So they make certain unspecified assumptions, calculate an unspecified lead time, and present the resulting plots corrected for it, which nearly eliminate the excess diagnoses in the screened populations.
Besides its method being very non-transparent, the study has another "little" problem. If the overdiagnosis rate were as low as they say, plots that initially show an excess of BC diagnoses in the screened population would fall back to just above the level of the unscreened population after a period approximately equal to the lead time. But that never happened. The actual data show either no, or no significant, compensatory drop in the excess BC diagnoses among screened populations within periods of up to 10 years. Since lead time ranges between one and three years or so, a compensatory drop that would nearly cancel out the initial jump in the rate of BC diagnoses would have to be clearly evident in the real data. The fact that it is not there means that most of the excess diagnoses in the screened population are pseudo-disease (overdiagnosis), and that what Paci et al. present as a "research study" is in irreparable conflict with the actual data. Most kindly, it can be labeled "statistical fiction".
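The lead-time argument can be checked with a toy model. Everything here is illustrative - constant incidence, a fixed lead time, steady annual screening; the function name and parameter values are mine, not from Paci et al. The point it sketches: if excess diagnoses were due to lead time alone, the annual rate in the screened group would jump once and then fall back to baseline, whereas overdiagnosis produces a persistent excess:

```python
def annual_diagnoses(years, incidence=100, lead_time=2, overdiag_frac=0.0):
    """Toy model of annual BC diagnoses after screening starts in year 0.
    Lead time only moves future clinical cases earlier (a one-off
    prevalence peak); overdiagnosis adds cases every year that would
    never have surfaced clinically."""
    rates = []
    for year in range(years):
        true_cases = incidence
        if year == 0:
            # prevalence peak: the first screen sweeps up cases that would
            # have surfaced clinically over the next lead_time years
            true_cases += lead_time * incidence
        # overdiag_frac of all diagnoses are pseudo-disease
        overdiagnosed = true_cases * overdiag_frac / (1 - overdiag_frac)
        rates.append(true_cases + overdiagnosed)
    return rates

print(annual_diagnoses(5))                      # transient excess in year 0 only
print(annual_diagnoses(5, overdiag_frac=0.25))  # persistent excess every year
```

The real data behave like the second curve, not the first: the rate never falls back to baseline within periods of up to 10 years.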
In fact, none of the single-digit overdiagnosis rate figures clears this direct empirical test - they simply contradict reality. And reality shows in the actual BC incidence data over extended periods of time. Here's what it looks like for several countries, from Australia to Canada.
Note that the specific regions are New South Wales in Australia, Manitoba in Canada, and Akershus, Oslo, Rogaland, and Hordaland counties in Norway.
These windows into the real-life data show a consistent pattern: BC diagnosis incidence rises rapidly with the screening rate in the 50-69y age group, approaching or even exceeding the rate in the 70+y group (which is generally outside of screening programs), without falling back near the level expected without screening (dotted line with arrow).
A compensatory drop is seen as a drop of the observed rate vs. the expected rate in the post-screening age group (70+). It occurred in Canada, Sweden and Norway, but it is significantly smaller than the rate increase in the screening age group (50-69y). Australia, on the other hand, had a rate increase in the post-screening age group as well, which may be related to the rate increase in the pre-screening (40-49y) age group (perhaps a relatively significant number of women from both groups underwent opportunistic screening). Likewise, the compensatory drop in the post-screening age group in Canada may have been a result of factors not related to screening, which also caused the rate drop in the pre-screening age group.
Unlike "model-constructing" studies, this study also includes data for the 40-49y age group. Even if not invited, the BC diagnosis incidence in this group is very important for understanding the overall picture, because it reflects the presence and effects of the incidence factors other than screening (increase in BC awareness, use of preventive measures, changes in diet and lifestyle, and alike).
A brief summary of the results is given in the table.
The overall rate of overdiagnosis is estimated at 52% (assuming 10% of diagnosed cancers were CIS where the data covered invasive BC only - a rather conservative rate), and 35% for invasive BC only. That implies that 1 in 3 of all diagnosed BC, and 1 in 4 of diagnosed invasive BC, were pseudo-disease that would never have progressed to malignancy.
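The "1 in N" conversion uses the simple formula given with the definition at the top of this page (N = 1 + 100/OR). A quick check of the two figures (the function name is mine):

```python
def one_in(overdiagnosis_rate_pct):
    # N in "1 in N diagnosed women has pseudo-disease"
    return 1 + 100 / overdiagnosis_rate_pct

print(round(one_in(52), 1))  # 2.9 -> about 1 in 3 of all diagnosed BC
print(round(one_in(35), 1))  # 3.9 -> about 1 in 4 of invasive BC
```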
The four studies that came out after the USPSTF report add more support to the already solidly established view that screening results in an unacceptably high rate of overdiagnosis. And it is probably worse in the U.S.: screening sensitivity and the rate of false positives are significantly higher there than in Western Europe, and in most of the Western world, which have better-qualified radiologists as well as better procedures and standards.
Results of one of these studies - finding a near-zero overdiagnosis rate for women who entered the screening program in their 60s, sharply lower than the 30-40% rate for those who entered in their 50s - indicate that most pseudo-cancerous growths develop during the sixth decade of a woman's life, and regress to undetectable, or vanish, by the seventh.
That fits fairly well into the pattern outlined by the available data: women in their 50s are at the greatest risk of being diagnosed with breast malignancy without actually having it. The risk is still significant for those in their 40s, even younger - as many as 22 out of 110 consecutive autopsies of women averaging 39 years of age showed abnormal growth that would have been diagnosed as breast cancer, 20 of them being carcinoma in situ (Breast cancer and atypia among young and middle-aged women: a study of 110 medicolegal autopsies, Nielsen et al. 1987).
Since the chance of a CIS turning malignant is less than 30% on average, as many as 14 or so of them could have been diagnosed with breast cancer through screening despite their tissue change being biologically benign, or never progressing to symptomatic malignancy. That is about 13% incidence, higher than the lifetime risk of developing breast cancer.
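The autopsy arithmetic above, spelled out. The 30% progression figure is the upper bound stated in the text, used here as an assumption:

```python
autopsies = 110   # consecutive medicolegal autopsies (Nielsen et al. 1987)
cis_found = 20    # growths that would have been diagnosed as CIS
progression_chance = 0.30  # at most ~30% of CIS progress to invasive BC

# CIS cases that would never have turned malignant = potential overdiagnoses
non_progressing = cis_found * (1 - progression_chance)
print(non_progressing)              # ~14 cases
print(non_progressing / autopsies)  # ~0.13, i.e. about 13% of the women
```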
One of the main arguments for mammography screening was that earlier detection would expose women to less invasive treatment. The reality is, however, different. Higher detection sensitivity resulted in more invasive treatment - biopsies, mastectomies, lumpectomies, radiation - rather than less, and most of it was not needed.
Abnormal growths detected by mammography and diagnosed as breast cancer are all treated in generally the same way, as if they were real cancers. Thus women with pseudo-disease are treated unnecessarily, which is not only inconvenient and costly; it may - and often does - result in trauma, permanent psychological scars and disfigurement. It may also expose these women to risks of the unnecessary treatment itself, such as radiation.
There is relatively little research on overtreatment due to mammography screening. The American Cancer Society 2010 guidelines don't even mention overdiagnosis, so for them there is no overtreatment problem either. The USPSTF 2009 report doesn't even contain the word "overtreatment". Needless to say, screening proponents prefer that women don't know about it.
Still, the data we have are more than compelling:
● introduction of screening in Southeast Netherlands was followed by 71% and 84% increase in breast-preserving surgery and mastectomy, respectively, for invasive cancer only (Trends in breast-conserving surgery in the Southeast Netherlands, Gøtzsche 2002)
● CIS treated by mastectomy in the U.S. declined from 71% in 1983 to 40% in 1993 but, due to the surge in the number of CIS diagnosed, the number of mastectomies almost tripled (Increases in ductal carcinoma in situ (DCIS) of the breast in relation to mammography: a dilemma, Ernster and Barclay)
● from 1990 to 2001, number of mastectomies in the U.K. increased by 36% for invasive BC, and more than quadrupled for CIS (Mass breast screening: is there a hidden cost?, Douek and Baum 2003)
● analysis of available data for randomized controlled trials gives 31% more lumpectomies for the two adequate trials (Canada 1/2, Malmö), and 42% more for two inadequate ones (Stockholm, Kopparberg), in the screened population; 20% and 21% more mastectomies, respectively (same trials); 24% more radiotherapy for the screened population in Malmö, and 40% more in Kopparberg; and 37% less chemotherapy in Malmö and 6% more in Kopparberg, also for the screened population (Screening for breast cancer with mammography, Gøtzsche and Nielsen, 2006/2009)
The trial data are very incomplete, but do indicate a significantly higher overall rate of invasive treatment in the screened population. Considering that most of these trials took place in Sweden, with a significantly lower detection rate than nowadays in the U.S., it is to be expected that the U.S. overtreatment rates in the screened vs. unscreened population are even higher.
To a first approximation, the rate of overtreatment should be somewhat higher than that of overdiagnosis - that is, somewhat higher than about 1 in 4 diagnosed invasive breast cancer cases, and 1 in 3 of all diagnosed BC including CIS.
Mammography screening risks related to its inaccuracy - false negatives, false positives, overdiagnosis and the overtreatment mainly related to it - are its most significant negatives. But how much of a danger is the radiation dose delivered during screening mammography? This question has been around from the very beginning of screening mammography, yet it never seemed to get a clear answer. The following pages take an attempt at that.