Primary Care Corner with Geoffrey Modest MD: Radiologist Variability in Mammography Readings

By Dr. Geoffrey Modest

A recent article revealed dramatic variability in radiologists’ interpretations of mammographic breast density (see Sprague BL. Ann Intern Med 2016; 165: 457). Determining breast density accurately is certainly important, because increased breast density makes mammograms harder to read and is an independent risk factor for breast cancer. In this light, one prerequisite for us in primary care is that the radiologic determination of breast density be consistent and accurate. Details from this NIH-supported study:

  • Data from 216,783 screening mammograms from 145,123 women aged 40 to 89 were included, from 30 radiology facilities within three breast cancer screening research centers of the Population-based Research Optimizing Screening through Personalized Regimens (PROSPR) consortium.
  • 83 radiologists were involved; each interpreted at least 500 screening mammograms from 2011 to 2013, using the BI-RADS reporting system, along with patients’ age, race, and BMI.


  • Overall, 36.9% of mammograms were rated as showing dense breasts
  • Across radiologists, the finding of dense breasts ranged from 6.3% to 84.5% (median 38.7%, interquartile range 28.9% to 50.9%). !!!!
  • Variation in breast density assessment was pervasive in all but the most extreme patient age and BMI combinations
  • Among women with consecutive mammograms interpreted by different radiologists, 17.2% had discordant assessments of breast density.


  • One of the scariest issues to me as a clinician is that I need to rely on an accurate interpretation of medical tests in order to inform my patient management. The sheer magnitude of the variation in breast density assessment is quite striking.
  • There are also other studies, mostly 10-20 years old, showing that the general radiologic interpretation of mammograms has considerable variability as well.
  • There are certainly other tests that have significant variability and highlight this issue more broadly, for example finding significant spine MRI abnormalities in totally asymptomatic patients:
    • One study (see Jensen MC. N Engl J Med 1994; 331: 69) looked at 98 people without back pain, where their MRI scans were interpreted by two experienced neuroradiologists at the Cleveland Clinic, finding that 52% had a bulge in at least one intervertebral disc, 27% had a protrusion, and 1% had an extrusion. 38% had multilevel abnormalities. Only 36% had a normal MRI.
    • A systematic review (see Brinjikji W. AJNR 2015; 36: 811-816) found dramatic MRI or CT changes in asymptomatic people, which increased with age. For example, the prevalence of disc degeneration went from 37% at age 20 to 96% at age 80, bulges from 30% at age 20 to 84% at age 80, disc protrusion from 29% at age 20 to 43% at age 80, and annular fissure from 19% at age 20 to 29% at age 80. So, lots and lots of impressive disc changes even in asymptomatic 20-year-olds…
  • Another issue, which we tend to understand more intuitively, is that of ultrasounds, which are clearly operator-dependent. But we had a patient with chronic hepatitis B who had a “normal” screening RUQ ultrasound for hepatocellular cancer, but a CT revealed a 9 cm cancer!! I spoke with a trusted hepatologist, who commented that he used CT scans to screen really high-risk patients because of the variability of ultrasounds (though that is not exactly a clear-cut or generally accepted algorithm…).
  • One major concern about over-reading breast density (as well as potentially scaring patients that they might be at higher breast cancer risk) is that this finding often leads to further studies such as ultrasound, digital breast tomosynthesis, and MRI examination. There is minimal evidence to support these tests, and they may well lead to unnecessary biopsies, more radiation exposure, etc. And the USPSTF formally gives these procedures an “I” rating, for insufficient evidence.
  • And, another issue: half of the United States has legislation currently requiring disclosure of mammographic breast density, in some cases advising women to discuss supplemental screening tests with their providers if they have dense breasts (again without supportive medical evidence). And even the FDA is considering a legislative requirement to report breast density information to patients. I think there is a real concern about non-medical legislators enacting medical legislation, where legislators may be swayed by patients pleading for unproved treatments, perhaps with the support of an “expert witness”. Or, perhaps the legislature decides to require a certain treatment based on small or flawed studies and writes the treatment into law, but then new and better studies contradict this legislative imperative. One recent example is in Massachusetts, where a law was passed requiring insurance to cover long-term antibiotic therapy for chronic Lyme disease, though several studies, including a new one (see Berende A. N Engl J Med 2016; 374: 1209-1220), have not found benefit from long-term antibiotics. Or, in the past, there has been legislation supporting the availability of bone marrow transplants for women with breast cancer, but without any evidence of benefit (and with pretty clear harm). I do realize that there have been egregious, inappropriate treatment denials by some health insurers in the past, which have led to some of this legislation and to public/medical community outrage. But legislating medical diagnostics and therapies is fraught…
  • So, this inconsistency/unreliability in breast density interpretation may subject many people to potentially dangerous interventions. I think it is really important that we as clinicians understand that many procedures we order are subject to large variability, as above. So, what can we do??
    • Whenever possible, we should interpret these “objective” data in the context of the clinical situation of the patient, and not always reflexively respond to the test results (a test result is just another piece of data, like the history or physical, which should be put in the overall gestalt of what is going on with the patient). Of course, some of these objective findings, even unsuspected ones, may be very important and should not be dismissed (e.g., the incidental finding of early pancreatic or renal cancers).
    • Maybe we should consider getting second opinions more often than we currently do, to assess interobserver agreement.
    • Perhaps there should be triggers in place for certain findings (such as dense breasts on mammogram) requiring a blinded read by another radiologist. Or perhaps mammograms should always be automatically re-read by another radiologist? Or there could be an automatic second read whenever a radiologist comments “should be repeated in 3-6 months by another test”, a comment that puts the medicolegal imperative on us in primary care to do yet another test, with potentially more radiation exposure, cost, possible unnecessary procedures, etc.
    • Perhaps there needs to be much more transparency in the system overall, maybe requiring those reporting on these results to have regular standardized testing themselves and posting the results (sort of like requiring hospitals to report their C-section rates).
    • Perhaps we need a good computer program????
  • I realize this blog is more tangential than others, but I do think this issue of inconsistency in mammography reading brings up a slew of general issues in clinical medicine…
