Data: A New Weapon Against Breast Cancer

Pictured above: Industrial engineering professor Shengfan Zhang is using data mining and statistical modeling to end conflicting standards and find the best approach to cancer screening. | Photo by Matt Reynolds


Breast cancer trails only melanoma as the most common cancer, and it has one of the highest cancer-related death rates. Early detection and proper treatment are proven to increase the chances of survival, yet experts disagree on age and frequency standards for cancer screening.

That conflicting guidance can be confusing for patients, and it could lead some women to skip screening altogether.

Industrial engineering professor Shengfan Zhang is using data mining along with statistical and computational modeling techniques to solve this puzzle. By plugging survey data into formulas, Zhang is able to identify the best general approach to breast cancer screening. Her ultimate goal is more ambitious: To create a personalized system of breast cancer screening, one in which a doctor and patient sit down together and work out a customized plan based on the patient’s risk factors, preferences and the approach the data say works best.

“Right now, there’s only one protocol for everyone,” she said. “But we know that this doesn’t work. One size does not fit all. A physician should work with a patient to come up with the best option.”

The Problem with Mammograms

The importance of screening is found in the numbers. One in seven women will develop invasive breast cancer in her lifetime. Currently, the only effective way to screen for breast cancer is to get a mammogram, but there are several factors that make mammograms less than perfect. And the questions of when and how often are still being debated.

Frequency matters. Mammograms expose patients to X-ray radiation, which is in itself a risk factor for cancer. Zhang cites research concluding that each exposure to radiation increases a patient’s cancer risk by about 1 percent, and the cumulative increase in risk over a decade of screening is 5 percent for each breast.

Mammograms don’t give clear results; they produce images that must be interpreted by clinicians. Sometimes the interpreter can miss a malignancy, resulting in something called a false negative. The danger of this is obvious — the treatment that is so vital to a woman’s survival does not happen. Another danger is called interval cancer, or cancers that appear between screenings and grow undetected for months before another test is administered.

Less obvious are the risks associated with false positives and overdiagnosis. In the case of a false positive, the woman is told she has a possible tumor that turns out to be nothing. Overdiagnosis occurs when cancer is detected and treated, but the treatment is unnecessary, because the cancer would not have been a risk to the woman’s health. In both these cases, the patient undergoes treatments that cause stress and physical side effects and don’t result in any benefit.

For these reasons, experts disagree on when and how often women should get mammograms. The American Cancer Society recommends yearly screening starting at age 40. On the other hand, the U.S. Preventive Services Task Force recommends screening every two years starting at age 50.

The Question

Zhang and her graduate student, Mahboubeh Madadi, set out to identify which screening policy had the best balance of risks and benefits, but they also had to consider two other factors: Age and adherence to screening recommendations.

Age presents a paradox when it comes to cancer screening. While the risks of developing breast cancer increase with a patient’s age, breast cancers are less aggressive in older women and tumors are more responsive to treatment. Because of this, survival rates for breast cancer actually increase with age. In addition, the accuracy of mammograms also increases with age, because tissue in the breasts becomes less dense.

Zhang and Madadi also considered the degree to which women were likely to follow the policy. While the American Cancer Society and the U.S. Preventive Services Task Force crafted their recommendations under the assumption that women would faithfully follow them, the reality is that most women do not. According to data from the Centers for Disease Control and Prevention, more than 75 percent of women older than 40 had five or fewer mammograms between 1996 and 2009, considerably below the screening level recommended by the American Cancer Society.

Zhang and Madadi set out to answer a complicated question: Which screening policy would work best, both overall and for individual women, given that women skip screenings, that mammography carries its own risks and that both breast cancer incidence and risk vary with age?

First, the researchers had to establish what they meant by “work best.” They decided to measure outcomes in two ways. They considered how a screening policy affected a patient’s chance of dying from breast cancer. They also examined how the policy would affect quality of life by measuring a patient’s remaining quality-adjusted life years.
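As a rough illustration of the second measure, quality-adjusted life years are commonly computed by weighting each year of life by a health utility between 0 (death) and 1 (full health) and adding up the results. The sketch below assumes that simple form; the function name and the utility values are hypothetical and are not taken from Zhang and Madadi's study.

```python
def quality_adjusted_life_years(yearly_utilities):
    """Sum per-year health utilities, where 0 means death and 1 means full health."""
    return sum(yearly_utilities)


# Hypothetical patient: 10 years in full health, then 5 years at utility 0.7
# (for example, while living with treatment side effects).
utilities = [1.0] * 10 + [0.7] * 5
qalys = quality_adjusted_life_years(utilities)
print(f"Remaining QALYs: {qalys:.1f}")
```

Under these made-up numbers, 15 calendar years count as fewer quality-adjusted years, which is how the measure captures the cost of stress and side effects as well as length of life.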


The Process

Industrial engineers use mathematical tools to analyze data and optimize processes. Zhang and Madadi used several different statistical and operations research methods for their project.

In an earlier study, they had predicted how adherent different groups of women would be to a screening policy. They considered many different factors, including age, race, education, insurance coverage, family history, body mass index, eating habits, exercise habits and the subjects’ overall knowledge of breast cancer and mammography. The researchers determined the rate of adherence for women in different circumstances based on data collected by the Health Information National Trends Survey, which asked women if they planned to get a mammogram.

With this data, Zhang and Madadi predicted the likelihood that the general population would follow a policy.

Next, they used a method called the partially observable Markov decision process to evaluate different policies for three different cases: The general population, an individual patient and an imaginary population that would follow each policy to the letter.


Mathematical models drive cancer research.

The Markov decision process is a tool that models decision making in situations where the outcomes are results of both a decision and random chance. It shows relationships between different states and the actions that lead from one state to another. The type of Markov decision process used by Zhang and Madadi is called partially observable because the true health state of a patient is not fully discernible: a woman may have cancer, but this is not known with certainty until the cancer is detected. In this case, the researchers had to consider three possible states at the same time: A woman could have early-stage cancer, have advanced-stage cancer or be cancer free. At any point in the process, a patient’s state was represented as a combination of the probabilities that she had cancer at each stage and the probability that she had none.
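The idea of a state made of probabilities can be sketched with a standard Bayes update, which is how partially observable models typically revise their estimate after a new observation. The example below shows how a negative mammogram might shift the probabilities across the three health states. All numbers and names are hypothetical placeholders, not values from the researchers' model, and the sketch leaves out disease progression between screenings.

```python
STATES = ("cancer_free", "early_stage", "advanced_stage")


def update_belief(belief, p_observation_given_state):
    """Bayes update: reweight each state's probability by how likely the
    observation is in that state, then renormalize so the total is 1."""
    unnormalized = {s: belief[s] * p_observation_given_state[s] for s in STATES}
    total = sum(unnormalized.values())
    return {s: value / total for s, value in unnormalized.items()}


# Hypothetical belief before screening.
prior = {"cancer_free": 0.98, "early_stage": 0.015, "advanced_stage": 0.005}

# Hypothetical chance of a NEGATIVE mammogram in each true state
# (for the cancer states this is the false negative rate).
p_negative = {"cancer_free": 0.90, "early_stage": 0.20, "advanced_stage": 0.10}

posterior = update_belief(prior, p_negative)
for state in STATES:
    print(f"{state}: {prior[state]:.4f} -> {posterior[state]:.4f}")
```

A negative result raises the probability that the patient is cancer free but does not push it to 1, because a false negative remains possible.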

“The partially observable Markov decision process is used in many different areas, from robot planning and control to infrastructure maintenance problems,” Zhang said. “It is natural to apply this type of modeling to health care because of the uncertainty in disease and human behavior and the partially observable nature of the patient’s true condition.”

When a woman is due for a screening, she has the choice to undergo screening or skip the procedure. The decision she makes can lead to several different consequences. She could continue to have no symptoms, she could detect symptoms before her next screening, she could get a false negative result and then develop symptoms, or she could get a false positive result, which would lead to further testing that shows no sign of cancer.

If a patient gets a screening with true positive results, she moves out of the screening decision process and into treatment. By modeling the interaction between these states and actions for different groups of women, the researchers developed a realistic idea of which policy would lead to lower mortality and the best quality of life.
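The possible results of a single screening can be sketched as a small probability calculation. The example assumes a simplified two-way split (cancer or no cancer) and uses the standard definitions of sensitivity and specificity. The function name and the numbers are hypothetical and do not come from the study.

```python
def screening_outcome_probabilities(p_cancer, sensitivity, specificity):
    """Probability of each mammogram outcome for a patient who is screened.

    p_cancer:    assumed probability that the patient currently has cancer
    sensitivity: chance the test is positive when cancer is present
    specificity: chance the test is negative when cancer is absent
    """
    return {
        "true_positive": p_cancer * sensitivity,
        "false_negative": p_cancer * (1 - sensitivity),
        "false_positive": (1 - p_cancer) * (1 - specificity),
        "true_negative": (1 - p_cancer) * specificity,
    }


# Hypothetical inputs chosen only for illustration.
outcomes = screening_outcome_probabilities(
    p_cancer=0.01, sensitivity=0.80, specificity=0.90
)
for name, probability in outcomes.items():
    print(f"{name}: {probability:.3f}")
```

With inputs like these, false positives are far more likely than true positives simply because most screened women do not have cancer, which is one reason the costs of false positives weigh on the choice of policy.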

The results of their research suggest that adherence plays a big part in how well a screening policy works. Zhang and Madadi found that on average, women who follow screening policy recommendations have higher quality of life and lower mortality risk. But they found a big difference between the imaginary group that followed the policy exactly and the more realistic group that sometimes skipped screenings. For the perfect group, the U.S. Preventive Services Task Force policy resulted in higher quality-adjusted life years, suggesting that having a mammogram every other year is a good way to balance the benefits and risks of screening.

However, the realistic group and the individual cases fared better on the American Cancer Society policy, which recommends screening every year. Zhang explained that according to these results, screening policies should be more like speed limits, which are often set low with the expectation that many drivers will exceed them. Perhaps policy makers should account for the fact that patients will not follow their guidelines exactly and set the recommended screening frequency higher than necessary to make up for that.
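The speed-limit reasoning comes down to simple arithmetic, sketched below. The example assumes each recommended screening is attended independently with the same probability, which is a deliberate simplification and not the adherence model the researchers built. The function name and the 50 percent adherence figure are hypothetical.

```python
def expected_screenings(years, interval_years, adherence):
    """Expected number of mammograms actually received over a period,
    assuming each recommended screening is attended with probability `adherence`."""
    recommended = years // interval_years
    return recommended * adherence


# Hypothetical patient who attends half of her recommended screenings.
annual = expected_screenings(years=10, interval_years=1, adherence=0.5)
biennial = expected_screenings(years=10, interval_years=2, adherence=0.5)
print(f"Annual policy: {annual} expected screenings in 10 years")
print(f"Biennial policy: {biennial} expected screenings in 10 years")
```

In this toy case, a woman who attends half of her annual screenings ends up with the five mammograms per decade that the every-other-year policy calls for, while the same woman under the every-other-year policy falls well short of it.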

The Next Steps

Zhang plans to refine and build on this research in the future. She wants to revise her formulas to account for the fact that negative experiences with screening — like a false positive — could affect the way the patient follows a screening regimen in the future. She would also like to incorporate patients’ attitudes toward risk into her model. Risk-averse patients would be less likely to deviate from a policy, while risk-seeking individuals might be more comfortable taking chances with their screening schedules.

Zhang is currently working with Madadi to address overdiagnosis, which is considered by some to be the most important disadvantage of cancer screening. The goal of their research is to refine the screening program to reduce incidents of overdiagnosis. In addition, another graduate student, Fan Wang, is looking at a decision process that accounts for the different risks associated with false positive and false negative results.

“I am interested in conducting research that will have a significant impact,” explained Zhang, whose personal experience with family health issues motivated her to focus on health-care research. “In the field of medical decision-making, I can use my expertise to help patients and physicians make efficient and effective decisions that will lead to better health. In addition, I believe data-driven decision modeling can provide insights to health-care policy makers.”


Shengfan Zhang, Assistant Professor, Department of Industrial Engineering, College of Engineering

About The Author

Camilla Shumaker is the director of science and research communications. She writes about physics, chemistry, political science and other topics. Camilla can be reached at (479) 575-7422.
