TY - JOUR
T1 - Exploring the relationship between cancer incidence and the sustainable development goals through complex networks and machine learning
AU - Lo Sasso, Andrea
AU - Bellantuono, Loredana
AU - Omodei, Elisa
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/10/28
Y1 - 2025/10/28
N2 - The awareness that socioeconomic factors play a significant role in the potential onset of cancer is increasingly widespread; however, a clear understanding of the most influential factors is still lacking. In this study, we explore the relationship between cancer incidence, as recorded by the International Agency for Research on Cancer, and the environmental and socioeconomic well-being of countries, measured by the Sustainable Development Goals (SDGs) indicators. To identify relevant predictors of cancer incidence, we construct a weighted complex network where nodes represent SDG indicators, and links correspond to statistically significant correlations between them. We implement community detection to identify a subset of indicators that incorporates the non-redundant dataset’s information, and use the selected features for a machine learning prediction of cancer incidence rates. Furthermore, we highlight the most influential SDG indicators by means of an eXplainable Artificial Intelligence analysis. We find that not only health-related indicators play a key role in explaining cancer incidence, but also factors related to agriculture, resource availability, and water cleanliness. These findings provide insights into the complex interplay between socioeconomic, environmental, and health factors. This study aims to expand knowledge on non-intuitive associations related to cancer onset and may contribute to the development of effective public prevention policies.
KW - Cancer incidence
KW - Machine learning
KW - Network science
KW - SDG
KW - XAI
UR - https://www.scopus.com/pages/publications/105020917597
U2 - 10.1140/epjds/s13688-025-00590-6
DO - 10.1140/epjds/s13688-025-00590-6
M3 - Article
SN - 2193-1127
VL - 14
JO - EPJ Data Science
JF - EPJ Data Science
IS - 1
M1 - 76
ER -