TY - JOUR
T1 - A new approach to estimate neighborhood socioeconomic status using supermarket transactions and GNNs
AU - Cruz, Eduardo
AU - Villavicencio, Monica
AU - Vaca, Carmen
AU - Espín-Noboa, Lisette
AU - Verdezoto, Nervo
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/1/17
Y1 - 2025/1/17
AB - Ending poverty in all its forms everywhere remains the number one Sustainable Development Goal of the United Nations 2030 Agenda. Governments face challenges in measuring socioeconomic status with fine spatial resolution because traditional data collection methods, such as censuses and surveys, are time-consuming, labor-intensive, performed at long intervals, and cover only a limited population. This work is a data-driven study that analyzes the digital traces left by humans in supermarket transactions and models the relationship between consumption behavior and average per capita income, proposing a proxy to estimate socioeconomic status at the urban neighborhood level. We analyze more than 20 million supermarket shopping transactions in Guayaquil, the most populous city in Ecuador. Using customer consumption data, we created a basket graph and fed it into a graph neural network to predict neighborhood socioeconomic status. The model was trained with spectral and spatial convolutional filters, using cross-validation to select the best approach for the prediction. The results show that the Chebyshev spectral convolutional filter has the highest predictive power for neighborhood socioeconomic status, with R² = 0.91. Our proposed approach contributes to measuring socioeconomic status at the neighborhood level to support policymakers in making informed decisions about resource allocation according to the needs of different geographical areas.
KW - Basket graph
KW - Graph neural network
KW - Item embedding
KW - Neighborhood socioeconomic status
KW - Per capita income
KW - Spectral convolutional filter
UR - http://www.scopus.com/inward/record.url?scp=85217569391&partnerID=8YFLogxK
U2 - 10.1140/epjds/s13688-024-00515-9
DO - 10.1140/epjds/s13688-024-00515-9
M3 - Article
AN - SCOPUS:85217569391
SN - 2193-1127
VL - 14
JO - EPJ Data Science
JF - EPJ Data Science
IS - 1
M1 - 3
ER -