TY - JOUR
T1 - Hidden citations obscure true impact in science
AU - Meng, Xiangyi
AU - Varol, Onur
AU - Barabási, Albert-László
N1 - © The Author(s) 2024. Published by Oxford University Press on behalf of National Academy of Sciences.
PY - 2024/5
Y1 - 2024/5
N2 - References, the mechanism scientists rely on to signal previous knowledge, lately have turned into widely used and misused measures of scientific impact. Yet, when a discovery becomes common knowledge, citations suffer from obliteration by incorporation. This leads to the concept of hidden citation, representing a clear textual credit to a discovery without a reference to the publication embodying it. Here, we rely on unsupervised interpretable machine learning applied to the full text of each paper to systematically identify hidden citations. We find that for influential discoveries hidden citations outnumber citation counts, emerging regardless of publishing venue and discipline. We show that the prevalence of hidden citations is not driven by citation counts, but rather by the degree of the discourse on the topic within the text of the manuscripts, indicating that the more discussed is a discovery, the less visible it is to standard bibliometric analysis. Hidden citations indicate that bibliometric measures offer a limited perspective on quantifying the true impact of a discovery, raising the need to extract knowledge from the full text of the scientific corpus.
KW - catchphrase
KW - foundational paper
KW - hidden citation
KW - latent Dirichlet allocation
KW - science of science
UR - http://www.scopus.com/inward/record.url?scp=85193429442&partnerID=8YFLogxK
U2 - 10.1093/pnasnexus/pgae155
DO - 10.1093/pnasnexus/pgae155
M3 - Article
C2 - 38715726
AN - SCOPUS:85193429442
SN - 2752-6542
VL - 3
JO - PNAS Nexus
JF - PNAS Nexus
IS - 5
M1 - pgae155
ER -