TY - JOUR
T1 - Origins of Algorithmic Instabilities in Crowdsourced Ranking
AU - Burghardt, Keith
AU - Hogg, Tad
AU - D'Souza, Raissa
AU - Lerman, Kristina
AU - Posfai, Marton
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/10/14
Y1 - 2020/10/14
N2 - Crowdsourcing systems aggregate decisions of many people to help users quickly identify high-quality options, such as the best answers to questions or interesting news stories. A long-standing issue in crowdsourcing is how option quality and human judgement heuristics interact to affect collective outcomes, such as the perceived popularity of options. We address this issue by conducting a controlled experiment in which subjects choose between two ranked options whose quality can be independently varied. We use these data to construct a model that quantifies how judgement heuristics and option quality combine when deciding between two options. The model reveals that popularity ranking can be unstable: unless the quality difference between the two options is sufficiently large, the higher-quality option is not guaranteed to be eventually ranked on top. To rectify this instability, we create an algorithm that accounts for judgement heuristics to infer the best option and rank it first. This algorithm is guaranteed to be optimal if the data match the model. When the data do not match the model, simulations show that in practice the algorithm performs at least as well as, and often better than, popularity-based and recency-based ranking for any two-choice question. Our work suggests that algorithms relying on inference of mathematical models of user behavior can substantially improve outcomes in crowdsourcing systems.
KW - algorithmic bias
KW - algorithmic instability
KW - crowd-wisdom
KW - decision-formation
UR - http://www.scopus.com/inward/record.url?scp=85094199709&partnerID=8YFLogxK
U2 - 10.1145/3415237
DO - 10.1145/3415237
M3 - Article
AN - SCOPUS:85094199709
SN - 2573-0142
VL - 4
JO - Proceedings of the ACM on Human-Computer Interaction
JF - Proceedings of the ACM on Human-Computer Interaction
IS - CSCW2
M1 - 166
ER -