TY - GEN
T1 - Relevance-based Infilling for Natural Language Counterfactuals
AU - Betti, Lorenzo
AU - Abrate, Carlo
AU - Bonchi, Francesco
AU - Kaltenbrunner, Andreas
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 979-8-4007-0124-5/23/10...$15.00.
PY - 2023/10/21
Y1 - 2023/10/21
N2 - Counterfactual explanations are a natural way for humans to gain understanding and trust in the outcomes of complex machine learning algorithms. In the context of natural language processing, generating counterfactuals is particularly challenging as it requires the generated text to be fluent, grammatically correct, and meaningful. In this study, we improve the current state of the art for the generation of such counterfactual explanations for text classifiers. Our approach, named RELITC (Relevance-based Infilling for Textual Counterfactuals), builds on the idea of masking a fraction of text tokens based on their importance in a given prediction task and employs a novel strategy, based on the entropy of their associated probability distributions, to determine the infilling order of these tokens. Our method uses less time than competing methods to generate counterfactuals that require less changes, are closer to the original text and preserve its content better, while being competitive in terms of fluency. We demonstrate the effectiveness of the method on four different datasets and show the quality of its outcomes in a comparison with human generated counterfactuals.
AB - Counterfactual explanations are a natural way for humans to gain understanding and trust in the outcomes of complex machine learning algorithms. In the context of natural language processing, generating counterfactuals is particularly challenging as it requires the generated text to be fluent, grammatically correct, and meaningful. In this study, we improve the current state of the art for the generation of such counterfactual explanations for text classifiers. Our approach, named RELITC (Relevance-based Infilling for Textual Counterfactuals), builds on the idea of masking a fraction of text tokens based on their importance in a given prediction task and employs a novel strategy, based on the entropy of their associated probability distributions, to determine the infilling order of these tokens. Our method uses less time than competing methods to generate counterfactuals that require less changes, are closer to the original text and preserve its content better, while being competitive in terms of fluency. We demonstrate the effectiveness of the method on four different datasets and show the quality of its outcomes in a comparison with human generated counterfactuals.
KW - counterfactuals
KW - explainability
KW - masked language model
KW - NLP
UR - https://www.scopus.com/pages/publications/85178117566
U2 - 10.1145/3583780.3615029
DO - 10.1145/3583780.3615029
M3 - Conference contribution
AN - SCOPUS:85178117566
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 88
EP - 98
BT - CIKM 2023 - Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023
Y2 - 21 October 2023 through 25 October 2023
ER -