Explaining Classifiers’ Outputs with Causal Models and Argumentation

A Rago, F Russo, E Albini, F Toni, P Baroni

Research output: Contribution to journal › Article › peer-review

Abstract

We introduce a conceptualisation for generating argumentation frameworks (AFs) from causal models for the purpose of forging explanations for models’ outputs. The conceptualisation is based on reinterpreting properties of semantics of AFs as explanation moulds, which are means for characterising argumentative relations. We demonstrate our methodology by reinterpreting the property of bi-variate reinforcement in bipolar AFs, showing how the extracted bipolar AFs may be used as relation-based explanations for the outputs of causal models. We then evaluate our method empirically when the causal models represent (Bayesian and neural network) machine learning models for classification. The results show advantages over a popular approach from the literature, both in highlighting specific relationships between feature and classification variables and in generating counterfactual explanations with respect to a commonly used metric.
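To make the abstract's central object concrete: a bipolar AF is a set of arguments together with an attack relation and a support relation. The sketch below is purely illustrative and not from the paper; the class name `BipolarAF` and the toy arguments (`f1`, `f2` for features, `c` for a classification) are our own assumptions about how feature arguments might relate to a classification argument.

```python
from dataclasses import dataclass, field

@dataclass
class BipolarAF:
    """A bipolar argumentation framework: arguments plus attack and support relations.

    Illustrative sketch only; not the paper's implementation.
    """
    arguments: set = field(default_factory=set)
    attacks: set = field(default_factory=set)   # pairs (a, b): argument a attacks b
    supports: set = field(default_factory=set)  # pairs (a, b): argument a supports b

    def attackers(self, b):
        """Arguments that attack b."""
        return {a for (a, x) in self.attacks if x == b}

    def supporters(self, b):
        """Arguments that support b."""
        return {a for (a, x) in self.supports if x == b}

# Toy relation-based explanation: one feature argues against the
# classification argument c, another argues for it.
af = BipolarAF(
    arguments={"f1", "f2", "c"},
    attacks={("f1", "c")},
    supports={("f2", "c")},
)
```

In a relation-based explanation of the kind the abstract describes, the attack and support edges incident to the classification argument summarise how individual features bear on the model's output.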
Original language: English
Pages (from-to): 421-449
Number of pages: 29
Journal: Journal of Applied Logics
Volume: 10
Issue number: 3
Publication status: Published - 1 May 2023
