An interventionist approach to explainable AI

Artificial Intelligence (AI) is increasingly used for decision-making in critical areas such as medical diagnosis, loan applications, and parole decisions. However, the opacity of many AI systems has led to serious harm, including discrimination against protected groups. For example, AI systems in the USA have been known to systematically deny mortgages to Black applicants, and AI systems used for hiring decisions have been shown to discriminate based on gender.

Without knowing how an AI system works, it can be difficult to identify and manage these problems. This research project aims to address this issue by developing a novel approach to explainable AI (XAI) based on interventionism – a philosophical theory of explanation designed to yield causal understanding.

The project will identify philosophical challenges in counterfactual explanations that apply to AI systems and determine how we can build XAI systems using interventionism to address these challenges. Evaluating the effectiveness of interventionist approaches could provide a better basis for AI-informed decision-making.
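As a rough illustration of what a counterfactual explanation looks like in practice, the sketch below uses a toy loan rule and searches for the smallest income that would have flipped a denial into an approval. The model, its threshold and the function names are all hypothetical and are not drawn from the project.

```python
# Toy, hypothetical loan model for illustration only (not the project's system).
def approve(income, debt):
    """Approve when income minus twice the debt reaches a fixed threshold."""
    return income - 2 * debt >= 50


def income_counterfactual(income, debt, step=1, limit=1000):
    """Return the smallest higher income (in increments of `step`, up to
    `limit`) at which a denied applicant would be approved, else None."""
    if approve(income, debt):
        return None  # already approved, so there is no denial to explain
    for candidate in range(income + step, limit + 1, step):
        if approve(candidate, debt):
            return candidate
    return None  # no approving income found within the search limit


if __name__ == "__main__":
    needed = income_counterfactual(income=60, debt=10)
    print(f"Approved had income been {needed} rather than 60.")
```

An explanation of this kind tells an applicant what would have had to differ for the decision to change, which is the style of explanation the project examines.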

The research team will conduct user studies to assess perceptions of both counterfactual and interventionist explanations. Using a survey-based methodology on the Prolific platform, participants will rate AI system outputs based on helpfulness, trustworthiness, informativeness, and understandability.

MDAP’s expertise in compute infrastructure, data management, statistical analysis, study design and developing ethics applications will help ensure that the approach to explainability we develop is conceptually sound, is applicable to real AI systems and provides the right explanations to participants.

This project is expected to have an impact on both philosophy and XAI, extending philosophical approaches to interventionism and producing a new approach to explainability that combines insights from philosophy and computer science. The research aims to improve AI decision-making, enabling individuals to better understand and challenge AI decisions. This could lead to fairer automated systems in Australian institutions, potentially bringing significant social, commercial, and economic benefits by mitigating the risks associated with unexplained AI systems.

Who's involved

Chief Investigator

Associate Professor Sam Baron, School of Historical and Philosophical Studies, University of Melbourne

Co-investigators

Associate Professor Piers Howe, Melbourne School of Psychological Sciences, University of Melbourne

Professor Liz Sonenberg, Pro Vice Chancellor Systems Innovation, Chancellery Research and Enterprise, School of Computing and Information Systems, University of Melbourne

MDAP research collaborators

Dr Aleks Michalewicz, Kim Doyle and Dr Mel Mistica