XSub: Explanation-Driven Adversarial Attack against Blackbox Classifiers via Feature Substitution

Kiana Vu, Phung Lai, Truc Nguyen

Research output: Contribution to conference › Paper

2 Scopus Citations

Abstract

Despite its significant benefits in enhancing the transparency and trustworthiness of artificial intelligence (AI) systems, explainable AI (XAI) can unintentionally provide adversaries with insights into blackbox models, increasing their vulnerability to various attacks. In this paper, we develop a novel explanation-driven adversarial attack against blackbox classifiers based on feature substitution, called XSub. The key idea of XSub is to strategically replace important features (identified via XAI) in the original sample with corresponding important features of a different label, thereby increasing the likelihood of the model misclassifying the perturbed sample. XSub only requires a minimal number of queries and can be easily extended to launch backdoor attacks in case the attacker has access to the model's training data. Our evaluation shows that XSub is not only effective and stealthy but also low-cost, showcasing its feasibility across a wide range of AI applications.
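To illustrate the feature-substitution idea described in the abstract, the sketch below shows one possible way an attacker could realize it against a blackbox classifier. This is not the authors' implementation; the helper names (`predict_proba`, `occlusion_importance`, `xsub_style_attack`) and the use of a simple occlusion-based attribution in place of a dedicated XAI method are illustrative assumptions.

```python
# Hypothetical sketch of explanation-driven feature substitution (not the paper's code).
# Assumes: a blackbox classifier exposed as predict_proba(x) returning class
# probabilities for a flat feature vector x, and a "guide" sample of another label.
import numpy as np

def occlusion_importance(predict_proba, x, label, baseline=0.0):
    """Score each feature by how much masking it lowers the probability of `label`.
    Stands in for whatever XAI attribution method the attacker queries (e.g., SHAP)."""
    base_p = predict_proba(x)[label]
    scores = np.zeros(x.size, dtype=float)
    for i in range(x.size):
        x_masked = x.copy()
        x_masked[i] = baseline
        scores[i] = base_p - predict_proba(x_masked)[label]
    return scores

def xsub_style_attack(predict_proba, x, y_true, guide_x, guide_y, k=5):
    """Replace the k features most important for x's true label with the values of
    the k features most important for the guide sample's (different) label."""
    src_scores = occlusion_importance(predict_proba, x, y_true)
    guide_scores = occlusion_importance(predict_proba, guide_x, guide_y)
    src_idx = np.argsort(src_scores)[::-1][:k]      # most influential for the true label
    guide_idx = np.argsort(guide_scores)[::-1][:k]  # most influential for the guide label
    x_adv = x.copy()
    x_adv[src_idx] = guide_x[guide_idx]             # substitute feature values
    return x_adv
```

Under these assumptions, the attacker only needs query access to the model's output probabilities; the number of queries is governed by the attribution step, which is consistent with the abstract's claim that the attack requires a minimal number of queries.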
Original language: American English
Number of pages: 6
DOIs
State: Published - 2025
Event: 2024 IEEE International Conference on Big Data (IEEE BigData 2024) - Washington, DC
Duration: 15 Dec 2024 - 18 Dec 2024

Conference

Conference: 2024 IEEE International Conference on Big Data (IEEE BigData 2024)
City: Washington, DC
Period: 15/12/24 - 18/12/24

NREL Publication Number

  • NREL/CP-2C00-91278

Keywords

  • adversarial attack
  • adversarial machine learning
  • backdoor attack
  • explainable AI
