Constructing adversarial examples to investigate the plausibility of explanations in deep audio and image classifiers

Katharina Hoedt; Verena Praher; Arthur Flexer

Document information

Title:

Constructing adversarial examples to investigate the plausibility of explanations in deep audio and image classifiers

Authors:

Katharina Hoedt
Verena Praher
Arthur Flexer

Year of publication:

2022

Publisher:

Springer

Abstract:

Given the rise of deep learning and its inherent black-box nature, the desire to interpret these systems and explain their behaviour has become increasingly prominent. The main idea of so-called explainers is to identify which features of particular samples have the most influence on a classifier’s prediction, and to present them as explanations. Evaluating explainers, however, is difficult, for reasons such as the lack of ground truth. In this work, we construct adversarial examples to check the plausibility of explanations, deliberately perturbing the input to change a classifier’s prediction. This allows us to investigate whether explainers are able to detect these perturbed regions as the parts of an input that strongly influence a particular classification. Our results from the audio and image domains suggest that the investigated explainers often fail to identify the input regions most relevant for a prediction; hence, it remains questionable whether explanations are useful or potentially misleading.
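
The evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear classifier, a simple signed-gradient perturbation restricted to a chosen region, and an occlusion-style attribution as the "explainer". All function names, weights, and inputs below are hypothetical and chosen only for illustration. The sketch perturbs a known region until the label flips, then measures how many of the top-ranked attributed features fall inside that region.

```python
import numpy as np

def predict(w, b, x):
    """Toy binary linear classifier: 1 if the logit is positive, else 0."""
    return int(x @ w + b > 0)

def perturb_region(w, b, x, region, step=0.05, max_iter=1000):
    """Change only the features in `region` until the predicted label flips."""
    original = predict(w, b, x)
    direction = np.zeros_like(x)
    # For a linear model the gradient of the logit w.r.t. x is w, so moving
    # against sign(w) lowers the logit and moving along it raises the logit.
    direction[region] = np.sign(w[region]) * (-1.0 if original == 1 else 1.0)
    x_adv = x.copy()
    for _ in range(max_iter):
        if predict(w, b, x_adv) != original:
            return x_adv
        x_adv = x_adv + step * direction
    raise RuntimeError("label did not flip within max_iter steps")

def occlusion_attribution(w, x, baseline=0.0):
    """Per-feature attribution: size of the logit change when a feature is
    replaced by `baseline` (exact for a linear model)."""
    return np.abs(w * (x - baseline))

def region_overlap(attribution, region, k):
    """Fraction of the k highest-attributed features that lie in `region`."""
    top_k = np.argsort(attribution)[::-1][:k]
    return len(set(top_k.tolist()) & set(int(i) for i in region)) / k

# Hypothetical weights and input, chosen only for illustration.
w = np.array([0.5, -1.0, 2.0, 0.3, -0.7, 1.5])
b = 0.1
x = np.array([1.0, 0.2, 0.8, -0.5, 0.4, 0.3])
region = [3, 4]  # the only features the perturbation may change

x_adv = perturb_region(w, b, x, region)
overlap = region_overlap(occlusion_attribution(w, x_adv), region, k=len(region))
print("label before:", predict(w, b, x), "after:", predict(w, b, x_adv))
print("share of top-ranked features inside the perturbed region:", overlap)
```

An overlap well below 1 means the attribution ranks features outside the perturbed region above features that were responsible for the label change, which is the kind of mismatch the abstract reports for the investigated explainers.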

Description:

CC BY

URI:

https://link.springer.com/article/10.1007/s00521-022-07918-7
https://dlib.phenikaa-uni.edu.vn/handle/PNK/8325

Collection

OER - Information Technology


Attached files:

Constructing adversarial examples to investigate the plausibility of explanations in deep audio and image classifiers-2022.pdf
Restricted Access

Size: 3.17 MB
Format: Adobe PDF
