SCOUTER: Slot attention-based classifier for explainable image recognition

Abstract

Explainable artificial intelligence has been gaining attention in the past few years. However, most existing methods are based on gradients or intermediate features, which are not directly involved in the decision-making process of the classifier. In this paper, we propose a slot attention-based classifier called SCOUTER for transparent yet accurate classification. Two major differences from other attention-based methods include: (a) SCOUTER’s explanation is involved in the final confidence for each category, offering more intuitive interpretation, and (b) all the categories have their corresponding positive or negative explanation, which tells "why the image is of a certain category" or "why the image is not of a certain category." We design a new loss tailored for SCOUTER that controls the model’s behavior to switch between positive and negative explanations, as well as the size of explanatory regions. Experimental results show that SCOUTER can give better visual explanations in terms of various metrics while keeping good accuracy on small and medium-sized datasets.

Publication
Proc.~IEEE/CVF International Conference on Computer Vision (ICCV)
Liangzhi Li
Liangzhi Li
Guest Assistant Professor

His research interests lie in deep learning, computer vision, robotics, and medical images.

Bowen Wang
Bowen Wang
Specially-Appointed Researcher/Fellow
Manisha Verma
Manisha Verma
Specially-Appointed Researcher/Fellow

Manisha’s research interest broadly lies in computer vision and image processing. Currently, she is working on micro facial expression recognition using multi-model deep learning frameworks.

Yuta Nakashima
Yuta Nakashima
Professor

Yuta Nakashima is a professor with Institute for Datability Science, Osaka University. His research interests include computer vision, pattern recognition, natural langauge processing, and their applications.

Hajime Nagahara
Hajime Nagahara
Professor

He is working on computer vision and pattern recognition. His main research interests lie in image/video recognition and understanding, as well as applications of natural language processing techniques.