SCOUTER: Slot attention-based classifier for explainable image recognition
Liangzhi Li, Bowen Wang, Manisha Verma, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara
November 2021
Abstract
Explainable artificial intelligence has been gaining attention in the past few years. However, most existing methods are based on gradients or intermediate features, which are not directly involved in the decision-making process of the classifier. In this paper, we propose a slot attention-based classifier called SCOUTER for transparent yet accurate classification. Two major differences from other attention-based methods include: (a) SCOUTER’s explanation is involved in the final confidence for each category, offering more intuitive interpretation, and (b) all the categories have their corresponding positive or negative explanation, which tells "why the image is of a certain category" or "why the image is not of a certain category." We design a new loss tailored for SCOUTER that controls the model’s behavior to switch between positive and negative explanations, as well as the size of explanatory regions. Experimental results show that SCOUTER can give better visual explanations in terms of various metrics while keeping good accuracy on small and medium-sized datasets.
Publication
Proc.~IEEE/CVF International Conference on Computer Vision (ICCV)
Specially-Appointed Assistant Professor
His research interests lie in deep learning, computer vision, robotics, and medical images.
Specially-Appointed Researcher/Fellow
Specially-Appointed Researcher/Fellow
Manisha’s research interest broadly lies in computer vision and image processing. Currently, she is working on micro facial expression recognition using multi-model deep learning frameworks.
Associate Professor
Yuta Nakashima is an associate professor with Institute for Datability Science, Osaka University. His research interests include computer vision, pattern recognition, natural langauge processing, and their applications.
Professor
He is working on computer vision and pattern recognition. His main research interests lie in image/video recognition and understanding, as well as applications of natural language processing techniques.