Explain me the painting: Multi-topic knowledgeable art description generation


Have you ever looked at a painting and wondered what the story behind it is? This work presents a framework to bring art closer to people by generating comprehensive descriptions of fine-art paintings. Generating informative descriptions for artworks, however, is extremely challenging, as it requires 1) describing multiple aspects of the image, such as its style, content, or composition, and 2) providing background and contextual knowledge about the artist, their influences, or the historical period. To address these challenges, we introduce a multi-topic and knowledgeable art description framework, which modulates the generated sentences according to three artistic topics and, additionally, enhances each description with external knowledge. The framework is validated through an exhaustive analysis, both quantitative and qualitative, as well as a comparative human evaluation, demonstrating outstanding results in terms of both topic diversity and information veracity.

Proc. IEEE/CVF International Conference on Computer Vision (ICCV)
Yuta Nakashima
Associate Professor

Yuta Nakashima is an associate professor with the Institute for Datability Science, Osaka University. His research interests include computer vision, pattern recognition, natural language processing, and their applications.

Noa Garcia
Specially-Appointed Assistant Professor

Her research interests lie in computer vision and machine learning applied to visual retrieval and joint models of vision and language for high-level understanding tasks.