Built year prediction from Buddha face with heterogeneous labels

Abstract

Buddha statues are a part of human culture, especially of the Asia area, and they have been alongside human civilisation for more than 2,000 years. As history goes by, due to wars, natural disasters, and other reasons, the records that show the built years of Buddha statues went missing, which makes it an immense work for historians to estimate the built years. In this paper, we pursue the idea of building a neural network model that automatically estimates the built years of Buddha statues based only on their face images. Our model uses a loss function that consists of three terms: an MSE loss that provides the basis for built year estimation; a KL divergence-based loss that handles the samples with both an exact built year and a possible range of built years (e.g., dynasty or centuries) estimated by historians; finally a regularisation that utilises both labelled and unlabelled samples based on manifold assumption. By combining those three terms in the training process, we show that our method is able to estimate built years for given images with 37.5 years of a mean absolute error on the test set.

Publication
Proc.~Workshop on Structuring and Understanding of Multimedia Heritage Contents (SUMAC)
Yiming Qian
Yiming Qian
Guest Researcher

Manisha’s research interest broadly lies in computer vision and image processing. Currently, she is working on micro facial expression recognition using multi-model deep learning frameworks.

Yuta Nakashima
Yuta Nakashima
Professor

Yuta Nakashima is a professor with Institute for Datability Science, Osaka University. His research interests include computer vision, pattern recognition, natural langauge processing, and their applications.

Benjamin Renoust
Benjamin Renoust
Guest Associate Professor
Hajime Nagahara
Hajime Nagahara
Professor

He is working on computer vision and pattern recognition. His main research interests lie in image/video recognition and understanding, as well as applications of natural language processing techniques.