Action recognition from a single coded image

Tadashi Okawara, Michitaka Yoshida, Hajime Nagahara, Yasushi Yagi

January 2020

Abstract

Cameras are now prevalent in society: examples include surveillance cameras, camera-equipped smartphones, and smart speakers. There is an increasing demand to analyze human actions captured by these cameras, either to detect unusual behavior or as part of a man-machine interface for Internet of Things (IoT) devices. For a camera, there is a trade-off between spatial resolution and frame rate. A feasible approach to overcoming this trade-off is compressive video sensing, which uses random coded exposure to reconstruct, from a single coded image, a video whose frame rate is higher than the sensor's readout frame rate. Because such an image encodes the temporal information needed to reconstruct a video, it should also be possible to recognize an action in a scene directly from a single coded image. In this paper, we propose reconstruction-free action recognition from a single coded exposure image. We also propose a deep sensing framework that models both camera sensing and classification as a convolutional neural network (CNN) and jointly optimizes the coded exposure pattern and the classification model. We demonstrate that the proposed method can recognize human actions from only a single coded image. We also compare it with competing inputs, such as low-resolution video with a high frame rate and a single high-resolution frame, in simulation and real experiments.
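The coded exposure capture described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`random_exposure_mask`, `capture_coded_image`, `moving_dot_video`), the per-pixel random binary mask, and the toy video are all assumptions chosen for illustration. The sketch only shows how a single image can retain information from several sub-frames when each pixel is exposed during a different subset of them.

```python
import random


def random_exposure_mask(num_frames, height, width, seed=0):
    """Hypothetical helper: a random binary mask, 1 = pixel exposed in that sub-frame."""
    rng = random.Random(seed)
    return [[[rng.randint(0, 1) for _ in range(width)]
             for _ in range(height)]
            for _ in range(num_frames)]


def capture_coded_image(video, mask):
    """Simulate one coded exposure: per-pixel sum over time of mask * video.

    video[t][y][x] is the scene intensity in sub-frame t;
    mask[t][y][x] is 0 or 1 (assumed binary exposure code).
    """
    num_frames = len(video)
    height = len(video[0])
    width = len(video[0][0])
    return [[sum(mask[t][y][x] * video[t][y][x] for t in range(num_frames))
             for x in range(width)]
            for y in range(height)]


def moving_dot_video(num_frames, height, width):
    """Toy scene (assumption): a bright dot moving one pixel right per sub-frame."""
    video = [[[0.0] * width for _ in range(height)] for _ in range(num_frames)]
    for t in range(num_frames):
        video[t][height // 2][t % width] = 1.0
    return video


video = moving_dot_video(num_frames=4, height=3, width=4)
mask = random_exposure_mask(num_frames=4, height=3, width=4)
coded = capture_coded_image(video, mask)  # one 3x4 image summarizing 4 sub-frames
```

With an all-ones mask, `capture_coded_image` reduces to an ordinary long exposure (the plain sum of the sub-frames), and with a mask that is open only in the first sub-frame it returns that frame alone. In the framework the abstract describes, the exposure pattern is optimized jointly with the classifier rather than drawn at random as it is here.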

Publication type

Conference paper

Published in

Proceedings - 2020 IEEE International Conference on Computational Photography (ICCP)

Michitaka Yoshida

Doctoral student

Hajime Nagahara

Professor