Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization

January 2023

Abstract

Video summarization aims to select a most informative subset of frames in a video to facilitate efficient video browsing. Unsupervised methods usually rely on heuristic training objectives such as diversity and representativeness. However, such methods need to bootstrap the online-generated summaries to compute the objectives for importance score regression. We consider such a pipeline inefficient and seek to directly quantify the frame-level importance with the help of contrastive losses in the representation learning literature. Leveraging the contrastive losses, we propose three metrics featuring a desirable key frame: local dissimilarity, global consistency, and uniqueness. With features pre-trained on an image classification task, the metrics can already yield high-quality importance scores, demonstrating better or competitive performance compared with past heavily-trained methods. We show that by refining the pre-trained features with contrastive learning, the frame-level importance scores can be further improved, and the model can learn from random videos and generalize to test videos with decent performance.

Publication type

Conference paper

Published in

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision

Authors

Zongshang Pang

Doctoral student

Yuta Nakashima (中島悠太)

Professor

Hajime Nagahara (長原一)

Professor