Publications

Automated grading system of retinal arterio-venous crossing patterns: A deep learning approach replicating ophthalmologist’s diagnostic process of arteriolosclerosis

The morphological feature of retinal arterio-venous crossing patterns is a valuable source of cardiovascular risk stratification as it …

Liangzhi Li, Manisha Verma, Bowen Wang, 中島悠太, 長原一, Ryo Kawasaki

Analyzing Font Style Usage and Contextual Factors in Real Images

There are various font styles in the world. Different styles give different impressions and readability. This paper analyzes the …

Naoya Yasukochi, 早志英朗, Daichi Haraguchi, Seiichi Uchida

ACT2G: Attention-based Contrastive Learning for Text-to-Gesture Generation

Recent increase of remote-work, online meeting and tele-operation task makes people find that gesture for avatars and communication …

Hitoshi Teshima, Naoki Wake, Diego Thomas, 中島悠太, Hiroshi Kawasaki, Katsushi Ikeuchi

Quantifying Societal Bias Amplification in Image Captioning

Vision-and-language tasks have increasingly drawn more attention as a means to evaluate human-like reasoning in machine learning …

廣田裕亮, 中島悠太, Noa Garcia

Gender and Racial Bias in Visual Question Answering Datasets

廣田裕亮, 中島悠太, Noa Garcia

AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

Evaluation measures have a crucial impact on the direction of research. Therefore, it is of utmost importance to develop appropriate …

Riku Togashi, Mayu Otani, 中島悠太, Janne Heikkilä, Esa Rahtu, Tetsuya Sakai

Acquiring a Dynamic Light Field Through a Single-Shot Coded Image

We propose a method for compressively acquiring a dynamic light field (a 5-D volume) through a single-shot coded image (a 2-D …

Ryoya Mizuno, Keita Takahashi, 吉田道隆, Chihiro Tsutake, Toshiaki Fujii, 長原一

Information Extraction from Public Meeting Articles

Public meeting articles are the key to understanding the history of public opinion and public sphere in Australia. Information …

Felix Giovanni Virgo, Chenhui Chu, Takaya Ogawa, Koji Tanaka, Kazuki Ashihara, 中島悠太, 武村紀子, 長原一, Takao Fujikawa

Anonymous identity sampling and reusable synthesis for sensitive face camouflage

An increasing amount of face images are being captured, shared, or applied in various applications. These images usually contain lots …

Zhenzhong Kuang, Longbin Teng, Xingchi He, Jiajun Ding, 中島悠太, Noboru Babaguchi

Tone Classification for Political Advertising Video using Multimodal Cues

Politics has always gotten much attention throughout history, and video advertisement has become one of the most essential tools for …

Anh-Khoa Vo, 中島悠太

Multi-label disengagement and behavior prediction in online learning

Student disengagement prediction in online learning environments is beneficial in various ways, especially to help provide timely cues …

Manisha Verma, 中島悠太, 武村紀子, 長原一

Match them up: visually explainable few-shot image classification

Few-shot learning (FSL) approaches, mostly neural network-based, assume that pre-trained knowledge can be obtained from base (seen) …

Bowen Wang, Liangzhi Li, Manisha Verma, 中島悠太, Ryo Kawasaki, 長原一

Integration of gesture generation system using gesture library with DIY robot design kit

Conversational agents are expected to improve the quality of communication by adding gestures to the speech, and are considered to be a …

Hitoshi Teshima, Naoki Wake, Diego Thomas, 中島悠太, David Baumert, Hiroshi Kawasaki, Katsushi Ikeuchi

ICDAR'22: Intelligent Cross-Data Analysis and Retrieval

We have witnessed the rise of cross-data against multimodal data problems recently. The cross-modal retrieval system uses a textual …

Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Cathal Gurrin, 中島悠太, Mianxiong Dong

Human–Machine Interfaces Based on Bioelectric Signals: A Narrative Review with a Novel System Proposal

Bioelectric signals such as electromyogram (EMG) and electroencephalogram (EEG) reflect human internal states and intended actions, and …

早志英朗, Toshio Tsuji

Emotional Intensity Estimation based on Writer’s Personality

We propose a method for personalized emotional intensity estimation based on a writer’s personality test for Japanese SNS posts. …

Haruya Suzuki, Sora Tarumoto, 梶原智之, Takashi Ninomiya, 中島悠太, 長原一

Depthwise spatio-temporal STFT convolutional neural networks for human action recognition

Conventional 3D convolutional neural networks (CNNs) are computationally expensive, memory intensive, prone to overfitting, and most …

Sudhakar Kumawat, Manisha Verma, 中島悠太, Shanmuganathan Raman

Deep Gesture Generation for Social Robots Using Type-Specific Libraries

Body language such as conversational gesture is a powerful way to ease communication. Conversational gestures do not only make a speech …

Hitoshi Teshima, Naoki Wake, Diego Thomas, 中島悠太, Hiroshi Kawasaki, Katsushi Ikeuchi

Deep Bayesian Active Learning to Rank for Endoscopic Image Data

Automatic image-based disease severity estimation generally uses discrete (i.e., quantized) severity labels. Annotating discrete labels …

Takeaki Kadota, 早志英朗, Ryoma Bise, Kiyohito Tanaka, Seiichi Uchida

Corpus Construction for Historical Newspapers: A Case Study on Public Meeting Corpus Construction Using OCR Error Correction

Koji Tanaka, Chenhui Chu, 梶原智之, 中島悠太, 武村紀子, 長原一, Takao Fujikawa

The semantic typology of visually grounded paraphrases

Visually grounded paraphrases (VGPs) are different phrasal expressions describing the same visual concept in an image. Previous studies …

Chenhui Chu, Vinicius Oliveira, Felix Giovanni Virgo, Mayu Otani, Noa Garcia, 中島悠太

Transferring domain-agnostic knowledge in video question answering

Tianran Wu, Noa Garcia, Mayu Otani, Chenhui Chu, 中島悠太, Haruo Takemura

SCOUTER: Slot attention-based classifier for explainable image recognition

Explainable artificial intelligence has been gaining attention in the past few years. However, most existing methods are based on …

Liangzhi Li, Bowen Wang, Manisha Verma, 中島悠太, Ryo Kawasaki, 長原一

Image Retrieval by Hierarchy-aware Deep Hashing Based on Multi-task Learning

Deep hashing has been widely used to approximate nearest-neighbor search for image retrieval tasks. Most of them are trained with …

Bowen Wang, Liangzhi Li, 中島悠太, Takehiro Yamamoto, Hiroaki Ohshima, Yoshiyuki Shoji, Kenro Aihara, Noriko Kando

GCNBoost: Artwork Classification by Label Propagation Through a Knowledge Graph

Video question answering (VideoQA) is designed to answer a given question based on a relevant video clip. The current available …

Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, 中島悠太, 長原一

Explain me the painting: Multi-topic knowledgeable art description generation

Have you ever looked at a painting and wondered what is the story behind it? This work presents a framework to bring art closer to …

Zechen Bai, 中島悠太, Noa Garcia

Built year prediction from Buddha face with heterogeneous labels

Buddha statues are a part of human culture, especially of the Asia area, and they have been alongside human civilisation for more than …

Yiming Qian, Cheikh Brahim El Vaigh, 中島悠太, Benjamin Renoust, 長原一, Yutaka Fujioka

PoseRN: A 2D pose refinement network for bias-free multi-view 3D human pose estimation

We propose a new 2D pose refinement network that learns to predict the human bias in the estimated 2D pose. There are biases in 2D pose …

Akihiko Sayo, Diego Thomas, Hiroshi Kawasaki, 中島悠太, Katsushi Ikeuchi

Museum Experience into a Souvenir: Generating Memorable Postcards from Guide Device Behavior Log

This paper proposes a method for automatically generating postcards that reflect each visitor’s museum experience by analyzing …

Yoshiyuki Shoji, Kenro Aihara, Noriko Kando, 中島悠太, Hiroaki Ohshima, Shio Takidaira, Masaki Ueta, Takehiro Yamamoto, Yusuke Yamamoto

Learners' efficiency prediction using facial behavior analysis

In the e-learning context, how much the learner is concentrated and engaged, or the learners’ efficiency, is essential for …

Manisha Verma, 中島悠太, Hirokazu Kobori, Ryota Takaoka, 武村紀子, Tsukasa Kimura, 長原一, Masayuki Numao, Kazumitsu Shinohara

Attending self-attention: A case study of visually grounded supervision in vision-and-language transformers

The impressive performances of pre-trained visually grounded language models have motivated a growing body of research investigating …

Jules Samaran, Noa Garcia, Mayu Otani, Chenhui Chu, 中島悠太

A comparative study of language Transformers for video question answering

With the goal of correctly answering questions about images or videos, visual question answering (VQA) has quickly developed in recent …

Zekun Yang, Noa Garcia, Chenhui Chu, Mayu Otani, 中島悠太, Haruo Takemura

WRIME: A new dataset for emotional intensity estimation with subjective and objective annotations

We annotate 17,000 SNS posts with both the writer’s subjective emotional intensity and the reader’s objective one to construct a …

梶原智之, Chenhui Chu, 武村紀子, 中島悠太, 長原一

MTUNet: Few-shot image classification with visual explanations

Few-shot learning (FSL) approaches, mostly neural network-based, are assuming that the pre-trained knowledge can be obtained from base …

Bowen Wang, Liangzhi Li, Manisha Verma, 中島悠太, Ryo Kawasaki, 長原一

Noisy-LSTM: Improving temporal awareness for video semantic segmentation

Semantic video segmentation is a key challenge for various applications. This paper presents a new model named Noisy-LSTM, which is …

Bowen Wang, Liangzhi Li, 中島悠太, Ryo Kawasaki, 長原一, Yasushi Yagi

The laughing machine: Predicting humor in video

Humor is a very important communication tool; yet, it is an open problem for machines to understand humor. In this paper, we build a …

Yuta Kayatani, Zekun Yang, Mayu Otani, Noa Garcia, Chenhui Chu, 中島悠太, Haruo Takemura

Preventing fake information generation against media clone attacks

Fake media has been spreading due to remarkable advances in media processing and machine learning technologies, causing serious problems …

Noboru Babaguchi, Isao Echizen, Junichi Yamagishi, Naoko Nitta, 中島悠太, Kazuaki Nakamura, Kazuhiro Kono, Seiko Myojin, Fuming Fand, Zhenzhong Kuang, Huy H Nguyen, Ngoc-Dung T Tieu

Generation and detection of media clones

With the spread of high-performance sensors and social network services (SNS) and the remarkable advances in machine learning …

Isao Echizen, Noboru Babaguchi, Junichi Yamagishi, Naoko Nitta, 中島悠太, Kazuaki Nakamura, Kazuhiro Kono, Fuming Fand, Seiko Myojin, Zhenzhong Kuang, Huy H Nguyen, Ngoc-Dung T Tieu

CFA Handling and Quality Analysis for Compressive Light Field Camera

A light field can carry rich visual information of a real 3-D scene, leading to many attractive applications. However, the acquisition …

Kohei Sakai, Yasutaka Inagaki, Keita Takahashi, Toshiaki Fujii, 長原一

Cross-lingual visual grounding

Visual grounding is a vision and language understanding task aiming at locating a region in an image according to a specific query …

Wenjian Dong, Mayu Otani, Noa Garcia, 中島悠太, Chenhui Chu

IDSOU at WNUT-2020 Task 2: Identification of informative COVID-19 English tweets

We introduce the IDSOU submission for the WNUT-2020 task 2: identification of informative COVID-19 English Tweets. Our system is an …

Sora Ohashi, 梶原智之, Chenhui Chu, 武村紀子, 中島悠太, 長原一

Improving topic modeling through homophily for legal documents

Topic modeling that can automatically assign topics to legal documents is very important in the domain of computational law. The …

Kazuki Ashihara, Cheikh Brahim El Vaigh, Chenhui Chu, Benjamin Renoust, Noriko Okubo, 武村紀子, 中島悠太, 長原一

Following Embryonic Stem Cells, Their Differentiated Progeny, and Cell-State Changes During iPS Reprogramming by Raman Spectroscopy

Monitoring cell-state transition in pluripotent cells is invaluable for application and basic research. In this study, we demonstrate …

Arno Germond, Yulia Panina, Mikio Shiga, 新岡宏彦, Tomonobu M. Watanabe

Diagnostic performance for pulmonary adenocarcinoma on CT: comparison of radiologists with and without three-dimensional convolutional neural network

Objectives: To compare diagnostic performance for pulmonary invasive adenocarcinoma among radiologists with and without …

Masahiro Yanagawa, 新岡宏彦, Masahiko Kusumoto, Kazuo Awai, Mitsuko Tsubamoto, Yukihisa Satoh, Tomo Miyata, Yuriko Yoshida, Noriko Kikuchi, Akinori Hata, Shohei Yamasaki, Shoji Kido, 長原一, Jun Miyake, Noriyuki Tomiyama

Visually grounded paraphrase identification via gating and phrase localization

Visually grounded paraphrases (VGPs) describe the same visual concept but in different wording. Previous studies have developed models …

Mayu Otani, Chenhui Chu, 中島悠太

Red-Fluorescent Pt Nanoclusters for Detecting and Imaging HER2 in Breast Cancer Cells

Overexpression of human epidermal growth factor receptor 2 (HER2) is associated with more frequent cancer recurrence and metastasis. …

Shin-ichi Tanaka, Hiroki Wadati, Kazuhisa Sato, Hidehiro Yasuda, 新岡宏彦

Improvement of nerve imaging speed with coherent anti-Stokes Raman scattering rigid endoscope using deep-learning noise reduction

A coherent anti-Stokes Raman scattering (CARS) rigid endoscope was developed to visualize peripheral nerves without labeling for …

Naoki Yamato, 新岡宏彦, Jun Miyake, Mamoru Hashimoto

YOLO in the Dark - Domain adaptation method for merging multiple models -

Generating models to handle new visual tasks requires additional datasets, which take considerable effort to create. We propose a …

Yukihiro Sasagawa, 長原一

Knowledge-based video question answering with unsupervised scene descriptions

To understand movies, humans constantly reason over the dialogues and actions shown in specific scenes and relate them to the overall …

Noa Garcia, 中島悠太

Demographic influences on contemporary art with unsupervised style embeddings

Computational art analysis has, through its reliance on classification tasks, prioritised historical datasets in which the artworks are …

Nikolai Huckle, Noa Garcia, 中島悠太

Acquiring dynamic light fields through coded aperture camera

We investigate the problem of compressive acquisition of a dynamic light field. A promising solution for compressive light field …

Kohei Sakai, Keita Takahashi, Toshiaki Fujii, 長原一

Nerve segmentation with deep learning from label-free endoscopic images obtained using coherent anti-Stokes Raman scattering

Semantic segmentation with deep learning to extract nerves from label-free endoscopic images obtained using coherent anti-Stokes Raman …

Naoki Yamato, Mana Matsuya, 新岡宏彦, Jun Miyake, Mamoru Hashimoto

公開集会記事からの情報抽出 (Information Extraction from Public Meeting Articles)

田中昂志, 芦原和樹, Chenhui Chu, 中島悠太, 武村紀子, 長原一, 藤川隆男

OCR誤り訂正を用いた歴史新聞データからのコーパス構築 (Corpus Construction from Historical Newspaper Data Using OCR Error Correction)

田中昂志, Chenhui Chu, 梶原智之, 中島悠太, 武村紀子, 長原一, 藤川隆男

Constructing a public meeting corpus

In this paper, we propose a method for constructing a large corpus about a century of public meetings in historical Australian …

Yoga-82: a new dataset for fine-grained classification of human poses

Human pose estimation is a well-known problem in computer vision to locate joint positions. Existing datasets for the learning of poses …

Manisha Verma, Sudhakar Kumawat, 中島悠太, Shanmuganathan Raman

Convolutional Neural Network Can Recognize Drug Resistance of Single Cancer Cells

It is known that single or isolated tumor cells enter cancer patients’ circulatory systems. These circulating …

Kiminori Yanagisawa, Masayasu Toratani, Ayumu Asai, Masamitsu Konno, 新岡宏彦, Tsunekazu Mizushima, Taroh Satoh, Jun Miyake, Kazuhiko Ogawa, Andrea Vecchione, Yuichiro Doki, Hidetoshi Eguchi, Hideshi Ishii

Detecting learner drowsiness based on facial expressions and head movements in online courses

Drowsiness is a major factor that hinders learning. To improve learning efficiency, it is important to understand students’ …

Shogo Terai, Shizuka Shirai, Mehrasa Alizadeh, Ryosuke Kawamura, 武村紀子, Yuki Uranishi, Haruo Takemura, 長原一

KnowIT VQA: Answering knowledge-based questions about videos

We propose a novel video understanding task by fusing knowledge-based and video question answering. First, we introduce KnowIT VQA, a …

Noa Garcia, Mayu Otani, Chenhui Chu, 中島悠太

Warmer Environments Increase Implicit Mental Workload Even If Learning Efficiency Is Enhanced

T. Kimura, N. Takemura, Y. Nakashima, H. Kobori, H. Nagahara, M. Numao, K. Shinohara

Speech-driven face reenactment for a video sequence

We present a system for reenacting a person’s face driven by speech. Given a video sequence with the corresponding audio track of …

中島悠太, Takaaki Yasui, Leon Nguyen, Noboru Babaguchi

Joint learning of vessel segmentation and artery/vein classification with post-processing

Retinal imaging serves as a valuable tool for diagnosis of various diseases. However, reading retinal images is a difficult and …

Liangzhi Li, Manisha Verma, 中島悠太, Ryo Kawasaki, 長原一

IterNet: retinal image segmentation utilizing structural redundancy in vessel networks

Retinal vessel segmentation is of great interest for diagnosis of retinal vascular diseases. To further improve the performance of …

Liangzhi Li, Manisha Verma, 中島悠太, 長原一, Ryo Kawasaki

ContextNet: representation and exploration for painting classification and retrieval in context

Noa Garcia, Benjamin Renoust, 中島悠太

BERT representations for video question answering

Visual question answering (VQA) aims at answering questions about the visual content of an image or a video. Currently, most work on …

Zekun Yang, Noa Garcia, Chenhui Chu, Mayu Otani, 中島悠太, Haruo Takemura

Action recognition from a single coded image

Cameras are prevalent in society at the present time, for example, surveillance cameras, and smartphones equipped with cameras and …

Tadashi Okawara, 吉田道隆, 長原一, Yasushi Yagi

5D Light Field Synthesis from a Monocular Video

Commercially available light field cameras have difficulty in capturing 5D (4D + time) light field videos. They can only capture still …

Kyuho Bae, Andre Ivan, 長原一, In Kyu Park

3D Image Reconstruction from Multi-focus Microscopic Images

This paper presents a method for reconstructing 3D image from multi-focus microscopic images captured with different focuses. We model …

Takahiro Yamaguchi, 長原一, Ken'ichi Morooka, 中島悠太, Yuki Uranishi, Shoko Miyauchi, Ryo Kurazume

歴史研究におけるビッグデータの活用－オーストラリアを中心に (Using Big Data in Historical Research: Focusing on Australia)

藤川隆男, Chenhui Chu, 梶原智之, 長原一

Reflectance and Shape Estimation with a Light Field Camera Under Natural Illumination

Reflectance and shape are two important components in visually perceiving the real world. Inferring the reflectance and shape of an …

Thanh Trung Ngo, 長原一, Ko Nishino, Rin Ichiro Taniguchi, Yasushi Yagi

Public meeting corpus construction and content delivery

Chenhui Chu, Koji Tanaka, Haolin Ren, Benjamin Renoust, 中島悠太, 武村紀子, 長原一, Takao Fujikawa

Deep-UV excitation fluorescence microscopy for detection of lymph node metastasis using deep neural network

Tatsuya Matsumoto, 新岡宏彦, Yasuaki Kumamoto, Junya Sato, Osamu Inamori, Ryuta Nakao, Yoshinori Harada, Eiichi Konishi, Eigo Otsuji, Hideo Tanaka, Jun Miyake, Tetsuro Takamatsu

Contextualized multi-sense word embedding

Currently, distributed word representations are employed in many natural language processing tasks. However, when generating one …

Kazuki Ashihara, 梶原智之, Yuki Arase, Satoru Uchida

Legal information as a complex network: Improving topic modeling through homophily

Topic modeling is a key component to computational legal science. Network analysis is also very important to further understand the …

Kazuki Ashihara, Chenhui Chu, Benjamin Renoust, Noriko Okubo, 武村紀子, 中島悠太, 長原一

Human shape reconstruction with loose clothes from partially observed data by pose specific deformation

Reconstructing the entire body of moving human in a computer is important for various applications, such as tele-presence, virtual …

Akihiko Sayo, Hayato Onizuka, Diego Thomas, 中島悠太, Hiroshi Kawasaki, Katsushi Ikeuchi

Deep compressive sensing for visual privacy protection in flatcam imaging

Detection followed by projection in conventional privacy cameras is vulnerable to software attacks that threaten to expose image sensor …

Thuong Nguyen Canh, 長原一

Metric for automatic machine translation evaluation based on pre-trained sentence embeddings

This study describes a segment-level metric for automatic machine translation evaluation (MTE). Although various MTE metrics have been …

Hiroki Shimanaka, 梶原智之, Mamoru Komachi

A 3-D Display Pipeline from Coded-Aperture Camera to Tensor Light-Field Display Through CNN

We propose an efficient pipeline from input to output for a tensor light-field display. Conventionally, a dense light field (i.e., tens …

Keita Maruyama, Yasutaka Inagaki, Keita Takahashi, Toshiaki Fujii, 長原一

Excitation of erbium-doped nanoparticles in 1550-nm wavelength region for deep tissue imaging with reduced degradation of spatial resolution

Masahito Yamanaka, 新岡宏彦, Taichi Furukawa, Norihiko Nishizawa

Application of deep learning (3-dimensional convolutional neural network) for the prediction of pathological invasiveness in lung adenocarcinoma

Masahiro Yanagawa, 新岡宏彦, Akinori Hata, Noriko Kikuchi, Osamu Honda, Hiroyuki Kurakami, Eiichi Morii, Masayuki Noguchi, Yoshiyuki Watanabe, Jun Miyake, Noriyuki Tomiyama

歴史新聞データからのコーパス構築 (Corpus Construction from Historical Newspaper Data)

田中昂志, Chenhui Chu, 中島悠太, 武村紀子, 長原一, 藤川隆男

Multimodal learning analytics: Society 5.0 project in Japan

Shizuka Shirai, 武村紀子, 中島悠太, 長原一, Haruo Takemura

Fall detection using optical level anonymous image sensing system

Fall is one of the leading causes of injury for the elderly individuals. Systems that automatically detect falls can significantly …

Chao Ma, Atsushi Shimada, Hideaki Uchiyama, 長原一, Rin Ichiro Taniguchi

Video meets knowledge in visual question answering

In this work, we address knowledge-based visual question answering in videos. First, we introduce KnowIT VQA, a video dataset with …

Noa Garcia, Chenhui Chu, Mayu Otani, 中島悠太

Rethinking the evaluation of video summaries

Video summarization is a technique to create a short skim of the original video while preserving the main stories/content. There exists …

Mayu Otani, 中島悠太, Esa Rahtu, Janne Heikkilä

Negative lexically constrained decoding for paraphrase generation

Paraphrase generation can be regarded as monolingual translation. Unlike bilingual machine translation, paraphrase generation rewrites …

梶原智之

Historical and modern features for Buddha statue classification

B. Renoust, M.O. Franca, J. Chan, N. Garcia, V. Le, A. Uesaka, Y. Nakashima, H. Nagahara, J. Wang, Y. Fujioka

High-Speed Imaging Using CMOS Image Sensor With Quasi Pixel-Wise Exposure

Several recent studies on compressive video sensing realized scene capture beyond the fundamental trade-off limit between spatial …

吉田道隆, Toshiki Sonoda, 長原一, Kenta Endo, Yukinobu Sugiyama, Rin Ichiro Taniguchi

Facial expression recognition with skip-connection to leverage low-level features

Deep convolutional neural networks (CNNs) have established their feet in the ground of computer vision and machine learning, used in …

Manisha Verma, Hirokazu Kobori, 中島悠太, 武村紀子, 長原一

Efficacy of Novel Multispectral Imaging Device to Determine Anastomosis for Esophagogastrostomy

R. Tsutsumi, T. Ikeda, H. Nagahara, H. Saeki, Y. Nakashima, E. Oki, Y. Maehara, M. Hashizume

Controllable text simplification with lexical constraint loss

We propose a method to control the level of a sentence in a text simplification task. Text simplification is a monolingual translation …

Daiki Nishihara, Tomoyuki Kajiwara, Yuki Arase

Contextualized context2vec

Lexical substitution ranks substitution candidates from the viewpoint of paraphrasability for a target word in a given sentence. There …

Kazuki Ashihara, 梶原智之, Yuki Arase, Satoru Uchida

Context-aware embeddings for automatic art analysis

Noa Garcia, Benjamin Renoust, 中島悠太

Buda.art: A multimodal content-based analysis and retrieval system for Buddha statues

Benjamin Renoust, Matheus Oliveira Franca, Jacob Chan, Van Le, Ayaka Uesaka, 中島悠太, 長原一, Jueren Wang, Yutaka Fujioka

A Coded Aperture for Watermark Extraction from Defocused Images

H. Hamasaki, S. Takeshita, K. Nakai, T. Sonoda, H. Kawasaki, H. Nagahara, S. Ono

Space-time-brightness sampling using an adaptive pixel-wise coded exposure

Most conventional digital video cameras face a fundamental trade-off between spatial resolution, temporal resolution and dynamic range …

長原一, Dengyu Liu, Toshiki Sonoda, Jinwei Gu

Representing a partially observed non-rigid 3D human using eigen-texture and eigen-deformation

Reconstruction of the shape and motion of humans from RGB-D is a challenging problem, receiving much attention in recent years. Recent …

Ryosuke Kimura, Akihiko Sayo, Fabian Lorenzo Dayrit, 中島悠太, Hiroshi Kawasaki, Ambrosio Blanco, Katsushi Ikeuchi

Finding important people in a video using deep neural networks with conditional random fields

Finding important regions is essential for applications, such as content-aware video compression and video retargeting to automatically …

Mayu Otani, Atsushi Nishida, 中島悠太, Tomokazu Sato, Naokazu Yokoya

Invited Article: Label-free nerve imaging with a coherent anti-Stokes Raman scattering rigid endoscope using two optical fibers for laser delivery

Keigo Hirose, Shuichiro Fukushima, Taichi Furukawa, 新岡宏彦, Mamoru Hashimoto

Designing coded aperture camera based on PCA and NMF for light field acquisition

A light field, which is often understood as a set of dense multi-view images, has been utilized in various 2D/3D applications. …

Yusuke Yagi, Keita Takahashi, Toshiaki Fujii, Toshiki Sonoda, 長原一

Summarization of user-generated sports video by using deep action recognition features

Automatically generating a summary of a sports video poses the challenge of detecting interesting moments, or highlights, of a game. …

Antonio Tejero-De-Pablos, 中島悠太, Tomokazu Sato, Naokazu Yokoya, Marko Linna, Esa Rahtu

Iterative applications of image completion with CNN-based failure detection

Image completion is a technique to fill missing regions in a damaged or redacted image. A patch-based approach is one of major …

Takahiro Tanaka, Norihiko Kawai, 中島悠太, Tomokazu Sato, Naokazu Yokoya

iParaphrasing: Extracting visually grounded paraphrases via an image

A paraphrase is a restatement of the meaning of a text in other words. Paraphrases have been studied to enhance the performance of many …

Chenhui Chu, Mayu Otani, 中島悠太

PCA-coded aperture for light field photography

A light field, which is often understood as a set of dense multi-view images, has been utilized in various 2D/3D applications. …

Yusuke Yagi, Keita Takahashi, Toshiaki Fujii, Toshiki Sonoda, 長原一

Visually grounded paraphrase extraction

Chenhui Chu, Mayu Otani, 中島悠太

The dynamic photometric stereo method using a multi-tap CMOS image sensor

The photometric stereo method enables estimation of surface normals from images that have been captured using different but known …

T. Yoda, H. Nagahara, R.-I. Taniguchi, K. Kagawa, K. Yasutomi, S. Kawahito

RUSE: Regressor using sentence embeddings for automatic machine translation evaluation

We introduce the RUSE metric for the WMT18 metrics shared task. Sentence embeddings can capture global information that cannot be …

Hiroki Shimanaka, 梶原智之, Mamoru Komachi

Metric for automatic machine translation evaluation based on universal sentence representations

Sentence representations can capture a wide range of information that cannot be captured by local features based on character or word …

Hiroki Shimanaka, 梶原智之, Mamoru Komachi

Learning to capture light fields through a coded aperture camera

We propose a learning-based framework for acquiring a light field through a coded aperture camera. Acquiring a light field is a …

Yasutaka Inagaki, Yuto Kobayashi, Keita Takahashi, Toshiaki Fujii, 長原一

Joint optimization for compressive video sensing and reconstruction under hardware constraints

Compressive video sensing is the process of encoding multiple sub-frames into a single frame with controlled sensor exposures and …

吉田道隆, Akihiko Torii, Masatoshi Okutomi, Kenta Endo, Yukinobu Sugiyama, Rin Ichiro Taniguchi, 長原一

Graphical classification of DNA sequences of HLA alleles by deep learning

J. Miyake, Y. Kaneshita, S. Asatani, S. Tagawa, H. Niioka, T. Hirano

Complex word identification based on frequency in a learner corpus

We introduce the TMU systems for the Complex Word Identification (CWI) Shared Task 2018. TMU systems use random forest classifiers and …

梶原智之, Mamoru Komachi

Coherent anti-Stokes Raman scattering rigid endoscope toward robot-assisted surgery

K. Hirose, T. Aoki, T. Furukawa, S. Fukushima, H. Niioka, S. Deguchi, M. Hashimoto

Adapting local features for face detection in thermal image

A thermal camera captures the temperature distribution of a scene as a thermal image. In thermal images, facial appearances of …

Chao Ma, Ngo Thanh Trung, Hideaki Uchiyama, 長原一, Atsushi Shimada, Rin Ichiro Taniguchi

Augmented reality marker hiding with texture deformation

Augmented reality (AR) marker hiding is a technique to visually remove AR markers in a real-time video stream. A conventional approach …

Norihiko Kawai, Tomokazu Sato, 中島悠太, Naokazu Yokoya

Adaptive background model registration for moving cameras

We propose a framework for adaptively registering background models with an image for background subtraction with moving cameras. …

Tsubasa Minematsu, Hideaki Uchiyama, Atsushi Shimada, 長原一, Rin Ichiro Taniguchi

Novel view synthesis with light-weight view-dependent texture mapping for a stereoscopic HMD

The proliferation of off-the-shelf head-mounted displays (HMDs) let end-users enjoy virtual reality applications, some of which render …

Thiwat Rongsirigul, 中島悠太, Tomokazu Sato, Naokazu Yokoya

Video summarization using textual descriptions for authoring video blogs

Authoring video blogs requires a video editing process, which is cumbersome for ordinary users. Video summarization can automate this …

Mayu Otani, 中島悠太, Tomokazu Sato, Naokazu Yokoya

Hyperspectral imaging using flickerless active LED illumination

Makoto Ohsaki, 長原一, Tetsuo Ikeda, Rin Ichiro Taniguchi

Video question answering to find a desired video segment

Mayu Otani, 中島悠太, Esa Rahtu, Janne Heikkilä

Unsupervised Video Summarization using Deep Video Features

Mayu Otani, 中島悠太, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

ReMagicMirror: Action learning using human reenactment with the mirror metaphor

We propose ReMagicMirror, a system to help people learn actions (e.g., martial arts, dances). We first capture the motions of a teacher …

Fabian Lorenzo Dayrit, Ryosuke Kimura, 中島悠太, Ambrosio Blanco, Hiroshi Kawasaki, Katsushi Ikeuchi, Tomokazu Sato, Naokazu Yokoya

Realtime novel view synthesis with eigen-texture regression

Realtime novel view synthesis, which generates a novel view of a real object or scene in realtime, enjoys a wide range of applications …

中島悠太, Fumio Okura, Norihiko Kawai, Hiroshi Kawasaki, Ambrosio Blanco, Katsushi Ikeuchi

Mixed features for face detection in thermal image

C. Ma, N.T. Trung, H. Uchiyama, H. Nagahara, A. Shimada, R.-I. Taniguchi

Incremental structural modeling on sparse visual SLAM

R. Roberto, H. Uchiyama, J.P. Lima, H. Nagahara, R.-I. Taniguchi, V. Teichrieb

Increasing pose comprehension through augmented reality reenactment

Standard video does not capture the 3D aspect of human motion, which is important for comprehension of motion that may be ambiguous. In …

Fabian Lorenzo Dayrit, 中島悠太, Tomokazu Sato, Naokazu Yokoya

Fine-grained video retrieval for multi-clip video

Mayu Otani, 中島悠太, Esa Rahtu, Janne Heikkilä

Classification of C2C12 cells at differentiation by convolutional neural network of deep learning using phase contrast images

H. Niioka, S. Asatani, A. Yoshimura, H. Ohigashi, S. Tagawa, J. Miyake

High-speed imaging using CMOS image sensor with quasi pixel-wise exposure

Several recent studies in compressive video sensing have realized scene capture beyond the fundamental trade-off limit between spatial …

長原一, Toshiki Sonoda, Kenta Endo, Yukinobu Sugiyama, Rin Ichiro Taniguchi

Dynamic photometric stereo method using multi-tap CMOS image sensor

Photometric stereo enables the estimation of surface normals from images that were captured using different known lighting directions. …

Takuya Yoda, 長原一, Rin Ichiro Taniguchi, Keiichiro Kagawa, Keita Yasutomi, Shoji Kawahito