Abstract of Articles
Please e-mail akisato <at> eye brl ntt co jp if you would like to obtain the articles listed below.

Ukrit Watchareeruetai, Akisato Kimura, Robert Cheng Bao, Takahito Kawanishi, Kunio Kashino
"Interest point detection via stochastically derived stability,"
IPSJ Transactions on Computer Vision and Applications, Vol.3, pp.189-197, December 2011.
[ PDF (preprint) ] [ DOI link ] [ copyright notice ]

Abstract
We propose a novel framework called StochasticSIFT for detecting interest points (IPs) in video sequences. The proposed framework incorporates a stochastic model considering the temporal dynamics of videos into the SIFT detector to improve robustness against fluctuations inherent to video signals. Instead of detecting IPs and then removing unstable or inconsistent IP candidates, we introduce IP stability derived from a stochastic model of inherent fluctuations to detect more stable IPs. The experimental results show that the proposed IP detector outperforms the SIFT detector in terms of repeatability and matching rates.

Takuho Nakano, Akisato Kimura, Hirokazu Kameoka, Shigeki Sagayama, Shigeki Miyabe, Nobutaka Ono, Kunio Kashino, Takuya Nishimoto
"Automatic video annotation via hierarchical topic trajectory model considering cross-modal correlations,"
Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2011),
pp.2380--2383, Prague, Czech Republic, May 2011.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
We propose a new statistical model, named Hierarchical Topic Trajectory Model (HTTM), for acquiring a dynamically changing topic model that represents the relationship between video frames and associated text labels. Model parameter estimation, annotation and retrieval can be executed within a unified framework with little computation. It is also easy to add new modalities such as audio signals and geotags. Preliminary experiments on a video annotation task with a manually annotated video dataset indicate that our proposed method can improve the annotation accuracy.

Jun Takagi, Yasunori Ohishi, Akisato Kimura, Masashi Sugiyama, Makoto Yamada, Hirokazu Kameoka
"Automatic audio tag classification via semi-supervised canonical density estimation,"
Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2011),
pp.2232--2235, Prague, Czech Republic, May 2011.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
We propose a novel semi-supervised method for building a statistical model that represents the relationship between sounds and text labels ("tags"). The proposed method, named semi-supervised canonical density estimation, makes use of unlabeled sound data in two ways: 1) a low-dimensional latent space representing topics of sounds is extracted by a semi-supervised variant of canonical correlation analysis, and 2) topic models are learned by a multi-class extension of semi-supervised kernel density estimation in the topic space. Real-world audio tagging experiments indicate that our proposed method improves the accuracy even when only a small number of labeled sounds are available.

Akisato Kimura, Hirokazu Kameoka, Kunio Kashino
"Media Scene Learning: A framework for extracting meaningful parts from audio and video signals",
NTT Technical Review, Vol.8, No.11, pp.1-7, November 2010.
[ PDF ]

Abstract
We describe a novel framework called Media Scene Learning (MSL) for automatically extracting key components such as the sound of a single instrument from a given audio signal or a target object from a given video signal. In particular, we introduce two key methods: 1) the Composite Auto-Regressive System (CARS) for decomposing audio signals into several sound components on the basis of a generative model of sounds and 2) Saliency-Based Image Learning (SBIL) for extracting object-like regions from a given video signal on the basis of the characteristics of the human visual system.

Takuya Maekawa, Akisato Kimura, Hitoshi Sakano
"Wearable sensor device for automatic recording of hand drawings,"
presented in a demo session, Asian Conference on Computer Vision (ACCV2010),
Queenstown, New Zealand, November 2010.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
Drawing and writing are two of the most important human activities when it comes to recording events and information. Needless to say, the digitization of hand-drawn paper is important, and thus many products and methods for capturing hand drawings have been developed. However, many of these methods require a special pen, paper, and/or apparatus. Thus, when we want to capture hand drawings with these methods, we have to capture the drawings actively, e.g., by preparing special pens and paper. In this work, we try to capture automatically all the hand drawings found in our daily lives without any explicit action by the user. Recent advances in sensing technology enable us to record our daily life data anywhere and at any time using small always-on wearable sensors. Our aim is to capture hand drawings automatically with an always-on wearable sensor device equipped with a camera.

Kazuma Akamine, Ken Fukuchi, Akisato Kimura, Shigeru Takagi
"Fully automatic extraction of salient regions in near real-time,"
The Computer Journal, November 2010.
[ DOI link ] [ copyright notice ]

Abstract
Automatic video segmentation plays an important role in a wide range of computer vision and image processing applications. Recently, various methods have been proposed for this purpose. The problem is that most of these methods are far from real-time processing, even for low-resolution videos, because of their complex procedures. To address this problem, we propose a new and quite fast method for automatic video segmentation with the help of (1) efficient optimization of Markov random fields in time polynomial in the number of pixels by introducing graph cuts, (2) automatic, computationally efficient but stable derivation of segmentation priors using visual saliency and a sequential update mechanism, and (3) an implementation strategy based on the principle of stream processing with graphics processing units. Test results indicate that our method extracts appropriate regions from videos as precisely as, and much faster than, previous semi-automatic methods even though no supervision is incorporated.

Gurbachan Sekhon, Akisato Kimura, Yasuhiro Minami, Hitoshi Sakano, Eisaku Maeda
"Action planning for interactive visual scene understanding based on knowledge confidence in latent spaces,"
IEICE Technical Report (domestic),
PRMU2010-83 (IBISML2010-5), Fukuoka, Japan, September 2010.
[ presentation material ] [ copyright notice ]

Abstract
This report proposes a method for action planning in interactive visual scene understanding through the use of knowledge confidence generated from a latent space of a topic model connecting image features and text labels. We then use information, within the latent space, about the position of an input sample relative to training samples in order to simulate knowledge confidence. Coupled with this, we also use the overall associativity between each text label as determined by the content of the training samples to determine the knowledge confidence.

Akisato Kimura, Hirokazu Kameoka, Masashi Sugiyama, Takuho Nakano, Eisaku Maeda, Hitoshi Sakano, Katsuhiko Ishiguro
"SemiCCA: Efficient semi-supervised learning of canonical correlations,"
Proc. IAPR International Conference on Pattern Recognition (ICPR2010),
pp.2933--2936, Istanbul, Turkey, August 2010.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
Canonical correlation analysis (CCA) is a powerful tool for analyzing multi-dimensional paired data. However, CCA tends to perform poorly when the number of paired samples is limited, which is often the case in practice. To cope with this problem, we propose a semi-supervised variant of CCA named SemiCCA that allows us to incorporate additional unpaired samples for mitigating overfitting. The proposed method smoothly bridges the eigenvalue problems of CCA and principal component analysis (PCA), and thus its solution can be computed efficiently just by solving a single eigenvalue problem, as in the original CCA.
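
As a rough illustration of what "bridging the eigenvalue problems of CCA and PCA" can look like, the following math sketch writes CCA as a generalized eigenvalue problem and then interpolates toward PCA. The trade-off parameter \beta, the superscript (P) marking covariances estimated from paired samples only, and the exact form of the second equation are assumptions made for this sketch and are not stated in the abstract; the published formulation may differ in detail.

```latex
% CCA on paired samples, written as a generalized eigenvalue problem
\begin{pmatrix} 0 & S_{xy}^{(P)} \\ S_{yx}^{(P)} & 0 \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \end{pmatrix}
= \lambda
\begin{pmatrix} S_{xx}^{(P)} & 0 \\ 0 & S_{yy}^{(P)} \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \end{pmatrix}

% Assumed bridged problem, with 0 \le \beta \le 1;
% S_{xx}, S_{yy} (no superscript) use all samples, paired and unpaired
\left[
  \beta \begin{pmatrix} 0 & S_{xy}^{(P)} \\ S_{yx}^{(P)} & 0 \end{pmatrix}
  + (1-\beta) \begin{pmatrix} S_{xx} & 0 \\ 0 & S_{yy} \end{pmatrix}
\right]
\begin{pmatrix} w_x \\ w_y \end{pmatrix}
= \lambda
\left[
  \beta \begin{pmatrix} S_{xx}^{(P)} & 0 \\ 0 & S_{yy}^{(P)} \end{pmatrix}
  + (1-\beta) I
\right]
\begin{pmatrix} w_x \\ w_y \end{pmatrix}
```

Setting \beta = 1 recovers the CCA problem in the first equation, while \beta = 0 leaves S_{xx} w_x = \lambda w_x and S_{yy} w_y = \lambda w_y, i.e. ordinary PCA on each view; intermediate values still require only one generalized eigenvalue problem.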

Ukrit Watchareeruetai, Akisato Kimura, Robert Cheng Bao, Takahito Kawanishi, Kunio Kashino
"StochasticSIFT: Interest point detection based on stochastically-derived stability,"
Proc. Meeting on Image Recognition and Understanding (MIRU2010, domestic),
IS1-80, Kushiro, Hokkaido, Japan, July 2010.
[ PDF ] [ Poster ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
We propose a novel framework named "StochasticSIFT" for detecting interest points (IPs) in video sequences. The proposed framework incorporates a stochastic model considering the temporal dynamics of videos into the SIFT detector to improve robustness against fluctuations inherent to video signals. Instead of detecting IPs and then removing unstable or inconsistent IP candidates, we introduce IP "stability" derived from a stochastic model of inherent fluctuations to detect more stable IPs. The experimental results show that the proposed IP detector outperforms the SIFT detector in terms of repeatability and matching rates.

Gurbachan Sekhon, Ken Fukuchi, Akisato Kimura
"Automatic and precise extraction of generic objects using saliency-based priors and contour constraints,"
Proc. Meeting on Image Recognition and Understanding (MIRU2010, domestic),
IS3-3, Kushiro, Hokkaido, Japan, July 2010.
[ PDF ] [ Poster ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
This paper deals with automatic video segmentation without supervision or interactions. We examine a method for automatic noise reduction in segmented video frames utilizing contour information, which we have dubbed the Contour-Classification method. This method uses information about the contours of the segmented image mask in order to accurately reduce noise in segmented video frames. We also examine a second method that we have developed, called the Erosion-Dilation method. Our proposed method is then composed of these two fundamental techniques: Contour-Classification and Erosion-Dilation. Test results indicate that our proposed method precisely removes noise regions from videos with a low error rate when compared with both the original unaltered segmentation result and the Erosion-Dilation method.
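
The abstract does not spell out how the Erosion-Dilation method works. A minimal sketch, assuming it resembles standard morphological opening (erosion followed by dilation) on a binary segmentation mask with a 3x3 neighbourhood, is shown below. The function names, the neighbourhood size and the treatment of out-of-image pixels as background are all choices made for this illustration only.

```python
def _neighbourhood(mask, y, x):
    """Yield the 3x3 neighbourhood of (y, x); out-of-image pixels count as 0."""
    h, w = len(mask), len(mask[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            yield mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[int(all(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[int(any(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def remove_small_noise(mask):
    """Morphological opening: erosion deletes small blobs, dilation restores the rest."""
    return dilate(erode(mask))


# Hypothetical 7x7 mask: a 4x4 object plus one isolated noise pixel at (0, 0).
noisy = [[0] * 7 for _ in range(7)]
for y in range(2, 6):
    for x in range(2, 6):
        noisy[y][x] = 1
noisy[0][0] = 1

cleaned = remove_small_noise(noisy)
```

In this toy mask the isolated pixel disappears while the 4x4 object is restored by the dilation step. Opening also rounds off thin structures, which is one reason a contour-based step might be combined with it.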

Shigeaki Kuzuoka, Akisato Kimura, Tomohiko Uyematsu
"Universal source coding for multiple decoders with side information,"
Proc. International Symposium of Information Theory (ISIT2010),
pp.1-5, Austin, Texas, USA, June 2010.

Abstract
A multiterminal lossy source coding problem, which includes various problems such as the Wyner-Ziv problem and the complementary delivery problem as special cases, is considered. It is shown that any point in the achievable rate-distortion region can be attained even if the source statistics are not known.

Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Kouji Miyazato, Junji Yamato and Kunio Kashino
"A stochastic model of human visual attention with a dynamic Bayesian network,"
submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence.
[ pdf (arXiv.org) ] [ DOI link ] [ copyright notice ]

Abstract
Recent studies in the field of human vision science suggest that human responses to the stimuli on a visual display are non-deterministic. People may attend to different locations on the same visual input at the same time. Based on this knowledge, we propose a new stochastic model of visual attention by introducing a dynamic Bayesian network to predict the likelihood of where humans typically focus on a video scene. The proposed model is composed of a dynamic Bayesian network with four layers. Our model provides a framework that simulates and combines the visual saliency response and the cognitive state of a person to estimate the most probable attended regions. Sample-based inference with a Markov chain Monte-Carlo based particle filter and stream processing with multi-core processors enable us to estimate human visual attention in near real time. Experimental results have demonstrated that our model performs significantly better in predicting human visual attention compared to the previous deterministic models.

Shigeaki Kuzuoka, Akisato Kimura, Tomohiko Uyematsu
"Universal source coding for multiple decoders with side information,"
Shannon Theory Workshop (STW2009, domestic),
pp.35-40, Matsuyama, Ehime, Japan, September 2009.

Abstract
A multiterminal lossy source coding problem, which includes various problems such as the Wyner-Ziv problem and the complementary delivery problem as special cases, is considered. It is clarified that any point in the achievable rate-distortion region can be attained even if the source statistics are not known.

Ken Fukuchi, Kouji Miyazato, Akisato Kimura, Shigeru Takagi and Junji Yamato
"Saliency-based video segmentation with graph cuts and sequentially updated priors,"
Proc. International Conference on Multimedia and Expo (ICME2009),
New York, New York, USA, June-July 2009.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
This paper proposes a new method for achieving precise video segmentation without any supervision or interaction. The main contributions of this report include 1) the introduction of fully automatic segmentation based on the maximum a posteriori (MAP) estimation of the Markov random field (MRF) with graph cuts and saliency-driven priors and 2) the updating of priors and feature likelihoods by integrating the previous segmentation results and the currently estimated saliency-based visual attention. Test results indicate that our new method precisely extracts probable regions from videos without any supervised interactions.

Kouji Miyazato, Akisato Kimura, Shigeru Takagi and Junji Yamato
"Real-time estimation of human visual attention with dynamic Bayesian network and MCMC-based particle filter",
Proc. International Conference on Multimedia and Expo (ICME2009),
New York, New York, USA, June-July 2009.
[ pdf ] [ DOI link ] [ presentation material ] [ copyright notice ]

Abstract
Recent studies in signal detection theory suggest that human responses to the stimuli on a visual display are non-deterministic. People may attend to different locations on the same visual input at the same time. Constructing a stochastic model of human visual attention is a promising way to tackle this problem. This paper proposes a new method to achieve a quick and precise estimation of human visual attention based on our previous stochastic model with a dynamic Bayesian network. A particle filter with Markov chain Monte-Carlo (MCMC) sampling makes it possible to achieve a quick and precise estimation through stream processing. Experimental results indicate that the proposed method can estimate human visual attention in real time and more precisely than previous methods.

Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Junji Yamato and Kunio Kashino
"Dynamic Markov random fields for stochastic modeling of visual attention",
IEICE Technical Report (domestic),
PRMU2008-117 (MVE2008-66), Toyonaka, Osaka, Japan, November 2008.
[ presentation material ] [ copyright notice ]

Abstract
This report proposes a new stochastic model of visual attention to predict the likelihood of where humans typically focus on a video scene. The proposed model is composed of a dynamic Bayesian network that simulates and combines a person's visual saliency response and eye movement patterns to estimate the most probable regions of attention. Dynamic Markov random field (MRF) models are newly introduced to include spatiotemporal relationships of visual saliency responses. Experimental results have revealed that the proposed model outperforms the previous deterministic model and the stochastic model without dynamic MRF in predicting human visual attention.

Akisato Kimura,
"Particle-based simulation of the Gel'fand-Pinsker channel capacity and the Wyner-Ziv rate-distortion function,"
Proc. Symposium on Information Theory and its Applications (SITA2008, domestic),
pp.2-4-4, Kinugawa, Tochigi, Japan, October 2008.
[ pdf ] [ presentation material ]
[ copyright notice: The authors hold the copyright of the material.]

Abstract
This report presents a new numerical algorithm for simulating the capacity of a memoryless channel with non-causal encoder side information (the Gel'fand-Pinsker channel) and the rate-distortion function for a memoryless source with decoder side information (the Wyner-Ziv coding). The basic idea is to represent a probabilistic density by a finite number of particles each of which is composed of a sample value and the associated weight. The proposed algorithm enables us to simulate the channel capacity and the rate distortion function with infinite or continuous alphabets.
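
The core idea of representing a density by weighted particles can be sketched as follows. This is only an illustration of the representation itself, not of the proposed capacity or rate-distortion algorithm: placing the particles on a fixed grid, the helper names, and the standard normal test density are all assumptions made for this example.

```python
import math


def make_particles(density, lo, hi, n):
    """Return n (value, weight) pairs on [lo, hi].

    Values sit at the midpoints of n equal cells; weights are proportional
    to the density at each value and are normalised to sum to one.
    """
    step = (hi - lo) / n
    values = [lo + (i + 0.5) * step for i in range(n)]
    raw = [density(v) for v in values]
    total = sum(raw)
    return [(v, r / total) for v, r in zip(values, raw)]


def expectation(particles, f):
    """Approximate E[f(X)] as a weighted sum over the particles."""
    return sum(w * f(v) for v, w in particles)


def std_normal(x):
    """Density of the standard normal distribution (used only as a test case)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


particles = make_particles(std_normal, -8.0, 8.0, 2000)
mean = expectation(particles, lambda x: x)
second_moment = expectation(particles, lambda x: x * x)
```

With such a representation, integrals over a continuous alphabet become finite weighted sums, which is what makes numerical optimisation over densities tractable; a practical algorithm would typically also adapt the particle positions rather than keep them fixed.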

Shigeaki Kuzuoka, Akisato Kimura and Tomohiko Uyematsu,
"Simple coding schemes for lossless and lossy complementary delivery problems,"
Proc. Shannon Theory Workshop (STW2007, domestic),
pp.43-50, Izu, Shizuoka, Japan, September 2007.

Abstract
This paper deals with a coding problem called complementary delivery, where messages from two correlated sources are jointly encoded, and each decoder reproduces one of two messages using the other message as the side information. Simple lossless and lossy complementary delivery coding schemes are proposed. In the lossless case, it is revealed that the error probability of the proposed code based on Slepian-Wolf codes is exponentially tight. Moreover, in the lossy case, it is demonstrated that Wyner-Ziv codes can be applied to the complementary delivery problem.
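
The complementary delivery setting can be illustrated with a toy lossless scheme in which the common encoder sends the bitwise XOR of the two messages and each decoder cancels its own side information. This is not the Slepian-Wolf-based code proposed in the paper: it ignores the correlation between the sources and therefore does not reach the rates discussed there. The function names are hypothetical.

```python
def encode(x: bytes, y: bytes) -> bytes:
    """Common encoder: one codeword built from both messages (bitwise XOR)."""
    if len(x) != len(y):
        raise ValueError("this toy scheme needs messages of equal length")
    return bytes(a ^ b for a, b in zip(x, y))


def decode(codeword: bytes, side_info: bytes) -> bytes:
    """Decoder: recover the missing message from the codeword and the known one."""
    return bytes(c ^ s for c, s in zip(codeword, side_info))


x = b"hello"
y = b"world"
codeword = encode(x, y)

y_hat = decode(codeword, x)  # decoder 1 knows x and wants y
x_hat = decode(codeword, y)  # decoder 2 knows y and wants x
```

A single codeword of the same length as one message serves both decoders, which shows why joint encoding helps here; exploiting the correlation between the sources is what allows a shorter codeword still.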

Kunio Kashino, Akisato Kimura, Takayuki Kurozumi and Hidehisa Nagano
"Robust search methods for music signals based on simple representation,"
Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP2007),
Vol.4, pp.1421--1424, Hawaii, USA, April 2007.
[ DOI link ] [ copyright notice ]

Abstract
Signal similarity search is an important technique for music information retrieval. A basic task is finding identical signal segments on unlabeled music-signal archives, given a short music signal fragment as a query. In such a task, the search must be fast and sufficiently robust against possible signal fluctuations due to noise and distortions. In this special session paper, we describe a search method designed to cope with additive interfering sounds by spectral partitioning. Then, we introduce another method designed to be robust under multiplicative noise or distortion based on binary area representation.

Takahito Kawanishi, Masaru Tsuchida, Shigeru Takagi, Akisato Kimura and Junji Yamato
"Small cylindrical display using aspherical mirror for anthropomorphic agents",
Proc. International Display Workshop / Asia Display (IDW/AD'05),
pp.1755-1758, Takamatsu, Kagawa, Japan, December 2005.

Abstract
We have developed a small cylindrical display for an anthropomorphic agent that communicates with multiple users in a 3D environment. The previously developed cylindrical display was dark with bad contrast at the lower part of the screen because the density of pixels at the lower part is much less than at the upper part. We made the pixel density uniform by using an aspherical mirror. Experimental results show that our new display has better luminance and better contrast than the previous one.

Kunio Kashino, Akisato Kimura and Takayuki Kurozumi
"A quick video search method based on local and global feature pruning",
Proc. International Conference on Pattern Recognition (ICPR2004),
Vol.3, pp.894-897, August 2004.
[ DOI link ] [ copyright notice ]

Abstract
This paper proposes a quick method of similarity-based video searching to detect and locate a specific video clip given as a query in a long stored video stream. The method employs a two-stage process: local and global feature clustering. The local clustering exploits continuity or local similarities between video features, and the global clustering gathers similar video frames that are not necessarily adjacent to each other. These processes prune irrelevant sections of a video stream. The method guarantees exactly the same search result as the exhaustive search. Experiments performed on a PC show that the proposed method can correctly detect and locate a 7.5-second clip in a 150-hour video recording in 15 ms on average.

Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Junji Yamato and Kunio Kashino
"Dynamic Markov random fields for stochastic modeling of visual attention",
Proc. International Conference on Pattern Recognition (ICPR2008),
Mo.BT8.35, Tampa, Florida, USA, December 2008.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
This report proposes a new stochastic model of visual attention to predict the likelihood of where humans typically focus on a video scene. The proposed model is composed of a dynamic Bayesian network that simulates and combines a person's visual saliency response and eye movement patterns to estimate the most probable regions of attention. Dynamic Markov random field (MRF) models are newly introduced to include spatiotemporal relationships of visual saliency responses. Experimental results have revealed that the proposed model outperforms the previous deterministic model and the stochastic model without dynamic MRF in predicting human visual attention.

Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian network",
Proc. Meeting on Image Recognition and Understanding (MIRU2008, domestic),
pp. 1500--1505, Karuizawa, Nagano, Japan, July 2008.
(Selected for the Best Interactive Session Award)
[ pdf ] [ digest ] [ poster: Japanese, English ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
(The content is almost the same as that presented at the IEICE Technical Meeting held in June 2008. Please see the abstract of that report, listed below.)

Shigeaki Kuzuoka, Akisato Kimura and Tomohiko Uyematsu
"Universal coding for lossy complementary delivery problems",
Proc. International Symposium on Information Theory (ISIT2008),
pp. 2177--2188, Toronto, Canada, July 2008.
[ DOI link ] [ copyright notice ]

Abstract
This paper deals with a universal lossy coding problem for a certain kind of multiterminal source coding network called a complementary delivery system. A universal coding scheme based on Wyner-Ziv codes is proposed. While the proposed scheme cannot attain the optimal rate-distortion trade-off in general, the rate-loss is upper bounded by a universal constant under some mild conditions. Moreover, the proposed scheme allows us to apply (non-universal) Wyner-Ziv codes to construct a universal lossy complementary delivery code.

Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian network",
Proc. International Conference on Multimedia and Expo (ICME2008),
pp.1073--1076, Hannover, Germany, June 2008.
[ pdf ] [ DOI link ] [ presentation material ] [ copyright notice ]

Abstract
Recent studies in signal detection theory suggest that the human responses to the stimuli on a visual display are non-deterministic. People may attend to different locations on the same visual input at the same time. To predict the likelihood of where humans typically focus on a video scene, we propose a new stochastic model of visual attention by introducing a dynamic Bayesian network. Our model simulates and combines the visual saliency response and the cognitive state of a person to estimate the most probable attended regions. Experimental results have demonstrated that our model performs significantly better in predicting human visual attention compared to the previous deterministic model.

Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian network",
IEICE Technical Report (domestic),
PRMU2008-43 (DE2008-25), Otaru, Hokkaido, Japan, June 2008.
[ PDF ] [ presentation material ] [ copyright notice ]

Abstract
Recent studies in signal detection theory suggest that the human responses to the stimuli on a visual display are non-deterministic. People may attend to different locations on the same visual input at the same time. To predict the likelihood of where humans typically focus on a video scene, we propose a new stochastic model of visual attention by introducing a dynamic Bayesian network. Our model simulates and combines the visual saliency response and the cognitive state of a person to estimate the most probable attended regions. Experimental results have demonstrated that our model performs significantly better in predicting human visual attention compared to the previous deterministic model.

Akisato Kimura, Tomohiko Uyematsu, Shigeaki Kuzuoka and Shun Watanabe,
"Universal source coding over generalized complementary delivery networks,"
IEEE Transactions on Information Theory,
Vol.55, No.3, pp.1360-1373, March 2009.
[ pdf ] [ DOI link ] [ copyright notice ]

Abstract
This paper deals with a universal coding problem for a certain kind of multiterminal source coding network called a generalized complementary delivery network. In this network, messages from multiple correlated sources are jointly encoded, and each decoder has access to some of the messages to enable the decoder to reproduce the other messages. Both fixed-to-fixed length and fixed-to-variable length lossless coding schemes are considered. Explicit constructions of universal codes and the bounds of the error probabilities are clarified via methods of types and graph-theoretical analysis.

Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources over generalized complementary delivery networks,"
Proc. Symposium on Information Theory and its Applications (SITA2007, domestic),
pp.274-279, Shima, Mie, November 2007.
[ pdf ] [ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
This paper deals with a universal coding problem for a certain kind of multiterminal source coding network called a generalized complementary delivery network. In this network, messages from multiple correlated sources are jointly encoded, and each decoder has access to some of the messages to enable the decoder to reproduce the other messages. Both fixed-to-fixed length and fixed-to-variable length lossless coding schemes are considered. Explicit constructions of universal codes and the bounds of the error probabilities are clarified via methods of types and graph-theoretical analysis.

Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for audio signals based on a piecewise linear representation of feature trajectories,"
IEEE Transactions on Audio, Speech and Language Processing,
Vol.16, No.2, pp.396-407, February 2008.
[ pdf ] [ DOI link ] [ copyright notice ]

Abstract
This paper presents a new method for a quick similarity-based search through long unlabeled audio streams to detect and locate audio clips provided by users. The method involves feature dimension reduction based on a piecewise linear representation of a sequential feature trajectory extracted from a long audio stream. Two techniques enable us to obtain a piecewise linear representation: the dynamic segmentation of feature trajectories and the segment-based KL transform. A new technique is also introduced that greatly reduces the required feature comparisons. The proposed search method guarantees in principle that no segment to be detected is missed. Experiments indicate significant improvements in search speed. For example the proposed method reduced the total search time to approximately 1/12 and detected queries in approximately 0.3 seconds from a 200-hour audio database.

Akisato Kimura,
"Coding theorems for correlated sources with cooperative encoders,"
Ph.D. dissertation, Tokyo Institute of Technology, September 2007.
[ pdf ] [ presentation material ]
[ Copyright notice: The author holds the copyright of the material. ]

Abstract
This thesis deals with multiterminal source coding problems for a general framework of coding systems, called coding systems with cooperation, where there are some linkages among encoders and decoders. Especially, the main focus of this thesis is encoder cooperation. Two types of coding systems are investigated that incorporate encoder cooperation: the Slepian-Wolf coding system with linkages (called the SWL system) and the complementary delivery coding system.
The SWL system involves some mutual linkages between two encoders of the coding system investigated by Slepian and Wolf (called the SW system) that involves two separate encoders and one common decoder. In particular, some special cases are considered, where the coding rate for the mutual linkage between two encoders is negligibly small. The main results in this thesis show that the achievable rate region of the SWL system equals that of the SW system when considering fixed-length coding, while weak variable-length coding makes the achievable rate region of the SWL system larger than that of the SW system. This implies that encoder cooperation may improve the coding rate.
The complementary delivery coding system contrasts with the SW system in the sense of cooperation, which means that the complementary delivery coding system consists of a common encoder and separate decoders, while the SW system includes separate encoders and a common decoder. In particular, in the complementary delivery coding system, each decoder has access to some of the encoded messages to enable the decoder to reproduce the other messages from a common codeword emitted from the common encoder. First, the minimum achievable rate for lossy coding is clarified, which implies that encoder cooperation may increase the coding rate. Next, universal coding schemes for lossless coding are proposed. Explicit constructions of universal lossless codes and the bounds of the error probabilities are clarified by using methods of types and graph-theoretical analysis.

Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with complementary delivery,"
IEICE Transactions on Fundamentals,
Vol.E90-A, No.9, pp.1840-1847, September 2007.
Published online in IEICE Transactions Online.
[ pdf ] [ DOI link ] [ copyright notice ]

Abstract
This paper deals with a universal coding problem for a certain kind of multiterminal source coding system that we call the complementary delivery coding system. In this system, messages from two correlated sources are jointly encoded, and each decoder has access to one of the two messages to enable it to reproduce the other message. Both fixed-to-fixed length and fixed-to-variable length lossless coding schemes are considered. Explicit constructions of universal codes and bounds of the error probabilities are clarified via type-theoretical and graph-theoretical analyses.

Clement Leung, Akisato Kimura, Tatsuto Takeuchi and Kunio Kashino
"A computational model of saliency depletion/recovery phenomena for the salient region extraction of videos",
Proc. Meeting on Image Recognition and Understanding (MIRU2007, domestic),
pp.582--587, Hiroshima, Japan, July 2007.
[ pdf ] [ poster ] [ copyright notice ]

Abstract
This report proposes a new algorithm for extracting salient regions of videos by introducing two important properties of the early human visual system: 1) Instantaneous saliency depletion with gradual recovery, whereby saliency is instantaneously suppressed and gradually recovered in previously attended regions. 2) Gradual saliency depletion with instantaneous recovery, whereby saliency is gradually decreased over time in non-surprising regions and at the same time recovered in surprising locations. With the introduction of these properties, redundant information in videos can be suppressed and important information is eventually enhanced. The proposed algorithm has been evaluated with an eye tracking device to see how well it fits the human visual system. The results show that the proposed algorithm substantially outperformed previous algorithms when only gradual depletion was incorporated, and instantaneous depletion improved the performance in some cases.

Clement Leung, Akisato Kimura, Tatsuto Takeuchi and Kunio Kashino
"A computational model of saliency depletion/recovery phenomena for the salient region extraction of videos",
Proc. International Conference on Multimedia and Expo (ICME2007),
pp.300--303, Beijing, China, July 2007.
[ pdf ] [ DOI link ] [ presentation material ] [ copyright notice ]

Abstract
This paper proposes a new algorithm for extracting salient regions of videos by introducing two important properties of the early human visual system: 1) Instantaneous saliency depletion with gradual recovery, whereby saliency is instantaneously suppressed and gradually recovered in previously attended regions. 2) Gradual saliency depletion with instantaneous recovery, whereby saliency is gradually decreased over time in non-surprising regions and at the same time recovered in surprising locations. With the introduction of these properties, redundant information in videos can be suppressed and important information is eventually enhanced.

Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with generalized complementary delivery,"
presented at a recent result session,
International Symposium on Information Theory (ISIT2007),
Nice, France, June 2007.
[ poster ] [ copyright notice ]

Abstract
This presentation deals with a universal coding problem for a certain kind of multiterminal source coding system called the generalized complementary delivery coding system. In this system, messages from multiple correlated sources are jointly encoded, and each decoder has access to some of the messages to enable them to reproduce the other messages. Both fixed-to-fixed length and fixed-to-variable length lossless coding schemes are considered. Explicit constructions of universal codes and the bounds of the error probabilities are clarified via methods of types and graph-theoretical analyses.

Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with complementary delivery,"
Proc. International Symposium on Information Theory (ISIT2007),
pp.1756--1760, Nice, France, June 2007.
[ pdf ] [ DOI link ] [ presentation material ] [ copyright notice ]

Abstract
This report deals with a universal coding problem for a certain kind of multiterminal source coding system that we call the complementary delivery coding system. Both fixed-to-fixed length and fixed-to-variable length lossless coding schemes are considered. Explicit constructions of universal codes and the bounds of the error probabilities are clarified via type-theoretical and graph-theoretical analyses.

Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal source coding for complementary delivery,"
Proc. Symposium on Information Theory and its Applications (SITA2006, domestic),
pp.803--806, Hakodate, Japan, November-December 2006.
[ pdf ] [ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
This paper deals with a universal coding problem for a certain kind of multiterminal source coding system that we call the complementary delivery coding system. Both fixed-to-fixed length and fixed-to-variable length lossless coding schemes are considered. Explicit constructions of universal codes and bounds of the error probabilities are clarified via type-theoretical and graph-theoretical analyses.

Akisato Kimura and Tomohiko Uyematsu,
"Information-theoretical analysis of index searching: Revised,"
Proc. Symposium on Information Theory and its Applications (SITA2006, domestic),
pp.73--76, Hakodate, Japan, November-December 2006.
[ pdf ] [ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
We present an information-theoretical viewpoint for similarity-based retrieval along with index structures. This retrieval system comprises two stages: pruning data items based on the index structures, and matching surviving data items. The first stage is modeled as the so-called Wyner-Ziv problem, while the second stage is considered as a coding problem such that parts of the decoding results are available as partial side information at both the encoder and decoder. We clarify upper and lower bounds of the optimal retrieval performance and some relationships between retrieval parameters and performance via Shannon-theoretic analyses.

Akisato Kimura and Tomohiko Uyematsu,
"Multiterminal source coding with complementary delivery,"
Proc. International Symposium on Information Theory and its Applications (ISITA2006),
pp.189-194, Seoul, South Korea, October 2006.
[ pdf ] [ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
A coding problem where messages from two correlated sources are jointly encoded and separately decoded is investigated. Each decoder has access to one of the two messages to enable it to reproduce the other message. The rate-distortion function for lossy coding is clarified. Some related coding problems are also examined.

Akisato Kimura, Tomohiko Uyematsu
"Multiterminal source coding for cascading and feedback refinement systems,"
Proc. Shannon Theory Workshop (STW2006, domestic),
pp.25-31, September 2006
[ pdf ] [ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]

Abstract
Lossy coding problems are investigated for some communication systems in the presence of cascading and/or feedback information channels from decoders so as to refine reproduction messages. This framework provides different types of refinement structures from so-called successive refinement. Three different types of communication systems are considered, i.e. refinement systems in the presence of a cascading channel, a feedback channel, and both channels. Outer and inner bounds of achievable rate-distortion regions for those problems are obtained.

Akisato Kimura and Tomohiko Uyematsu,
"Multiterminal source coding with complementary delivering,"
IEICE Technical Report,
IT2006-8, pp.7-12, May 2006,
Presented at 2006 Hawaii, IEICE and SITA Joint Conference on Information Theory.
[ presentation material ] [ copyright notice ]

Abstract
We consider a coding problem where messages from two correlated sources are jointly encoded and separately decoded. Each decoder has access to one of two messages to reproduce the other message. We clarify the rate-distortion function for lossy coding.

Akisato Kimura, Takahito Kawanishi and Kunio Kashino,
"Acceleration of similarity-based partial image retrieval using multistage vector quantization,"
Proc. International Conference on Pattern Recognition (ICPR2004),
Vol.2, pp.993-996, Cambridge, United Kingdom, August 2004.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
We propose a new method for quick and accurate partial image retrieval from a huge number of images based on a predefined distance measure. The proposed method utilizes vector quantization (VQ) on multiple layers, namely color, block, and feature layers. This can greatly reduce the amount of calculation needed for partial image retrieval. Experiments indicate that the proposed method can detect partial images that are similar to queries through 1000 images within 4 seconds. This is approximately 30 times faster than the method to which multistage VQ is not applied.

Akisato Kimura, Takahito Kawanishi and Kunio Kashino,
"Similarity-based partial image retrieval guaranteeing same accuracy as exhaustive matching,"
Proc. International Conference on Multimedia and Expo (ICME2004),
Vol. 3, pp.1895-1898, Taipei, Taiwan, June 2004.
[ pdf ] [ poster ] [ copyright notice ]

Abstract
We propose a new framework for quick and accurate partial image retrieval from a huge number of images based on a predefined distance measure. Finding partial similarities generally requires a huge amount of storage space for indexes due to the large number of portions of images. The proposed method extracts portions from each database image at a constant spacing, while it extracts all possible portions from a query image. In this way, the proposed method can greatly reduce the size of indexes while theoretically guaranteeing the same accuracy as exhaustive matching.
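
The indexing idea in this abstract (sparse portions on the database side, all portions on the query side) can be sketched with a one-dimensional, exact-match analogy. This is a simplified illustration under stated assumptions, not the paper's similarity-based image method: sequences stand in for images, dictionary lookup stands in for distance-based matching, and the names `index_database`, `find_matches`, `window` and `spacing` are hypothetical. In this analogy, any run shared by the query and the database that is at least `window + spacing - 1` symbols long must fully contain one indexed database window, so it cannot be missed.

```python
def index_database(sequence, window, spacing):
    """Index only the windows that start at multiples of `spacing`."""
    index = {}
    for start in range(0, len(sequence) - window + 1, spacing):
        key = tuple(sequence[start:start + window])
        index.setdefault(key, []).append(start)
    return index


def find_matches(index, query, window):
    """Try every query offset against the sparse database index."""
    hits = []
    for q_start in range(len(query) - window + 1):
        key = tuple(query[q_start:q_start + window])
        for db_start in index.get(key, []):
            hits.append((q_start, db_start))
    return hits


if __name__ == "__main__":
    database = list("abcdefghijklmnopqrstuvwxyz")
    query = list("xxhijklmnyy")  # shares the run "hijklmn" with the database
    index = index_database(database, window=4, spacing=3)
    print(len(index), find_matches(index, query, window=4))
```

With a spacing of 3, the index holds roughly a third of the windows that a dense index would hold, while the dense scan over the short query keeps the shared run detectable.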

Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed sources,"
IEEE Transactions on Information Theory,
Vol.50, No.1, pp.183-193, Jan. 2004.
[ pdf ] [ DOI link ] [ copyright notice ]

Abstract
Coding problems for correlated information sources were first investigated by Slepian and Wolf. They considered the data compression system, called the SW system, where two sequences emitted from correlated sources are separately encoded to codewords, and sent to a single decoder which has to output the original sequence pairs with a small probability of error. In this correspondence, we investigate the coding problem of a modified SW system allowing two encoders to communicate with zero rate. First, we consider the fixed-length coding and clarify that the admissible rate region for general sources is equal to that of the original SW system. Next, we investigate the variable-length coding having the asymptotically vanishing probability of error. We clarify the admissible rate region for mixed sources characterized by two ergodic sources and show that this region is strictly wider than that for fixed-length codes. Further, we investigate the universal coding problem for memoryless sources in the system and show that the SW system with linked encoders has much more flexibility than the original SW system.

Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for multimedia signals using global pruning,"
Systems and Computers in Japan,
Vol.34, No.13, pp.47-58, November 2003.
[ DOI link ]

Abstract
The authors propose a new method for quickly searching for a specific audio or video signal to be detected within a long, stored audio or video stream to determine segments that contain signals that are nearly identical to the given signal. The Time-series Active Search (TAS) method is one of the quick search methods that have been proposed previously. This signal searching technique based on histograms extracted from the signals had implemented quick searching by local pruning, that is, omitting comparisons of segments for which searching was unnecessary based on similarities in the vicinity of the matching window. In contrast, the proposed technique implements significantly quicker searching by introducing global pruning, which looks at the entire signal time-series according to histogram classifications based on similarities of the entire signal to eliminate segments that need not be searched, in addition to local pruning. In this paper, the authors present a detailed discussion of the relationship between the degree of global pruning and the accuracy that is guaranteed. For example, the authors showed through experiments that when 128-dimensional histograms were classified into 1024 clusters, the proposed technique achieved a search speed approximately 9 times that of TAS while preserving the same degree of accuracy. The preprocessing calculation time increased by approximately 1% of the time for playing the signal.
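
The local pruning mentioned in this abstract can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the published TAS implementation: symbols in a list stand in for quantized feature codes, similarity is an unnormalized histogram intersection, and the function names are hypothetical. The sketch relies on one observation: shifting the window by one symbol raises the intersection by at most 1, so a window that falls short of the threshold by k lets the search jump ahead k positions without missing a match. Global pruning, the contribution of the paper, is not shown.

```python
from collections import Counter


def intersection(hist_a, hist_b):
    """Histogram intersection: number of symbols the two windows share."""
    return sum(min(count, hist_b[symbol]) for symbol, count in hist_a.items())


def exhaustive_search(stream, query, threshold):
    """Baseline: compare the query histogram with every window."""
    w, q = len(query), Counter(query)
    return [t for t in range(len(stream) - w + 1)
            if intersection(q, Counter(stream[t:t + w])) >= threshold]


def pruned_search(stream, query, threshold):
    """Same hits as the baseline, skipping windows that cannot match."""
    w, q = len(query), Counter(query)
    hits, visited, t = [], 0, 0
    while t <= len(stream) - w:
        score = intersection(q, Counter(stream[t:t + w]))
        visited += 1
        if score >= threshold:
            hits.append(t)
        # A shift of one symbol raises the intersection by at most 1, so the
        # next (threshold - score - 1) positions cannot reach the threshold.
        t += max(1, threshold - score)
    return hits, visited


if __name__ == "__main__":
    stream = list("aaaaaaaaaabcdbaaaaaaaaaacbdbaaaa")
    query = list("bcdb")
    print(exhaustive_search(stream, query, threshold=4))
    print(pruned_search(stream, query, threshold=4))
```

Because the skip width is derived from an upper bound on the similarity, the pruned search returns the same positions as the exhaustive one while evaluating fewer windows.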

Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"Dynamic-segmentation-based feature dimension reduction for quick audio/video searching,"
Proc. International Conference on Multimedia and Expo (ICME2003),
Vol.2, pp.389-392, Baltimore, Maryland, USA, July 2003.
Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP2003),
Vol.3, pp.357-360, Hong Kong, Apr. 2003 (cancelled).
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
We propose a new feature dimension reduction method for multimedia search. The main technique in the method is dynamic segmentation that partitions sequential feature trajectories dynamically. While dynamic segmentation reduces the average dimensionality and accelerates the search, it requires a huge amount of calculation. Thus, our method quickly executes suboptimal partitioning of the trajectories by using the discreteness of dimension changes. This guarantees the optimal amount of calculation to derive the suboptimal partitioning under the condition that the dimension monotonically increases as the segment length increases. The experiment shows that our method is over 10 times faster than a straightforward dynamic segmentation method.

Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for multimedia signals using feature compression based on piecewise linear maps,"
Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP2002),
Vol.4, pp.3656-3659, Orlando, Florida, USA, May 2002.
[ pdf ] [ DOI link ] [ poster ] [ copyright notice ]

Abstract
We propose a quick algorithm for multimedia signal search. The algorithm comprises two techniques: feature compression based on piecewise linear maps and distance bounding to efficiently limit the search space. When compared with existing multimedia search techniques, they greatly reduce the computational cost required in searching. Although feature compression is employed in our method, our bounding technique mathematically guarantees the same recall rate as the search based on the original features; no segment to be detected is missed. Experiments indicate that the proposed algorithm is approximately 10 times faster than an existing fast method while maintaining the same search accuracy.
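
The "no segment is missed" guarantee in this abstract follows a general filter-and-refine pattern: if the distance between compressed features never exceeds the distance between the original features, a candidate can be discarded safely whenever its compressed distance already exceeds the search radius. The sketch below illustrates only that pattern. The compression used here (keeping the first few coordinates) is a deliberately simple stand-in chosen for illustration, not the piecewise linear maps of the paper, and all names are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def compress(vector, kept_dims):
    """Toy linear map: keep only the first `kept_dims` coordinates."""
    return vector[:kept_dims]


def bounded_search(database, query, radius, kept_dims):
    """Return indices within `radius` of the query, pruning with a lower bound."""
    q_small = compress(query, kept_dims)
    hits, full_comparisons = [], 0
    for i, item in enumerate(database):
        # Dropping coordinates can only shrink a Euclidean distance, so the
        # compressed distance is a lower bound on the true distance.
        if euclidean(compress(item, kept_dims), q_small) > radius:
            continue  # safe to discard: the true distance is at least as large
        full_comparisons += 1
        if euclidean(item, query) <= radius:
            hits.append(i)
    return hits, full_comparisons


if __name__ == "__main__":
    database = [[0, 0, 0, 0], [1, 0, 0, 0], [5, 5, 0, 0],
                [0, 0, 3, 0], [0.4, 0.4, 0.4, 0.4]]
    print(bounded_search(database, [0, 0, 0, 0], radius=1.0, kept_dims=2))
```

In a real system the compressed database features would be computed once in advance; they are recomputed inside the loop here only to keep the sketch short.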

Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed source,"
Proc. IEEE Information Theory Workshop 2001 (ITW2001),
pp.82--84, Cairns, Australia, Sep. 2001
[ pdf ] [ DOI link ] [ Copyright notice ]

Abstract
Slepian and Wolf first considered the data compression of correlated sources called the SW system, where two sequences emitted from correlated sources are separately encoded to codewords, and sent to a single decoder which has to output the original sequence pairs. Recently, Oohama has extended the SW system and investigated a more general case where there are some mutual linkages between two encoders of the SW system. In this paper, we investigate variable-length coding which allows asymptotically vanishing probability of error for the system considered by Oohama. We clarify the admissible rate region for mixed sources, and show that this region is strictly wider than that for fixed-length codes.

Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"Very quick audio searching : Introducing global pruning to the Time-Series Active Search,"
Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP2001),
Vol.3, pp.1429-1432, Salt Lake City, Utah, USA, May 2001.
[ pdf ] [ DOI link ] [ presentation material ] [ copyright notice ]

Abstract
Previously, we proposed a histogram-based quick signal search method called Time-Series Active Search (TAS). TAS is a method of searching through long audio or video recordings for a specified segment, based on signal similarity. TAS is fast; it can search through a 24-hour recording in 1 second after a query-independent preprocessing. However, an even faster method is required when we consider a huge amount of audio archives, for example a month's worth of recordings. Thus, we propose a preprocessing method that significantly accelerates TAS. The core part of this method comprises a global histogram clustering of long signals and a pruning scheme using those clusters. Tests using broadcast recordings indicate that the proposed algorithm achieves a search speed approximately 3 to 30 times that of TAS. In these tests, the search results are exactly the same as with TAS.

Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed sources,"
IEICE Technical Report,
IT99-59, pp.7-12, Jan. 2000.
[ copyright notice ]

Abstract
Coding problems for correlated information sources were first investigated by Slepian and Wolf, where sequences from two correlated sources are separately encoded, sent to a single decoder and decoded with sufficiently small probability of error. We investigate the coding theorem for two correlated sources, where there are some mutual linkages between two encoders of the coding system proposed by Slepian and Wolf. We consider weak variable-length coding, i.e. variable-length codes with vanishing error probability, and show the achievable rate region for mixed sources characterized by two ergodic sources.

Akisato Kimura and Tomohiko Uyematsu,
"Large deviations performance of interval algorithm for random number generation,"
Proc. Memorial workshop for the 50th anniversary of the Shannon theory,
pp.1-4, Yamanashi, Japan, Jan. 1999
[ pdf ] [ copyright notice: The authors hold the copyright of the material. ]

Abstract
We investigate the large deviations performance of the interval algorithm for random number generation, especially for intrinsic randomness. First, we show that the length of output fair random bits per unit length of the input sequence approaches the entropy of the source almost surely. Next, we consider obtaining a fixed number of fair random bits from an input sequence of fixed length. We show that the approximation error measured by the variational distance and divergence vanishes exponentially as the length of the input sequence tends to infinity, if the number of fair bits per input sample is below the entropy of the source. Conversely, the approximation error measured by the variational distance approaches two exponentially if the number of fair bits per input sample is above the entropy.

Nobukazu Takai, Akisato Kimura and Nobuo Fujii,
"CMOS FET companding current-mode integrator,"
Proc. IEEE Asia-Pacific Conference on Circuit and Systems (APCCS98),
pp.17-20, Chiangmai, Thailand, Nov. 1998
[ pdf ] [ DOI link ] [ copyright notice ]

Abstract
A new CMOS companding current-mode integrator is proposed. The companding integrator is based on the MOS translinear principle and exploits the square-law characteristic of MOSFETs. SPICE simulation results demonstrate good performance.


IEEE Copyright Notice

©2001-2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

These materials are presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

