Abstracts of Articles
Please e-mail to akisato <at> eye brl ntt co jp
if you would like to download articles listed below.
Ukrit Watchareeruetai, Akisato
Kimura, Robert Cheng Bao, Takahito Kawanishi, Kunio
Kashino
"Interest point detection via stochastically derived stability,"
IPSJ Transactions on Computer Vision and Applications,
Vol.3, pp.189-197, December 2011.
[ PDF (preprint) ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
We propose a novel framework called StochasticSIFT for detecting interest
points (IPs) in video sequences. The proposed framework incorporates a
stochastic model considering the temporal dynamics of videos into the
SIFT detector to improve robustness against fluctuations inherent to
video signals. Instead of detecting IPs and then removing unstable or
inconsistent IP candidates, we introduce IP stability derived from a
stochastic model of inherent fluctuations to detect more stable IPs. The
experimental results show that the proposed IP detector outperforms the
SIFT detector in terms of repeatability and matching rates.
Takuho Nakano, Akisato Kimura,
Hirokazu Kameoka, Shigeki Sagayama, Shigeki Miyabe, Nobutaka Ono,
Kunio Kashino, Takuya Nishimoto
"Automatic video annotation via hierarchical topic trajectory model
considering cross-modal correlations,"
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP2011),
pp.2380--2383, Prague, Czech Republic, May 2011.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
We propose a new statistical model, named Hierarchical Topic
Trajectory Model (HTTM), for acquiring a dynamically changing
topic model that represents the relationship between video
frames and associated text labels. Model parameter estimation,
annotation and retrieval can be executed within a unified
framework with little computation. It is also easy to add new
modalities such as audio signals and geotags. Preliminary experiments
on a video annotation task with a manually annotated video dataset
indicate that our proposed method can improve the annotation
accuracy.
Jun Takagi, Yasunori Ohishi, Akisato Kimura,
Masashi Sugiyama, Makoto Yamada, Hirokazu Kameoka
"Automatic audio tag classification via semi-supervised canonical density
estimation,"
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP2011),
pp.2232--2235, Prague, Czech Republic, May 2011.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
We propose a novel semi-supervised method for building a
statistical model that represents the relationship between
sounds and text labels ("tags"). The proposed method, named
semi-supervised canonical density estimation, makes use of
unlabeled sound data in two ways: 1) a low-dimensional latent
space representing topics of sounds is extracted by a semi-
supervised variant of canonical correlation analysis, and 2)
topic models are learned by multi-class extension of semi-
supervised kernel density estimation in the topic space. Real-
world audio tagging experiments indicate that our proposed
method improves the accuracy even when only a small number of
labeled sounds are available.
Akisato Kimura, Hirokazu Kameoka,
Kunio Kashino
"Media Scene Learning: A framework for extracting meaningful
parts from audio and video signals",
NTT Technical Review,
Vol.8, No.11, pp.1-7, November 2010.
[ PDF ]
- Abstract
-
We describe a novel framework called Media Scene Learning (MSL)
for automatically extracting key components such as the sound of
a single instrument from a given audio signal or a target object
from a given video signal. In particular, we introduce two key
methods: 1) the Composite Auto-Regressive System (CARS) for
decomposing audio signals into several sound components on the
basis of a generative model of sounds and 2) Saliency-Based
Image Learning (SBIL) for extracting object-like regions from a
given video signal on the basis of the characteristics of the
human visual system.
Takuya Maekawa, Akisato Kimura,
Hitoshi Sakano
"Wearable sensor device for automatic recording of hand
drawings,"
presented in a demo session,
Asian Conference on Computer Vision (ACCV2010),
Queenstown, New Zealand, November 2010.
[ pdf ] [ DOI link ] [ poster ]
[ copyright notice ]
- Abstract
-
Drawing and writing are two of the most important human
activities when it comes to recording events and information.
Needless to say, the digitization of hand drawn paper is
important and thus many products and methods for capturing hand
drawings have been developed. However, many of these methods
require a special pen, paper, and/or apparatus. Thus, when we
want to capture hand drawings with these methods, we have to
capture the drawings actively, e.g., by preparing special pens
and paper. In this work, we try to capture automatically all
the hand drawings found in our daily lives without any explicit
action by the user. Recent advances in sensing technology
enable us to record our daily life data anywhere and at any time
using small always-on wearable sensors. In this work, our aim
is to capture hand drawings automatically with an always-on
wearable sensor device equipped with a camera.
Kazuma Akamine, Ken Fukuchi, Akisato
Kimura, Shigeru Takagi
"Fully automatic extraction of salient regions in near
real-time,"
The Computer Journal, November 2010.
[ DOI link ]
[ copyright notice ]
- Abstract
-
Automatic video segmentation plays an important role in a wide
range of computer vision and image processing applications.
Recently, various methods have been proposed for this purpose.
The problem is that most of these methods are far from
real-time processing even for low-resolution videos due to the
complex procedures. To address this, we propose a new and quite
fast method for automatic video segmentation with the help of
(1) efficient optimization of Markov random fields, in time
polynomial in the number of pixels, by means of graph cuts,
(2) automatic, computationally efficient but stable
derivation of segmentation priors using visual saliency and a
sequential update mechanism and (3) an implementation strategy
based on the principle of stream processing with graphics
processing units. Test results indicate that our method extracts
appropriate regions from videos as precisely as, and much faster
than, previous semi-automatic methods, even though no
supervision is incorporated.
Gurbachan Sekhon, Akisato Kimura, Yasuhiro Minami, Hitoshi Sakano,
Eisaku Maeda
"Action planning for interactive visual scene understanding based
on knowledge confidence in latent spaces,"
IEICE Technical Report (domestic),
PRMU2010-83 (IBISML2010-5), Fukuoka, Japan, September 2010.
[ presentation material ]
[ copyright notice ]
- Abstract
-
This report proposes a method for action planning in
interactive visual scene understanding through the use of
knowledge confidence generated from a latent space of a topic
model connecting image features and text labels. We then use
information, within the latent space, about the position of an
input sample relative to training samples in order to simulate
knowledge confidence. Coupled with this, we also use the
overall associativity between each text label as determined by
the content of the training samples to determine the knowledge
confidence.
Akisato Kimura, Hirokazu Kameoka, Masashi
Sugiyama, Takuho Nakano, Eisaku Maeda, Hitoshi Sakano, Katsuhiko Ishiguro
"SemiCCA: Efficient semi-supervised learning of canonical correlations,"
Proc. IAPR International Conference on Pattern Recognition
(ICPR2010),
pp.2933--2936, Istanbul, Turkey, August 2010.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
Canonical correlation analysis (CCA) is a powerful tool for
analyzing multi-dimensional paired data. However, CCA tends to
perform poorly when the number of paired samples is limited,
which is often the case in practice. To cope with this problem,
we propose a semi-supervised variant of CCA named semiCCA that
allows us to incorporate additional unpaired samples for
mitigating over-fitting. The proposed method smoothly bridges
the eigenvalue problems of CCA and principal component analysis
(PCA), and thus its solution can be computed efficiently just
by solving a single eigenvalue problem as the original CCA.
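As background for the two eigenvalue problems mentioned in the abstract above, the standard textbook forms of CCA and PCA are written out below. The notation is assumed here and does not come from the paper: S_xx, S_yy and S_xy denote sample covariance and cross-covariance matrices, and w_x, w_y denote projection vectors. The abstract does not state how semiCCA interpolates between the two problems, so that form is not reproduced.

```latex
% CCA: a single generalized eigenvalue problem (rho = canonical correlation)
\begin{pmatrix} 0 & S_{xy} \\ S_{yx} & 0 \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \end{pmatrix}
= \rho
\begin{pmatrix} S_{xx} & 0 \\ 0 & S_{yy} \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \end{pmatrix}

% PCA on one view: an ordinary eigenvalue problem (lambda = explained variance)
S_{xx} \, w_x = \lambda \, w_x
```

Both problems are symmetric eigenvalue problems of the same size class, which is consistent with the abstract's remark that the bridged problem can still be solved in one step.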
Ukrit Watchareeruetai, Akisato Kimura, Robert Cheng Bao, Takahito
Kawanishi, Kunio Kashino
"StochasticSIFT: Interest point detection based on stochastically-
derived stability,"
Proc. Meeting on Image Recognition and Understanding (MIRU2010,
domestic),
IS1-80, Kushiro, Hokkaido, Japan, July 2010.
[ PDF ]
[ Poster ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
We propose a novel framework for detecting interest points
(IPs) in video sequences, named "StochasticSIFT". The proposed
framework incorporates a stochastic model considering temporal
dynamics of videos into the SIFT detector to improve robustness
against some fluctuations inherently included in video signals.
Instead of detecting IPs followed by removing unstable or
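inconsistent IP candidates, we introduce IP "stability" derived
from a stochastic model of inherent fluctuations to detect more
stable IPs. Experimental results show that the proposed detector
outperforms the SIFT detector in terms of repeatability and
matching rates.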
Gurbachan Sekhon, Ken Fukuchi, Akisato Kimura
"Automatic and precise extraction of generic objects using
saliency-based priors and contour constraints,"
Proc. Meeting on Image Recognition and Understanding (MIRU2010,
domestic),
IS3-3, Kushiro, Hokkaido, Japan, July 2010.
[ PDF ]
[ Poster ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
This paper deals with automatic video segmentation without
supervision or interactions. We examine a method for automatic
noise reduction in segmented video frames utilizing contour
information, which we have dubbed the Contour-Classification
method. This method uses information about the contours of the
segmented image mask in order to accurately reduce noise in
segmented video frames. We also examine a second technique that we
have developed, called the Erosion-Dilation method. Our proposed
method is then composed of these two fundamental techniques:
Contour-Classification and Erosion-Dilation. Test results
indicate our proposed method precisely removes noise regions
from videos with low error rate when compared with both the
original unaltered segmentation result and the Erosion-Dilation
method.
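The abstract above names an Erosion-Dilation method without defining it. A minimal sketch follows, assuming it corresponds to ordinary morphological opening (erosion followed by dilation) on a binary segmentation mask with a 3x3 structuring element. The function names and the toy mask are hypothetical and are not taken from the paper.

```python
def _neighbourhood(mask, y, x):
    """Yield the 3x3 neighbourhood of (y, x); pixels outside the mask count as 0."""
    height, width = len(mask), len(mask[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy, xx = y + dy, x + dx
            yield mask[yy][xx] if 0 <= yy < height and 0 <= xx < width else 0


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[int(all(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[int(any(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def open_mask(mask):
    """Morphological opening: erosion followed by dilation (assumed technique)."""
    return dilate(erode(mask))


# Toy mask: a solid 3x3 object plus one isolated noise pixel at row 4, column 6.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
opened = open_mask(mask)
```

In this toy case the isolated pixel is removed while the 3x3 object is preserved. Opening cannot tell a small true region from noise, which may be why the abstract pairs it with contour information.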
Shigeaki Kuzuoka, Akisato Kimura, Tomohiko Uyematsu
"Universal source coding for multiple decoders with side
information,"
Proc. International Symposium on Information Theory (ISIT2010),
pp.1-5, Austin, Texas, USA, June 2010.
- Abstract
-
A multiterminal lossy source coding problem, which includes
various problems such as the Wyner-Ziv problem and the
complementary delivery problem as special cases, is considered.
It is shown that any point in the achievable rate-distortion
region can be attained even if the source statistics are not
known.
Akisato Kimura, Derek Pang, Tatsuto
Takeuchi, Kouji Miyazato, Junji Yamato and Kunio Kashino
"A stochastic model of human visual attention with a dynamic
Bayesian network,"
submitted to IEEE Transactions on Pattern Analysis and Machine
Intelligence.
[ pdf (arXiv.org) ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
Recent studies in the field of human vision science suggest
that the human responses to the stimuli on a visual display are
non-deterministic. People may attend to different locations on
the same visual input at the same time. Based on this
knowledge, we propose a new stochastic model of visual
attention by introducing a dynamic Bayesian network to predict
the likelihood of where humans typically focus on a video
scene. The proposed model is composed of a dynamic Bayesian
network with 4 layers. Our model provides a framework that
simulates and combines the visual saliency response and the
cognitive state of a person to estimate the most probable
attended regions. Sample-based inference with Markov chain
Monte-Carlo based particle filter and stream processing with
multi-core processors enable us to estimate human visual
attention in near real time. Experimental results have
demonstrated that our model performs significantly better in
predicting human visual attention compared to the previous
deterministic models.
Shigeaki Kuzuoka, Akisato Kimura, Tomohiko Uyematsu
"Universal source coding for multiple decoders with side information,"
Shannon Theory Workshop (STW2009, domestic),
pp.35-40, Matsuyama, Ehime, Japan, September 2009.
- Abstract
-
A multiterminal lossy source coding problem, which includes various
problems such as the Wyner-Ziv problem and the complementary delivery
problem as special cases, is considered. It is clarified that any point
in the achievable rate-distortion region can be attained even if the
source statistics are not known.
Ken Fukuchi, Kouji Miyazato, Akisato Kimura, Shigeru Takagi and Junji
Yamato
"Saliency-based video segmentation with graph cuts and sequentially updated
priors,"
Proc. International Conference on Multimedia and Expo (ICME2009),
New York, New York, USA, June-July 2009.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
This paper proposes a new method for achieving precise video
segmentation without any supervision or interaction. The main
contributions of this report include 1) the introduction of fully
automatic segmentation based on the maximum a posteriori (MAP)
estimation of the Markov random field (MRF) with graph cuts and
saliency-driven priors and 2) the updating of priors and feature
likelihoods by integrating the previous segmentation results and the
currently estimated saliency-based visual attention. Test results
indicate that our new method precisely extracts probable regions from
videos without any supervised interactions.
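For readers unfamiliar with MAP estimation on an MRF, the generic binary-labeling energy that graph cuts minimize can be written as below. This is the standard textbook formulation, not the paper's exact energy. The symbols are assumed notation: x_p is the foreground/background label of pixel p, z_p its observed feature, N the set of neighboring pixel pairs, w_pq a smoothness weight, and P(x_p) a per-pixel prior, which the abstract says is derived from saliency.

```latex
% MAP labeling = minimizing the negative log-posterior
E(\mathbf{x}) = \sum_{p} D_p(x_p)
              + \lambda \sum_{(p,q) \in \mathcal{N}} V_{pq}(x_p, x_q)

% Data term: feature likelihood times prior
D_p(x_p) = -\log P(z_p \mid x_p) - \log P(x_p)

% Smoothness term: Potts-type penalty for label changes between neighbors
V_{pq}(x_p, x_q) = w_{pq} \, [\, x_p \neq x_q \,], \qquad w_{pq} \ge 0
```

For binary labels with this kind of pairwise term, a minimum cut on a suitably built graph gives a global minimizer of E, which is what makes the estimation step tractable.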
Kouji Miyazato, Akisato Kimura, Shigeru Takagi and Junji Yamato
"Real-time estimation of human visual attention with dynamic Bayesian
network and MCMC-based particle filter",
Proc. International Conference on Multimedia and Expo (ICME2009),
New York, New York, USA, June-July 2009.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Recent studies in signal detection theory suggest that the human
responses to the stimuli on a visual display are non-deterministic.
People may attend to different locations on the same visual input at the
same time. Constructing a stochastic model of human visual attention
would be promising to tackle the above problem. This paper proposes a
new method to achieve a quick and precise estimation of human visual
attention based on our previous stochastic model with a dynamic Bayesian
network. A particle filter with Markov chain Monte-Carlo (MCMC) sampling
makes it possible to achieve a quick and precise estimation through
stream processing. Experimental results indicate that the proposed
method can estimate human visual attention in real time and more
precisely than previous methods.
Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Junji Yamato and Kunio
Kashino
"Dynamic Markov random fields for stochastic modeling of visual
attention",
IEICE Technical Report (domestic),
PRMU2008-117 (MVE2008-66), Toyonaka, Osaka, Japan, November 2008.
[ presentation material ]
[ copyright notice ]
- Abstract
-
This report proposes a new stochastic model of visual attention to
predict the likelihood of where humans typically focus on a video scene.
The proposed model is composed of a dynamic Bayesian network that
simulates and combines a person's visual saliency response and eye
movement patterns to estimate the most probable regions of attention.
Dynamic Markov random field (MRF) models are newly introduced to include
spatiotemporal relationships of visual saliency responses. Experimental
results have revealed that the proposed model outperforms the previous
deterministic model and the stochastic model without dynamic MRF in
predicting human visual attention.
Akisato Kimura,
"Particle-based simulation of the Gel'fand-Pinsker channel capacity and
the Wyner-Ziv rate-distortion function,"
Proc. Symposium on Information Theory and its Applications (SITA2008,
domestic),
pp.2-4-4, Kinugawa, Tochigi, Japan, October 2008.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material.]
- Abstract
-
This report presents a new numerical algorithm for simulating the
capacity of a memoryless channel with non-causal encoder side information
(the Gel'fand-Pinsker channel) and the rate-distortion function for a
memoryless source with decoder side information (the Wyner-Ziv coding).
The basic idea is to represent a probability density by a finite number
of particles, each of which is composed of a sample value and the
associated weight. The proposed algorithm enables us to simulate the
channel capacity and the rate distortion function with infinite or
continuous alphabets.
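The particle representation described above, in which a density is approximated by sample values with associated weights, can be illustrated with a small sketch. This is plain self-normalized importance sampling used only to show the data structure. It is not the algorithm of the report, and the function names, the uniform proposal and the Gaussian target are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Gaussian, used here as the target density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def make_particles(target_pdf, proposal_sample, proposal_pdf, n, rng):
    """Return n (value, weight) pairs whose weighted sum approximates target_pdf."""
    values = [proposal_sample(rng) for _ in range(n)]
    raw = [target_pdf(x) / proposal_pdf(x) for x in values]
    total = sum(raw)
    return [(x, w / total) for x, w in zip(values, raw)]  # weights sum to 1


def expectation(particles, f):
    """Approximate E[f(X)] as the weighted sum over particles."""
    return sum(w * f(x) for x, w in particles)


rng = random.Random(0)  # fixed seed so the run is reproducible
particles = make_particles(
    target_pdf=normal_pdf,
    proposal_sample=lambda r: r.uniform(-6.0, 6.0),  # hypothetical proposal
    proposal_pdf=lambda x: 1.0 / 12.0,
    n=20000,
    rng=rng,
)
mean = expectation(particles, lambda x: x)
second_moment = expectation(particles, lambda x: x * x)
```

Because every integral over the density becomes a finite weighted sum, quantities such as mutual information can in principle be evaluated for continuous alphabets without discretizing them on a grid.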
Shigeaki Kuzuoka, Akisato Kimura and Tomohiko Uyematsu,
"Simple coding schemes for lossless and lossy complementary delivery
problems,"
Proc. Shannon Theory Workshop (STW2007, domestic),
pp.43-50, Izu, Shizuoka, Japan, September 2007.
- Abstract
-
This paper deals with a coding problem called complementary delivery,
where messages from two correlated sources are jointly encoded, and each
decoder reproduces one of two messages using the other message as the
side information. Simple lossless and lossy complementary delivery coding
schemes are proposed. In the lossless case, it is revealed that the error
probability of the proposed code based on Slepian-Wolf codes is
exponentially tight. Moreover, in the lossy case, it is demonstrated that
Wyner-Ziv codes can be applied to the complementary delivery problem.
Kunio Kashino, Akisato Kimura, Takayuki Kurozumi and Hidehisa Nagano
"Robust search methods for music signals based on simple
representation,"
Proc. International Conference on Acoustics, Speech and Signal Processing
(ICASSP2007),
Vol.4, pp.1421--1424, Hawaii, USA, April 2007.
[ DOI link ]
[ copyright notice ]
- Abstract
-
Signal similarity search is an important technique for music information
retrieval. A basic task is finding identical signal segments on unlabeled
music-signal archives, given a short music signal fragment as a query. In
such a task, the search must be fast and sufficiently robust against
possible signal fluctuations due to noise and distortions. In this
special session paper, we describe a search method designed to cope with
additive interfering sounds by spectral partitioning. Then, we introduce
another method designed to be robust under multiplicative noise or
distortion based on binary area representation.
Takahito Kawanishi, Masaru Tsuchida, Shigeru Takagi, Akisato Kimura
and Junji Yamato
"Small cylindrical display using aspherical mirror for anthropomorphic
agents",
Proc. International Display Workshop / Asia Display (IDW/AD'05),
pp.1755-1758, Takamatsu, Kagawa, Japan, December 2005.
- Abstract
-
We have developed a small cylindrical display for an anthropomorphic
agent that communicates with multiple users in a 3D environment. The
previously developed cylindrical display was dark with bad contrast at
the lower part of the screen because the density of pixels at the lower
part is much lower than at the upper part. We made the pixel density
uniform by using an aspherical mirror. Experimental results show that our
new display has better luminance and better contrast than the previous one.
Kunio Kashino, Akisato Kimura and Takayuki Kurozumi
"A quick video search method based on local and global
feature pruning",
Proc. International Conference on Pattern Recognition (ICPR2004),
Vol.3, pp.894-897, August 2004.
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper proposes a quick method of similarity-based video searching to
detect and locate a specific video clip given as a query in a stored long
video stream. The method employs a two-stage process: local and global
feature clustering. The local clustering exploits continuity or local
similarities between video features, and the global clustering gathers
similar video frames that are not necessarily adjacent to each other.
These processes prune irrelevant sections on a video stream. The method
guarantees exactly the same search result as the exhaustive search.
Experiments performed on a PC show that the proposed method can correctly
detect and locate a 7.5-second clip in a 150-hour video recording in 15
ms on average.
Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Junji Yamato and Kunio
Kashino
"Dynamic Markov random fields for stochastic modeling of visual
attention",
Proc. International Conference on Pattern Recognition (ICPR2008),
Mo.BT8.35, Tampa, Florida, USA, December 2008.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
This report proposes a new stochastic model of visual attention to
predict the likelihood of where humans typically focus on a video scene.
The proposed model is composed of a dynamic Bayesian network that
simulates and combines a person's visual saliency response and eye
movement patterns to estimate the most probable regions of attention.
Dynamic Markov random field (MRF) models are newly introduced to include
spatiotemporal relationships of visual saliency responses. Experimental
results have revealed that the proposed model outperforms the previous
deterministic model and the stochastic model without dynamic MRF in
predicting human visual attention.
Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and
Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian
network",
Proc. Meeting on Image Recognition and Understanding (MIRU2008,
domestic),
pp. 1500--1505, Karuizawa, Nagano, Japan, July 2008.
(Selected for the Best Interactive Session Award)
[ pdf ]
[ digest ]
[ poster: Japanese, English ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
(The content is almost the same as the one presented in IEICE Technical
Meeting held in June 2008. Please see
here.)
Shigeaki Kuzuoka, Akisato Kimura and Tomohiko Uyematsu
"Universal coding for lossy complementary delivery problems",
Proc. International Symposium on Information Theory (ISIT2008),
pp. 2177--2188, Toronto, Canada, July 2008.
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper deals with a universal lossy coding problem for a certain kind
of multiterminal source coding network called a complementary delivery
system. A universal coding scheme based on Wyner-Ziv codes is proposed.
While the proposed scheme cannot attain the optimal rate-distortion
trade-off in general, the rate-loss is upper bounded by a universal
constant under some mild conditions. Moreover, the proposed scheme allows
us to apply (non-universal) Wyner-Ziv codes to construct a universal
lossy complementary delivery code.
Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and
Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian
network",
Proc. International Conference on Multimedia and Expo (ICME2008),
pp.1073--1076, Hannover, Germany, June 2008.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Recent studies in signal detection theory suggest that the human
responses to the stimuli on a visual display are non-deterministic.
People may attend to different locations on the same visual input at the
same time. To predict the likelihood of where humans typically focus on a
video scene, we propose a new stochastic model of visual attention by
introducing a dynamic Bayesian network. Our model simulates and combines
the visual saliency response and the cognitive state of a person to
estimate the most probable attended regions. Experimental results have
demonstrated that our model performs significantly better in predicting
human visual attention compared to the previous deterministic model.
Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and
Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian
network",
IEICE Technical Report (domestic),
PRMU2008-43 (DE2008-25), Otaru, Hokkaido, Japan, June 2008.
[ PDF ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Recent studies in signal detection theory suggest that the human
responses to the stimuli on a visual display are non-deterministic.
People may attend to different locations on the same visual input at the
same time. To predict the likelihood of where humans typically focus on a
video scene, we propose a new stochastic model of visual attention by
introducing a dynamic Bayesian network. Our model simulates and combines
the visual saliency response and the cognitive state of a person to
estimate the most probable attended regions. Experimental results have
demonstrated that our model performs significantly better in predicting
human visual attention compared to the previous deterministic model.
Akisato Kimura, Tomohiko Uyematsu, Shigeaki Kuzuoka and Shun Watanabe,
"Universal source coding over generalized complementary delivery
networks,"
IEEE Transactions on Information Theory,
Vol.55, No.3, pp.1360-1373, March 2009.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding network called a generalized complementary
delivery network. In this network, messages from multiple correlated
sources are jointly encoded, and each decoder has access to some of the
messages to enable the decoder to reproduce the other messages. Both
fixed-to-fixed length and fixed-to-variable length lossless coding
schemes are considered. Explicit constructions of universal codes and the
bounds of the error probabilities are clarified via methods of types and
graph-theoretical analysis.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources over generalized complementary
delivery networks,"
Proc. Symposium on Information Theory and its Applications
(SITA2007, domestic),
pp.274-279, Shima, Mie, November 2007.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding network called a generalized complementary
delivery network. In this network, messages from multiple correlated
sources are jointly encoded, and each decoder has access to some of the
messages to enable the decoder to reproduce the other messages. Both
fixed-to-fixed length and fixed-to-variable length lossless coding
schemes are considered. Explicit constructions of universal codes and the
bounds of the error probabilities are clarified via methods of types and
graph-theoretical analysis.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for audio signals based on a piecewise linear
representation of feature trajectories,"
IEEE Transactions on Audio, Speech and Language Processing,
Vol.16, No.2, pp.396-407, February 2008.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper presents a new method for a quick similarity-based search
through long unlabeled audio streams to detect and locate audio clips
provided by users. The method involves feature dimension reduction based
on a piecewise linear representation of a sequential feature trajectory
extracted from a long audio stream. Two techniques enable us to obtain a
piecewise linear representation: the dynamic segmentation of feature
trajectories and the segment-based KL transform. A new technique is also
introduced that greatly reduces the required feature comparisons. The
proposed search method guarantees in principle that no segment to be
detected is missed. Experiments indicate significant improvements in
search speed. For example, the proposed method reduced the total search
time to approximately 1/12 and detected queries in approximately 0.3
seconds from a 200-hour audio database.
Akisato Kimura,
"Coding theorems for correlated sources with cooperative encoders,"
Ph.D. dissertation, Tokyo Institute of Technology, September 2007.
[ pdf ]
[ presentation material ]
[ Copyright notice: The author holds the copyright of the material. ]
- Abstract
-
This thesis deals with multiterminal source coding problems for a general
framework of coding systems, called coding systems with cooperation,
where there are some linkages among encoders and decoders. Especially,
the main focus of this thesis is encoder cooperation. Two types of coding
systems are investigated that incorporate encoder cooperation: the
Slepian-Wolf coding system with linkages (called the SWL system) and the
complementary delivery coding system.
The SWL system involves some mutual linkages between two encoders of the
coding system investigated by Slepian and Wolf (called the SW system)
that involves two separate encoders and one common decoder. Especially,
some special cases are considered, where the coding rate for the mutual
linkage between two encoders is negligibly small. The main results in
this thesis show that the achievable rate region of the SWL system
equals that of the SW system when considering fixed-length coding, while
weak variable-length coding makes the achievable rate region of the SWL
system larger than that of the SW system. This implies that encoder
cooperation may improve the coding rate.
The complementary delivery coding system contrasts with the SW system in
the sense of cooperation, which means that the complementary delivery
coding system consists of a common encoder and separate decoders, while
the SW system includes separate encoders and a common decoder.
Especially, in the complementary delivery coding system, each decoder has
access to some of the encoded messages to enable the decoder to reproduce the
other messages from a common codeword emitted from the common encoder.
First, the minimum achievable rate for lossy coding is clarified, which
implies that encoder cooperation may increase the coding rate. Next,
universal coding schemes for lossless coding are proposed. Explicit
constructions of universal lossless codes and the bounds of the error
probabilities are clarified by using methods of types and the
graph-theoretical analysis.
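The SW system's achievable rate region mentioned above is the classical Slepian-Wolf region: R1 >= H(X1|X2), R2 >= H(X2|X1) and R1 + R2 >= H(X1,X2). The sketch below evaluates these three bounds for a toy joint distribution. The helper names and the example source (a binary pair that disagrees with probability 0.1) are hypothetical, and the code illustrates only the SW region, not the thesis's results for the SWL or complementary delivery systems.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def sw_bounds(joint):
    """Return (H(X1|X2), H(X2|X1), H(X1,X2)) for joint[i][j] = P(X1=i, X2=j)."""
    h12 = entropy([p for row in joint for p in row])
    h1 = entropy([sum(row) for row in joint])        # marginal of X1
    h2 = entropy([sum(col) for col in zip(*joint)])  # marginal of X2
    return h12 - h2, h12 - h1, h12


def in_sw_region(r1, r2, joint):
    """Check whether the rate pair (r1, r2) satisfies all three SW bounds."""
    h1_given_2, h2_given_1, h12 = sw_bounds(joint)
    return r1 >= h1_given_2 and r2 >= h2_given_1 and r1 + r2 >= h12


# Toy source: X1 is a fair bit, X2 differs from X1 with probability 0.1.
p = 0.1
joint = [[(1 - p) / 2, p / 2],
         [p / 2, (1 - p) / 2]]
h1_given_2, h2_given_1, h12 = sw_bounds(joint)
```

For this source both conditional entropies equal the binary entropy of 0.1, roughly 0.469 bits, so a pair such as (0.5, 1.0) lies inside the region while (0.7, 0.7) violates the sum-rate bound.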
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with complementary delivery,"
IEICE Transactions on Fundamentals,
Vol.E90-A, No.9, pp.1840-1847, September 2007.
Published online in
IEICE Transaction Online.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding system that we call the complementary
delivery coding system. In this system, messages from two correlated
sources are jointly encoded, and each decoder has access to one of the
two messages to enable it to reproduce the other message. Both
fixed-to-fixed length and fixed-to-variable length lossless coding
schemes are considered. Explicit constructions of universal codes and
bounds of the error probabilities are clarified via type-theoretical and
graph-theoretical analyses.
Clement Leung, Akisato Kimura, Tatsuto Takeuchi and Kunio Kashino
"A computational model of saliency depletion/recovery phenomena for the
salient region extraction of videos",
Proc. Meeting on Image Recognition and Understanding
(MIRU2007, domestic),
pp.582--587, Hiroshima, Japan, July 2007.
[ pdf ]
[ poster ]
[ copyright notice ]
- Abstract
-
This report proposes a new algorithm for extracting salient regions of
videos by introducing two important properties of the early human visual
system: 1) Instantaneous saliency depletion with gradual recovery,
whereby saliency is instantaneously suppressed and gradually recovered in
previously attended regions. 2) Gradual saliency depletion with
instantaneous recovery, whereby saliency is gradually decreased over time
in non-surprising regions and at the same time recovered in surprising
locations. With the introduction of these properties, redundant
information in videos can be suppressed and important information is
eventually enhanced. The proposed algorithm has been evaluated with an
eye tracking device to see how well it fits the human visual system. The
results show that the proposed algorithm substantially outperformed
previous algorithms when only gradual depletion was incorporated, and
instantaneous depletion improved the performance in some cases.
Clement Leung, Akisato Kimura, Tatsuto Takeuchi and Kunio Kashino
"A computational model of saliency depletion/recovery phenomena for the
salient region extraction of videos",
Proc. International Conference on Multimedia and Expo (ICME2007),
pp.300--303, Beijing, China, July 2007.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
This paper proposes a new algorithm for extracting salient regions of
videos by introducing two important properties of the early human visual
system: 1) Instantaneous saliency depletion with gradual recovery,
whereby saliency is instantaneously suppressed and gradually recovered in
previously attended regions. 2) Gradual saliency depletion with
instantaneous recovery, whereby saliency is gradually decreased over time
in non-surprising regions and at the same time recovered in surprising
locations. With the introduction of these properties, redundant
information in videos can be suppressed and important information is
eventually enhanced.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with generalized complementary
delivery,"
presented at a recent result session,
International Symposium on Information Theory (ISIT2007),
Nice, France, June 2007.
[ poster ]
[ copyright notice ]
- Abstract
-
This presentation deals with a universal coding problem for a certain
kind of multiterminal source coding system called the generalized
complementary delivery coding system. In this system, messages from
multiple correlated sources are jointly encoded, and each decoder has
access to some of the messages to enable them to reproduce the other
messages. Both fixed-to-fixed length and fixed-to-variable length
lossless coding schemes are considered. Explicit constructions of
universal codes and the bounds of the error probabilities are clarified
via methods of types and graph-theoretical analyses.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with complementary delivery,"
Proc. International Symposium on Information Theory (ISIT2007),
pp.1756--1760, Nice, France, June 2007.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
This report deals with a universal coding problem for a certain kind of
multiterminal source coding system that we call the complementary
delivery coding system. Both fixed-to-fixed length and fixed-to-variable
length lossless coding schemes are considered. Explicit constructions of
universal codes and the bounds of the error probabilities are clarified
via type-theoretical and graph-theoretical analyses.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal source coding for complementary delivery,"
Proc. Symposium on Information Theory and its Applications
(SITA2006, domestic),
pp.803--806, Hakodate, Japan, November-December 2006.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding system that we call the complementary delivery
coding system. Both fixed-to-fixed length and fixed-to-variable length
lossless coding schemes are considered. Explicit constructions of
universal codes and bounds of the error probabilities are clarified via
type-theoretical and graph-theoretical analyses.
Akisato Kimura and Tomohiko Uyematsu,
"Information-theoretical analysis of index searching: Revised,"
Proc. Symposium on Information Theory and its Applications
(SITA2006, domestic),
pp.73--76, Hakodate, Japan, November-December 2006.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
We present an information-theoretical viewpoint for similarity-based
retrieval along with index structures. This retrieval system comprises
two stages: pruning data items based on the index structures, and
matching surviving data items. The first stage is modeled as the so-called
Wyner-Ziv problem, while the second stage is considered as a coding
problem such that parts of the decoding results are available as partial
side information at both the encoder and decoder. We clarify upper
and lower bounds of the optimal retrieval performance and some
relationships between retrieval parameters and performance via
Shannon-theoretic analyses.
Akisato Kimura and Tomohiko Uyematsu,
"Multiterminal source coding with complementary delivery,"
Proc. International Symposium on Information Theory and its Applications
(ISITA2006),
pp.189-194, Seoul, South Korea, October 2006.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
A coding problem where messages from two correlated sources are jointly
encoded and separately decoded is investigated. Each decoder has access
to one of the two messages to enable it to reproduce the other message.
The rate-distortion function for lossy coding is clarified. Some related
coding problems are also examined.
Akisato Kimura, Tomohiko Uyematsu
"Multiterminal source coding for cascading and feedback refinement
systems,"
Proc. Shannon Theory Workshop (STW2006, domestic),
pp.25-31, September 2006
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
- Lossy coding problems are investigated for some communication systems
in the presence of cascading and/or feedback information channels from
decoders so as to refine reproduction messages. This framework provides
different types of refinement structures from so-called successive
refinement. Three different types of communication systems are
considered, i.e. refinement systems in the presence of a cascading
channel, a feedback channel, and both channels. Outer and inner bounds
of achievable rate-distortion regions for those problems are obtained.
Akisato Kimura and Tomohiko Uyematsu,
"Multiterminal source coding with complementary delivering,"
IEICE Technical Report,
IT2006-8, pp.7-12, May 2006,
Presented at 2006 Hawaii, IEICE and SITA Joint Conference on Information
Theory.
[ presentation material ]
[ copyright notice ]
- Abstract
-
We consider a coding problem where messages from two correlated sources
are jointly encoded and separately decoded. Each decoder has access to
one of two messages to reproduce the other message. We clarify the
rate-distortion function for lossy coding.
Akisato Kimura, Takahito Kawanishi and Kunio Kashino,
"Acceleration of similarity-based partial image retrieval using multistage
vector quantization,"
Proc. International Conference on Pattern Recognition (ICPR2004),
Vol.2, pp.993-996, Cambridge, United Kingdom, August 2004.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
- We propose a new method for quick and accurate partial image retrieval
from a huge number of images based on a predefined distance measure.
The proposed method utilizes vector quantization (VQ) on multiple
layers, namely color, block, and feature layers. This can greatly
reduce the amount of calculation needed for partial image retrieval.
Experiments indicate that the proposed method can find partial images
that are similar to queries among 1000 images within 4 seconds. This
is approximately 30 times faster than the same method without
multistage VQ.
Akisato Kimura, Takahito Kawanishi and Kunio Kashino,
"Similarity-based partial image retrieval guaranteeing same accuracy as
exhaustive matching,"
Proc. International Conference on Multimedia and Expo (ICME2004),
Vol. 3, pp.1895-1898, Taipei, Taiwan, June 2004.
[ pdf ]
[ poster ]
[ copyright notice ]
- Abstract
- We propose a new framework for quick and accurate partial image
retrieval from a huge number of images based on a predefined distance
measure. Finding partial similarities generally requires a huge amount
of storage space for indexes due to the large number of portions of
images. The proposed method extracts portions from each database image
at a constant spacing, while it extracts all possible portions from a
query image. In this way, the proposed method can greatly reduce the
size of indexes while theoretically guaranteeing the same accuracy as
exhaustive matching.
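The asymmetric extraction described above can be sketched in one dimension. This is a simplified illustration, not the paper's method: it assumes exact matching on strings instead of a distance measure on image portions, and the names `build_index` and `search` are hypothetical. Database windows are taken only at a constant spacing, while every window of the query is tried. Any shared portion of length at least `w + stride - 1` must contain one spacing-aligned database window, so it is still found.

```python
def build_index(db, w, stride):
    """Index database windows of width w taken only every `stride` positions."""
    index = {}
    for pos in range(0, len(db) - w + 1, stride):
        index.setdefault(db[pos:pos + w], []).append(pos)
    return index


def search(index, query, w):
    """Try every window of the query against the sparse database index."""
    hits = []
    for q in range(len(query) - w + 1):
        for pos in index.get(query[q:q + w], []):
            hits.append((q, pos))  # (offset in query, offset in database)
    return hits


db = "abcdefghijklmnopqrstuvwxyz"
index = build_index(db, w=4, stride=3)   # 8 windows instead of 23
hits = search(index, "xxhijklmnxx", w=4)  # shared portion "hijklmn"
```

Here the shared portion has length 7, which exceeds `w + stride - 1 = 6`, so the aligned database window "jklm" is matched even though only about a third of the database windows are indexed.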
Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed
sources,"
IEEE Transactions on Information Theory,
Vol.50, No.1, pp.183-193, Jan. 2004.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
- Coding problems for correlated information sources were first
investigated by Slepian and Wolf. They considered the data compression
system, called the SW system, where two sequences emitted from
correlated sources are separately encoded to codewords, and sent to a
single decoder which has to output the original sequence pairs with a
small probability of error. In this correspondence, we investigate the
coding problem of a modified SW system allowing two encoders to
communicate with zero rate. First, we consider the fixed-length coding
and clarify that the admissible rate region for general sources is
equal to that of the original SW system. Next, we investigate the
variable-length coding having the asymptotically vanishing probability
of error. We clarify the admissible rate region for mixed sources
characterized by two ergodic sources and show that this region is
strictly wider than that for fixed-length codes. Further, we
investigate the universal coding problem for memoryless sources in the
system and show that the SW system with linked encoders has much more
flexibility than the original SW system.
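For reference, the admissible rate region of the original SW system is the classical Slepian-Wolf region. The form below assumes a memoryless source pair (X, Y) with encoder rates R_1 and R_2; the general-source and mixed-source regions treated in the paper are not reproduced here.

```latex
\[
\begin{aligned}
R_1 &\ge H(X \mid Y), \\
R_2 &\ge H(Y \mid X), \\
R_1 + R_2 &\ge H(X, Y).
\end{aligned}
\]
```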
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for multimedia signals using global pruning,"
Systems and Computers in Japan,
Vol.34, No.13, pp.47-58, November 2003.
[ DOI link ]
- Abstract
- The authors propose a new method for quickly searching for a specific
audio or video signal to be detected within a long, stored audio or
video stream to determine segments that contain signals that are nearly
identical to the given signal. The Time-series Active Search (TAS)
method is one of the quick search methods that have been proposed
previously. This signal searching technique based on histograms
extracted from the signals had implemented quick searching by local
pruning, that is, omitting comparisons of segments for which searching
was unnecessary based on similarities in the vicinity of the matching
window. In contrast, the proposed technique implements significantly
quicker searching by introducing global pruning, which looks at the
entire signal time-series according to histogram classifications based
on similarities of the entire signal to eliminate segments that need
not be searched, in addition to local pruning. In this paper, the
authors present a detailed discussion of the relationship between the
degree of global pruning and the accuracy that is guaranteed. For
example, the authors showed through experiments that when 128-dimensional
histograms were classified to 1024 clusters, the proposed technique
achieved a search speed approximately 9 times that of TAS while
preserving the same degree of accuracy. The preprocessing calculation
time increased by approximately 1% of the time for playing the signal.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"Dynamic-segmentation-based feature dimension reduction
for quick audio/video searching,"
Proc. International Conference on Multimedia and Expo (ICME2003),
Vol.2, pp.389-392, Baltimore, Maryland, USA, July 2003.
Proc. International Conference on Acoustics,
Speech and Signal Processing (ICASSP2003),
Vol.3, pp.357-360, Hong Kong, Apr. 2003 (cancelled).
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
- We propose a new feature dimension reduction method for multimedia
search. The main technique in the method is dynamic segmentation that
partitions sequential feature trajectories dynamically. While dynamic
segmentation reduces the average dimensionality and accelerates the
search, it requires a huge amount of calculation. Thus, our method
quickly executes suboptimal partitioning of the trajectories by using
the discreteness of dimension changes. This guarantees the optimal
amount of calculation to derive the suboptimal partitioning under the
condition that the dimension monotonously increases as the segment
length increases. The experiment shows that our method is over 10 times
faster than a straightforward dynamic segmentation method.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for multimedia signals using feature compression
based on piecewise linear maps,"
Proc. International Conference on Acoustics, Speech and Signal Processing
(ICASSP2002),
Vol.4, pp.3656-3659, Orlando, Florida, USA, May 2002.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
We propose a quick algorithm for multimedia signal search. The
algorithm comprises two techniques: feature compression based on
piecewise linear maps and distance bounding to efficiently limit the
search space. When compared with existing multimedia search techniques,
they greatly reduce the computational cost required in searching.
Although feature compression is employed in our method, our bounding
technique mathematically guarantees the same recall rate as the search
based on the original features; no segment to be detected is missed.
Experiments indicate that the proposed algorithm is approximately 10
times faster than an existing fast method while maintaining the same
search accuracy.
Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed
source,"
Proc. IEEE Information Theory Workshop 2001 (ITW2001),
pp.82--84, Cairns, Australia, Sep. 2001
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
Slepian and Wolf first considered the data compression of correlated
sources called the SW system, where two sequences emitted from
correlated sources are separately encoded to codewords, and sent to a
single decoder which has to output original sequence pairs. Recently,
Oohama has extended the SW system and investigated a more general case
where there are some mutual linkages between two encoders of the SW
system. In this paper, we investigate variable-length coding which
allows asymptotically vanishing probability of error for the system
considered by Oohama. We clarify the admissible rate region for mixed
sources, and show that this region is strictly wider than that for
fixed-length codes.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"Very quick audio searching : Introducing global pruning to the
Time-Series Active Search,"
Proc. International Conference on Acoustics, Speech and Signal Processing
(ICASSP2001),
Vol.3, pp.1429-1432, Salt Lake City, Utah, USA, May 2001.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Previously, we proposed a histogram-based quick signal search method
called Time-Series Active Search (TAS). TAS is a method of searching
through long audio or video recordings for a specified segment, based
on signal similarity. TAS is fast; it can search through a 24-hour
recording in 1 second after a query-independent preprocessing. However,
an even faster method is required when we consider a huge amount of audio
archives, for example a month's worth of recordings. Thus, we propose a
preprocessing method that significantly accelerates TAS. The core part
of this method comprises a global histogram clustering of long signals
and a pruning scheme using those clusters. Tests using broadcast
recordings indicate that the proposed algorithm achieves a search
speed approximately 3 to 30 times faster than TAS. In these tests,
the search results are exactly the same as with TAS.
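The histogram-based search with skipping that underlies this entry can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: features are assumed to be already quantized into discrete symbols, similarity is the histogram intersection count, and the global pruning proposed in the paper is not included. The names `active_search` and `exhaustive_search` are hypothetical. The skip relies on the fact that shifting the window by one sample changes the intersection count by at most 1.

```python
from collections import Counter


def intersection(query_hist, window_hist):
    """Histogram intersection: number of symbols the two histograms share."""
    return sum(min(n, window_hist[sym]) for sym, n in query_hist.items())


def active_search(stream, query, threshold):
    """Find window starts whose intersection with the query is >= threshold.

    A window scoring c < threshold cannot be followed by a match within the
    next (threshold - c) - 1 positions, so the search jumps ahead.
    """
    d = len(query)
    q_hist = Counter(query)
    matches, evaluated, t = [], 0, 0
    while t + d <= len(stream):
        c = intersection(q_hist, Counter(stream[t:t + d]))
        evaluated += 1
        if c >= threshold:
            matches.append(t)
            t += 1
        else:
            t += threshold - c
    return matches, evaluated


def exhaustive_search(stream, query, threshold):
    """Baseline that evaluates every window position."""
    d = len(query)
    q_hist = Counter(query)
    return [t for t in range(len(stream) - d + 1)
            if intersection(q_hist, Counter(stream[t:t + d])) >= threshold]


stream = list("aaaaaaaaaaxyzxyzaaaaaaaaaa")
query = list("xyzxyz")
matches, evaluated = active_search(stream, query, threshold=6)
baseline = exhaustive_search(stream, query, threshold=6)
```

In this toy run the skipping search evaluates fewer windows than the exhaustive baseline while returning the same match positions, which mirrors the accuracy-preserving property stated in the abstract.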
Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed
sources,"
IEICE Technical Report,
IT99-59, pp.7-12, Jan. 2000.
[ copyright notice ]
- Abstract
-
Coding problems for correlated information sources were first
investigated by Slepian and Wolf, where sequences from two correlated
sources are separately encoded, sent to a single decoder and decoded
with sufficiently small probability of error. We investigate the coding
theorem for two correlated sources, where there are some mutual
linkages between two encoders of the coding system proposed by Slepian
and Wolf. We consider weak variable-length coding, i.e. variable-length
code having vanishing error, and show the achievable rate region for
mixed sources characterized by two ergodic sources.
Akisato Kimura and Tomohiko Uyematsu,
"Large deviations performance of interval algorithm for random number
generation,"
Proc. Memorial workshop for the 50th anniversary of the Shannon theory,
pp.1-4, Yamanashi, Japan, Jan. 1999
[ pdf ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
We investigate large deviations performance of the interval algorithm
for random number generation, especially for intrinsic randomness.
First, we show that the length of the output fair random bits per unit
length of the input sequence approaches the entropy of the source almost
surely. Next, we consider obtaining a fixed number of fair random
bits from an input sequence of fixed length. We show that the
approximation error measured by the variational distance and divergence
vanishes exponentially as the length of the input sequence tends to
infinity, if the number of fair bits per input sample is below the
entropy of the source. Conversely, the approximation error measured by
the variational distance approaches two exponentially, if the number
of fair bits per input sample is above the entropy.
Nobukazu Takai, Akisato Kimura and Nobuo Fujii,
"CMOS FET companding current-mode integrator,"
Proc. IEEE Asia-Pacific Conference on Circuit and Systems (APCCS98),
pp.17-20, Chiangmai, Thailand, Nov. 1998
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
A new CMOS companding current-mode integrator is proposed. The
companding integrator is based on the MOS translinear principle and
utilizes the square-law characteristic of MOSFETs. SPICE simulation
results demonstrate good performance.