-
K. Matsuda, Y. Wada, and K. Sugiura, "DENEB: A Hallucination-Robust Automatic Evaluation Metric for Image Captioning", ACCV, 2024, to appear. (acceptance rate = 32%)
-
M. Goko, M. Kambara, S. Otsuki, D. Saito, and K. Sugiura, "Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations", CoRL, 2024, to appear. (acceptance rate = 38.2%)
-
K. Kaneda, S. Nagashima, R. Korekata, M. Kambara, and K. Sugiura, "Learning-To-Rank Approach for Identifying Everyday Objects Using a Physical-World Search Engine", IEEE RAL presented at IEEE/RSJ IROS, 2024.
-
T. Nishimura, K. Kuyo, M. Kambara, and K. Sugiura, "Object Segmentation from Open-Vocabulary Manipulation Instructions Based on Optimal Transport Polygon Matching with Multimodal Foundation Models", IEEE/RSJ IROS, 2024.
-
S. Otsuki, T. Iida, F. Doublet, T. Hirakawa, T. Yamashita, H. Fujiyoshi, and K. Sugiura, "Layer-Wise Relevance Propagation with Conservation Property for ResNet", ECCV, 2024. (acceptance rate = 27.9%)
-
Y. Wada, K. Kaneda, D. Saito, and K. Sugiura,
"Polos: Multimodal Metric Learning from Human Feedback for Image Captioning",
CVPR, pp. 13559-13568, 2024. (acceptance rate = 23.6%)
Poster (highlight): top 3.6% of 11,532 submissions
-
N. Hosomi, Y. Iioka, S. Hatanaka, T. Misu, K. Yamada, and K. Sugiura, "Target Position Regression from Navigation Instructions", IEEE ICRA, 2024 [poster].
-
R. Korekata, K. Kanda, S. Nagashima, Y. Imai, and K. Sugiura, "Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions", IEEE ICRA, 2024 [poster].
-
Y. Wada, K. Kaneda, and K. Sugiura,
"JaSPICE: Automatic Evaluation Metric Using
Predicate-Argument Structures for Image Captioning Models", CoNLL, 2023. (acceptance rate = 28%)
-
Y. Iioka, Y. Yoshida, Y. Wada, S. Hatanaka and K. Sugiura, "Multimodal Diffusion Segmentation Model for Object Segmentation from Manipulation Instructions", IEEE/RSJ IROS, 2023.
-
S. Otsuki, S. Ishikawa and K. Sugiura, "Prototypical Contrastive Transfer Learning for Multimodal Language Understanding", IEEE/RSJ IROS, 2023.
-
R. Korekata, M. Kambara, Y. Yoshida, S. Ishikawa, Y. Kawasaki, M. Takahashi and K. Sugiura, "Switching Head–Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks", IEEE/RSJ IROS, 2023.
-
K. Kaneda, R. Korekata, Y. Wada, S. Nagashima, M. Kambara, Y. Iioka, H. Matsuo, Y. Imai, T. Nishimura, and K. Sugiura, "DialMAT: Dialogue-Enabled Transformer with Moment-Based Adversarial Training", CVPR 2023 Embodied AI Workshop, 2023. (1st Place in DialFRED Challenge)
-
M. Kambara and K. Sugiura, "Fully Automated Task Management for Generation, Execution, and Evaluation: A Framework for Fetch-and-Carry Tasks with Natural Language Instructions in Continuous Space", CVPR 2023 Embodied AI Workshop, 2023.
-
K. Kaneda, Y. Wada, T. Iida, N. Nishizuka, Y. Kubo, and K. Sugiura,
"Flare Transformer: Solar Flare Prediction using Magnetograms and Sunspot Physical Features", ACCV, pp. 1488-1503, 2022. (acceptance rate = 33.4%)
-
T. Iida, T. Komatsu, K. Kaneda, T. Hirakawa, T. Yamashita, H. Fujiyoshi, K. Sugiura, "Visual Explanation Generation Based on Lambda Attention Branch Networks", ACCV, pp. 3536-3551, 2022. (acceptance rate = 33.4%)
-
H. Matsuo, S. Hatanaka, A. Ueda, T. Hirakawa, T. Yamashita, H. Fujiyoshi, K. Sugiura, "Collision Prediction and Visual Explanation Generation Using Structural Knowledge in Object Placement Tasks", IEEE/RSJ IROS, 2022 [poster].
-
R. Korekata, Y. Yoshida, S. Ishikawa, K. Sugiura, "Switching Funnel UNITER: Multimodal Instruction Comprehension for Object Manipulation Tasks", IEEE/RSJ IROS, 2022 [poster].
-
M. Kambara and K. Sugiura, "Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks", IEEE ICIP, 2022.
-
S. Ishikawa, K. Sugiura, "Moment-based Adversarial Training for Embodied Language Comprehension", IEEE ICPR, 2022.
-
T. Matsubara, S. Otsuki, Y. Wada, H. Matsuo, T. Komatsu, Y. Iioka, K. Sugiura and H. Saito,
"Shared Transformer Encoder with Mask-Based 3D Model Estimation for Container Mass Estimation", IEEE ICASSP, pp.9142–9146, 2022.
-
S. Matsumori, K. Shingyouchi, Y. Abe, Y. Fukuchi, K. Sugiura, M. Imai,
"Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue",
ICCV, pp. 1898-1907, 2021. (acceptance rate = 25.9%)
-
M. Kambara and K. Sugiura,
"Case Relation Transformer: A Crossmodal Language Generation Model for Fetching Instructions",
IEEE RAL presented at IEEE/RSJ IROS,
2021.
-
S. Ishikawa and K. Sugiura,
"Target-dependent UNITER: A Transformer-Based Multimodal Language Comprehension Model for Domestic Service Robots",
IEEE RAL presented at IEEE/RSJ IROS,
2021.
-
A. Magassouba, K. Sugiura, and H. Kawai,
"CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation",
IEEE RAL presented at IEEE/RSJ IROS,
2021.
-
H. Itaya, T. Hirakawa, T. Yamashita, H. Fujiyoshi and K. Sugiura,
"Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning",
IJCNN,
2021.
-
T. Ogura, A. Magassouba, K. Sugiura, T. Hirakawa, T. Yamashita, H. Fujiyoshi, H. Kawai,
"Alleviating the Burden of Labeling: Sentence Generation by Attention Branch Encoder-Decoder Network",
IEEE RAL presented at IEEE/RSJ IROS,
2020.
-
P. Shen, X. Lu, K. Sugiura, S. Li, H. Kawai,
"Compensation on x-vector for Short Utterance Spoken Language Identification",
Odyssey 2020 The Speaker and Language Recognition Workshop,
pp. 47-52,
Tokyo, Japan,
2020.
-
A. Magassouba, K. Sugiura, H. Kawai,
"A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects",
IEEE RAL presented at IEEE ICRA,
2020.
-
A. Magassouba, K. Sugiura, A. Trinh Quoc, H. Kawai,
"Understanding Natural Language Instructions for Fetching Daily Objects Using
GAN-Based Multimodal Target-Source Classification",
IEEE RAL presented at IEEE/RSJ IROS,
Macau, China,
2019.
-
A. Magassouba, K. Sugiura, H. Kawai,
"Multimodal Attention Branch Network for Perspective-Free Sentence Generation",
Conference on Robot Learning (CoRL),
Osaka, Japan,
2019. (acceptance rate = 27.6%)
-
A. Nakayama, A. Magassouba, K. Sugiura, H. Kawai:
"PonNet: Object Placeability Classifier for Domestic Service Robots,"
Third International Workshop on Symbolic-Neural Learning (SNL-2019),
Tokyo, Japan,
July 11-12, 2019 [poster].
-
A. Magassouba, K. Sugiura, H. Kawai,
"A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks from Ambiguous Language Instructions",
IEEE RAL presented at IEEE/RSJ IROS,
Madrid, Spain,
2018.
IROS 2018 RoboCup Best Paper Award
-
K. Sugiura,
"SuMo-SS: Submodular Optimization Sensor Scattering for Deploying Sensor Networks by Drones",
IEEE RAL presented at IEEE/RSJ IROS,
Madrid, Spain,
2018.
-
N. Nishizuka, K. Sugiura, Y. Kubo, M. Den, S. Watari and M. Ishii,
"Solar Flare Prediction Using Machine Learning with Multiwavelength Observations",
In Proc. IAU Symposium 335,
Exeter, UK,
vol. 13, pp. 310-313,
2018.
-
K. Sugiura and H. Kawai,
"Grounded Language Understanding for Manipulation Instructions Using GAN-Based Classification",
In Proc. IEEE ASRU,
Okinawa, Japan,
pp. 519-524, 2017.
-
K. Sugiura and K. Zettsu,
"Analysis of Long-Term and Large-Scale Experiments on Robot Dialogues Using a Cloud Robotics Platform",
In Proc. ACM/IEEE HRI,
Christchurch, New Zealand,
pp. 525-526,
2016.
-
S. Takeuchi, K. Sugiura, Y. Akahoshi, and K. Zettsu,
"Constrained Region Selection Method Based on Configuration Space for Visualization in Scientific Dataset Search,"
In Proc. IEEE Big Data, vol. 2,
pp. 2191-2200,
2015.
-
K. Sugiura and K. Zettsu,
"Rospeex: A Cloud Robotics Platform for Human-Robot Spoken Dialogues",
In Proc. IEEE/RSJ IROS,
pp. 6155-6160,
Hamburg, Germany,
Oct 1, 2015.
-
T. Nose, Y. Arao, T. Kobayashi, K. Sugiura, Y. Shiga, and A. Ito,
"Entropy-Based Sentence Selection for Speech Synthesis Using Phonetic and
Prosodic Contexts",
In Proc. Interspeech,
pp. 3491-3495,
Dresden, Germany,
Sep. 2015.
-
K. Lwin, K. Zettsu, and K. Sugiura,
"Geovisualization and Correlation Analysis between Geotagged Twitter and JMA Rainfall Data: Case of Heavy Rain Disaster in Hiroshima",
In Proc. Second IEEE International Conference on Spatial Data Mining and
Geographical Knowledge Services,
Fuzhou, China, July 2015.
-
B. T. Ong, K. Sugiura, and K. Zettsu,
"Dynamic Pre-training of Deep Recurrent Neural Networks for Predicting Environmental Monitoring Data,"
In Proc. IEEE Big Data 2014,
pp. 760-765,
Washington DC, USA,
Oct 30, 2014. (acceptance rate = 18.5%)
-
B. T. Ong, K. Sugiura, and K. Zettsu,
"Predicting PM2.5 Concentrations Using Deep Recurrent Neural Networks with Open Data,"
In Proc. iDB Workshop 2014,
Fukuoka, Japan,
July 31, 2014.
-
D. Holz, J. Ruiz-del-Solar, K. Sugiura, S. Wachsmuth,
"On RoboCup@Home - Past, Present and Future of a Scientific Competition for Service Robots",
In Proc. RoboCup Symposium,
pp. 686-697,
Joao Pessoa, Brazil,
July 25, 2014.
-
D. Holz, L. Iocchi, J. Ruiz-del-Solar, K. Sugiura, and T. van der Zant,
"RoboCup@Home | a competition as a testbed for domestic service robots,"
In Proc. 1st International Workshop on Intelligent Robot Assistants,
Padova, Italy,
July 15, 2014.
-
S. Takeuchi, Y. Akahoshi, B. T. Ong, K. Sugiura, and K. Zettsu,
"Spatio-Temporal Pseudo Relevance Feedback for Large-Scale and
Heterogeneous Scientific Repositories,"
In Proc. 2014 IEEE International Congress on Big Data,
pp. 669-676,
Anchorage, USA,
July 1, 2014.
-
K. Sugiura, Y. Shiga, H. Kawai, T. Misu and C. Hori,
"Non-Monologue HMM-Based Speech Synthesis for Service Robots: A Cloud Robotics Approach,"
In Proc. IEEE ICRA,
pp. 2237-2242,
Hong Kong, China,
June 3, 2014.
-
J. Tan, T. Inamura, K. Sugiura, T. Nagai, and H. Okada,
"Human-Robot Interaction between Virtual and Real Worlds: Motivation from RoboCup@Home,"
In Proc. International Conference on Social Robotics,
pp. 239-248,
Bristol, UK,
Oct 27, 2013.
-
T. Inamura, J. Tan, K. Sugiura, T. Nagai, and H. Okada,
"Development of RoboCup@Home Simulation towards Long-term Large Scale
HRI,"
In Proc. RoboCup Symposium,
Eindhoven, The Netherlands,
July 1, 2013.
-
R. Lee, K. Kim, K. Sugiura, K. Zettsu, Y. Kidawara,
"Complementary Integration of Heterogeneous Crowd-sourced Datasets for
Enhanced Social Analytics,"
In Proc. IEEE MDM, vol. 2,
pp. 234-243,
Milan, Italy,
June 3, 2013.
-
K. Sugiura, R. Lee, H. Kashioka, K. Zettsu, and Y. Kidawara,
"Utterance Classification Using Linguistic and Non-Linguistic
Information for Network-Based Speech-To-Speech Translation Systems,"
In Proc. IEEE MDM, vol. 2,
pp. 212-216,
Milan, Italy,
June 3, 2013.
-
K. Sugiura, Y. Shiga, H. Kawai, T. Misu and C. Hori,
"Non-Monologue Speech Synthesis for Service Robots,"
In Proc. Fifth Workshop on Gaze in HRI,
Tokyo, Japan,
March 3, 2013.
-
K. Sugiura, N. Iwahashi and H. Kashioka,
"Motion Generation by Reference-Point-Dependent Trajectory HMMs,"
In Proc. IEEE/RSJ IROS,
pp. 350-356,
San Francisco, USA,
September 25-30, 2011.
IROS 2011 RoboCup Best Paper Award
-
T. Misu, K. Sugiura, K. Ohtake, C. Hori, H. Kashioka, H. Kawai and S. Nakamura,
"Modeling Spoken Decision Making Dialogue and Optimization of its Dialogue Strategy",
In Proc. SIGDIAL,
pp. 221-224,
2011.
-
T. Misu, K. Sugiura, K. Ohtake, C. Hori, H. Kashioka, H. Kawai and S. Nakamura,
"Dialogue Strategy Optimization to Assist User's Decision for Spoken
Consulting Dialogue Systems",
In Proc. IEEE SLT, pp. 342-347, 2010.
-
N. Iwahashi, K. Sugiura, R. Taguchi, T. Nagai, and T. Taniguchi,
"Robots That Learn to Communicate: A Developmental Approach to
Personally and Physically Situated Human-Robot Conversations",
In Proc. The 2010 AAAI Fall Symposium on Dialog with Robots,
pp. 38-43,
Arlington, Virginia, USA,
November 11-13, 2010.
-
K. Sugiura, N. Iwahashi, H. Kawai, and S. Nakamura,
"Active Learning for Generating Motion and Utterances in Object
Manipulation Dialogue Tasks",
In Proc. The 2010 AAAI Fall Symposium on Dialog with Robots,
pp. 115-120,
Arlington, Virginia, USA,
November 11-13, 2010.
-
K. Sugiura, N. Iwahashi, H. Kashioka, and S. Nakamura,
"Active Learning of Confidence Measure Function in Robot Language
Acquisition Framework",
In Proc. IEEE/RSJ IROS,
pp. 1774-1779,
Taipei, Taiwan,
Oct 18-22, 2010.
-
X. Zuo, N. Iwahashi, R. Taguchi, S. Matsuda, K. Sugiura,
K. Funakoshi, M. Nakano, and N. Oka,
"Detecting Robot-Directed Speech by Situated Understanding in Physical
Interaction",
In Proc. IEEE RO-MAN,
pp. 643-648,
2010.
-
M. Attamimi, A. Mizutani, T. Nakamura, K. Sugiura,
T. Nagai, N. Iwahashi, H. Okada, and T. Omori,
"Learning Novel Objects Using Out-of-Vocabulary Word Segmentation and
Object Extraction for Home Assistant Robots",
In Proc. IEEE ICRA,
pp. 745-750,
Anchorage, Alaska, USA,
May 3-8, 2010.
2011 RoboCup Research Award (RoboCup Japanese National Committee)
This paper presents a method for learning novel
objects from audio-visual input. Objects are learned using
out-of-vocabulary word segmentation and object extraction.
The latter half of the paper is devoted to evaluations: we
propose using a task adapted from the RoboCup@Home league as a
standard evaluation for real-world applications.
We implemented the proposed method on a real humanoid
robot and evaluated it through a task called "Supermarket".
The results show that our integrated system works well in a
real application; in fact, our robot outperformed the maximum
score obtained in the RoboCup@Home 2009 competition.
-
X. Zuo, N. Iwahashi, R. Taguchi, S. Matsuda, K. Sugiura,
K. Funakoshi, M. Nakano, and N. Oka,
"Robot-Directed Speech Detection Using Multimodal Semantic Confidence
Based on Speech, Image, and Motion",
In Proc. IEEE ICASSP,
pp. 2458-2461,
Dallas, Texas, USA,
March 14-19, 2010.
In this paper, we propose a novel method to detect robot-directed
(RD) speech that adopts the Multimodal Semantic Confidence
(MSC) measure. The MSC measure is used to decide whether
the speech can be interpreted as a feasible action under the current
physical situation in an object manipulation task. It
is calculated by integrating speech, image, and motion confidence
measures with weightings that are optimized by logistic regression.
Experimental results show that, compared with a baseline method
that uses speech confidence only, MSC achieved an absolute increase
in average maximum F-measure of 5% for clean speech and 12% for
noisy speech.
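The fusion step described in this abstract, combining per-modality confidence scores with logistic-regression weightings, can be sketched as follows. This is a minimal illustration only; the function and parameter names are assumptions, and the paper's actual features and training procedure may differ.

```python
import math

def msc(conf_speech, conf_image, conf_motion, weights, bias):
    """Fuse per-modality confidence scores with a logistic-regression
    style weighted sum followed by a sigmoid (illustrative sketch)."""
    z = (weights[0] * conf_speech
         + weights[1] * conf_image
         + weights[2] * conf_motion
         + bias)
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid maps z into (0, 1)

def is_robot_directed(confs, weights, bias, threshold=0.5):
    # Speech is treated as robot-directed when the fused
    # confidence exceeds a decision threshold.
    return msc(*confs, weights, bias) >= threshold
```

In practice the weights and bias would be fitted on labeled RD/non-RD utterances; here they are free parameters.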
-
T. Misu, K. Sugiura, T. Kawahara, K. Ohtake,
C. Hori, H. Kashioka, and S. Nakamura,
"Online Learning of Bayes Risk-Based Optimization of Dialogue
Management for Document Retrieval Systems with Speech Interface",
In Proc. IWSDS, 2009.
-
K. Sugiura, N. Iwahashi, H. Kashioka, and S. Nakamura,
"Bayesian Learning of Confidence Measure Function for Generation of
Utterances and Motions in Object Manipulation Dialogue Task",
In Proc. Interspeech, pp. 2483-2486,
Brighton, UK, September, 2009.
This paper proposes a method that generates motions and
utterances in an object manipulation dialogue task.
The proposed method integrates belief modules for speech,
vision, and motion into a probabilistic framework so that
a user's utterances can be understood based on multimodal information.
Responses to the utterances are optimized based on an
integrated confidence measure function over the integrated
belief modules, which is learned by Bayesian logistic regression.
The experimental results revealed that the proposed method
reduced the failure rate from 12% to 2.6% while keeping the
rejection rate below 24%.
-
N. Iwahashi, R. Taguchi, K. Sugiura, K. Funakoshi, and
M. Nakano,
"Robots that Learn to Converse: Developmental Approach to Situated Language Processing",
In Proc. International Symposium on Speech and Language
Processing, pp. 532-537,
China, August, 2009.
-
K. Sugiura and N. Iwahashi,
"Motion Recognition and Generation by Combining
Reference-Point-Dependent Probabilistic Models",
In Proc. IEEE/RSJ IROS,
pp. 852-857, Nice, France, September, 2008.
This paper presents a method to recognize and generate
sequential motions for object manipulation, such as placing
one object on another or rotating it.
Motions are learned using reference-point-dependent
probabilistic models, which are then transformed into the
same coordinate system and combined for motion
recognition/generation.
We conducted physical experiments in which a user
demonstrated the manipulation of puppets and toys, obtaining
a recognition accuracy of 63% for the sequential motions.
Furthermore, we present the results of motion generation
experiments performed with a robot arm.
-
K. Sugiura and N. Iwahashi,
"Learning Object-Manipulation Verbs for Human-Robot Communication",
In Proc. Workshop on Multimodal Interfaces in Semantic Interaction,
pp. 32-38, Nagoya, Japan, November, 2007.
This paper proposes a machine learning method for mapping
object-manipulation verbs to sensory inputs and motor
outputs grounded in the real world.
The method learns motion concepts demonstrated by a user and
generates a sequence of motions, using
reference-point-dependent probability models.
Here, the motion concepts are learned by using hidden Markov
models (HMMs).
In the motion generation phase, our method transforms and
combines HMMs to generate trajectories.
-
K. Sugiura, T. Nishikawa, M. Akahane, and O. Katai:
"Autonomous Design of a Line-Following Robot by Exploiting
Interaction between Sensory Morphology and Learning Controller",
In Proc. the 2nd Biomimetics International Conference, Doshisha,
pp. 23-24, Kyoto, Japan, December, 2006.
In this paper, we propose a system that automatically designs the sensory morphology of
an adaptive robot. The system designs the sensory morphology in simulation with two kinds
of adaptation, ontogenetic and phylogenetic, to optimize the learning ability of the robot.
-
K. Sugiura, D. Matsubara, and O. Katai: "Construction of
Robotic Body Schema by Extracting Temporal Information from Sensory
Inputs",
In Proc. SICE-ICASE,
pp. 302-307, Busan, Korea, October, 2006.
This paper proposes a method that incrementally develops the
"body schema" of a robot. The method has three features: 1)
estimation of light-sensor positions based on the Time Difference of
Arrival (TDOA) of signals and multidimensional scaling (MDS); 2)
incremental update of the estimation; and 3) no additional
equipment.
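The MDS step mentioned in this abstract can be illustrated with classical multidimensional scaling, which recovers sensor coordinates (up to rotation and translation) from pairwise distances; in the paper's setting those distances would be derived from TDOA, while this sketch fabricates them from known points. The function name and setup are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def classical_mds(d, dim=2):
    """Recover point coordinates (up to rigid transform) from a
    matrix of pairwise Euclidean distances via classical MDS."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    b = -0.5 * j @ (d ** 2) @ j           # double-centered Gram matrix
    w, v = np.linalg.eigh(b)
    idx = np.argsort(w)[::-1][:dim]       # keep top-`dim` eigenpairs
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Distances would come from TDOA times propagation speed; here we
# fabricate them from known 2D sensor positions for illustration.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
recovered = classical_mds(dist)
```

The recovered layout preserves all pairwise distances, which is the property the body-schema estimation relies on.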
-
M. Akahane, K. Sugiura, T. Shiose, H. Kawakami,
and O. Katai: "Autonomous Design of Robot Morphology for Learning
Behavior Using Evolutionary Computation",
In Proc. 2005 Japan-Australia
Workshop on Intelligent and Evolutionary Systems, Hakodate, Japan,
CD-ROM, 2005.
-
K. Sugiura, M. Akahane, T. Shiose, K. Shimohara, and O. Katai:
"Exploiting Interaction between Sensory Morphology and Learning",
In Proc. IEEE-SMC,
Hawaii, USA, pp. 883-888, 2005.
This paper proposes a system that automatically designs
the sensory morphology of a line-following robot. The designed
robot outperforms hand-coded designs in learning speed and
accuracy.
-
K. Sugiura, T. Shiose, H. Kawakami, and O. Katai:
"Co-evolution of Sensors and Controllers",
In Proc. 2003 Asia Pacific Symposium on Intelligent and
Evolutionary Systems (IES2003), Kitakyushu, Japan, pp. 145-150, 2003.
In this paper we investigate the evolutionary development of embodied agents that are allowed
to evolve not only their control mechanisms but also the sensitivity and temporal resolution of their sensors. The
experimental results indicate that the sensors and controller co-evolve in an agent through interaction with
its environment.
-
K. Sugiura, H. Suzuki, T. Shiose, H. Kawakami, and O. Katai:
"Evolution of Rewriting Rule Sets Using String-Based Tierra",
In Proc. ECAL,
Dortmund, Germany, pp. 69-77, 2003.
We have studied a string rewriting system to improve the
basic design of an artificial life system named String-based Tierra.
The instruction set used in String-based Tierra is converted into a
set of rewriting rules using regular expressions.