KOMEI SUGIURA

Latest

Deep Flare Net

Solar flares are one of the causes of electromagnetic interference and can affect aircraft routes. We developed Deep Flare Net (DeFN), based on ResNet, and achieved the world's highest performance. The experimental results are reported in this paper.

We published the source code of Deep Flare Net. Please use the following git command to download it.

$ git clone https://github.com/komeisugiura/defn18.git

The package contains a readme file that explains how to reproduce the results. For more information, please take a look at this GitHub page.


Software

Rospeex On-Premise

Rospeex is a cloud-based speech communication toolkit for ROS (Robot Operating System). It supports speech recognition and speech synthesis in 10 languages. You can write a simple dialogue function in only 10 lines of Python or C++ code.

Rospeex has two versions: Rospeex On-Cloud and Rospeex On-Premise. Rospeex On-Cloud was available for free from September 2013 to September 2018 and was used by over 50,000 unique users. Rospeex On-Premise is for cases where a network is unavailable or end-user data may not be sent over the internet.
We issue licenses through third-party companies; please contact us if you are interested.

Licensed users can connect to Rospeex servers using the source code available at this Bitbucket page.


Sentence Generator 2010 for the General Purpose Service Robots Test

In the GPSR test, the order of the tasks is not predefined. A task is given at random on site as a spoken command in the form of a complex sentence. Sentence Generator 2010 generates random commands according to a defined grammar.

Examples:

  • Go to the back door, grasp the chips, and bring it to the armchair.
  • Go to the dining table, introduce yourself, and leave the apartment.
  • Find a person, bring the yoghurt from the closet, and leave the apartment.
download
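
Generating commands from a grammar can be illustrated with a minimal sketch. The grammar below is a toy assembled from the example sentences above; it is an assumption for illustration and not the grammar shipped with Sentence Generator 2010. The names GRAMMAR and generate are hypothetical.

```python
import random

# Toy grammar (an assumption for illustration, not the tool's actual grammar).
# Keys are non-terminals; each maps to a list of alternative expansions.
GRAMMAR = {
    "<command>": [["<first>", ", ", "<second>", ", and ", "<third>", "."]],
    "<first>": [["Go to the ", "<location>"], ["Find a person"]],
    "<second>": [["grasp the ", "<object>"], ["introduce yourself"],
                 ["bring the ", "<object>", " from the ", "<location>"]],
    "<third>": [["bring it to the ", "<location>"], ["leave the apartment"]],
    "<location>": [["back door"], ["dining table"], ["armchair"], ["closet"]],
    "<object>": [["chips"], ["yoghurt"]],
}


def generate(symbol="<command>", rng=random):
    """Expand a symbol recursively, picking one alternative at random."""
    if symbol not in GRAMMAR:
        return symbol  # terminal text is emitted unchanged
    expansion = rng.choice(GRAMMAR[symbol])
    return "".join(generate(part, rng) for part in expansion)


if __name__ == "__main__":
    rng = random.Random(2010)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(generate(rng=rng))
```

Each call walks the grammar from the start symbol and chooses one alternative per non-terminal, so every output is a three-part command in the same shape as the examples.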

Data sets

RoboCup 2011 Istanbul Noise Database

These databases can be used for testing your robot's speech recognition system. Play these files at 75 dB (very noisy) to simulate the noise conditions of typical RoboCup@Home environments.

Who Is Who (1h42m, 188MB)
download
Enhanced Who Is Who (1h44m, 192MB)
download
Shopping Mall (0h28m, 52MB)
download


Archive (discontinued)

Speech recognition without rospeex

Our cloud-based speech recognition service is also available for non-ROS users.

  • Academic use only. If you would like to use this service for commercial purposes, please contact me for licensing information.
  • Absolutely no warranty.

Sample code in C++ and Python:

# -*- coding: utf-8 -*-
"""
Usage: python sample.py input.wav
"""
import sys
import base64
import json
import urllib2

# Cloud-based speech recognition URL
URL = 'http://rospeex.nict.go.jp/nauth_json/jsServices/VoiceTraSR'

def read_wavfile(filename):
    with open(filename,'rb') as rf:
        wav = rf.read()
    return wav

def post_to_recognizer(wav):
    # The audio is sent as Base64 text inside a JSON request.
    buf = base64.b64encode(wav)
    json_data = {"method": "recognize",
                 "params": ("ja",
                            {"audio": buf, "audioType": "audio/x-wav", "voiceType": "*"})}
    json_obj = json.dumps(json_data)
    # Passing a data argument makes urllib2 send an HTTP POST.
    req = urllib2.Request(URL, json_obj)
    cont = urllib2.urlopen(req).read()
    return cont

def print_text(json_str):
    json_obj = json.loads(json_str)
    print json_obj['result'].encode('utf-8')

if __name__=='__main__':
    argv = sys.argv
    wav = read_wavfile(argv[1])
    recognition_result = post_to_recognizer(wav)
    print_text(recognition_result)
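
The sample above targets Python 2 (urllib2 and the print statement). As a rough sketch only, the same request could be built in Python 3 as shown below. The service is discontinued, so this has not been run against the server; the request layout simply mirrors the sample above, and the helper names build_request_body and recognize are hypothetical.

```python
import base64
import json
import urllib.request

# Endpoint copied from the Python 2 sample above (the service is discontinued).
URL = 'http://rospeex.nict.go.jp/nauth_json/jsServices/VoiceTraSR'


def build_request_body(wav_bytes, language="ja"):
    """Return the JSON request as UTF-8 bytes, mirroring the Python 2 sample."""
    # b64encode returns bytes in Python 3, but JSON needs a str.
    audio = base64.b64encode(wav_bytes).decode("ascii")
    payload = {
        "method": "recognize",
        "params": [language,
                   {"audio": audio, "audioType": "audio/x-wav", "voiceType": "*"}],
    }
    return json.dumps(payload).encode("utf-8")


def recognize(wav_bytes):
    """POST the audio and return the 'result' field of the JSON reply."""
    req = urllib.request.Request(URL, data=build_request_body(wav_bytes))
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))["result"]
```

Keeping the payload construction separate from the network call means the request format can be checked offline, for example by decoding the output of build_request_body(b"RIFF") with json.loads.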


Non-monologue speech synthesis for service robots

You can try our cloud-based speech synthesis system here.

  • INCOMPATIBLE with IE/Safari. Compatible with Firefox and Google Chrome.
  • Non-commercial use only.
  • Absolutely no warranty.

Sample code in C++ and Python:

#!/usr/bin/env python2
# coding: utf-8
"""
Python2.7 sample code for rospeex TTS
"""
import base64
import urllib2
import json

URL = 'http://rospeex.nict.go.jp/nauth_json/jsServices/VoiceTraSS'

def main():
    databody = {"method": "speak",
                "params": ["1.1",
                          {"language": "ja", "text": "こんにちは", "voiceType": "*", "audioType": "audio/x-wav"}]}
    request = urllib2.Request(URL, json.dumps(databody))
    response = urllib2.urlopen(request).read()
    tmp = json.loads(response)['result']['audio']
    # The audio is returned as Base64 text; decode it back to WAV bytes.
    wav = base64.b64decode(tmp.encode('utf-8'))

    with open("out.wav", "wb") as f:
        f.write(wav)


if __name__ == "__main__":
    main()
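
For readers on Python 3, the following is a hedged sketch of the same exchange, with the focus on decoding the reply. It assumes the reply has the shape the sample above reads (a JSON object with result.audio holding Base64 text). It has not been run against the discontinued service, and the function names are hypothetical.

```python
import base64
import json
import urllib.request

# Endpoint copied from the Python 2 sample above (the service is discontinued).
URL = 'http://rospeex.nict.go.jp/nauth_json/jsServices/VoiceTraSS'


def build_speak_request(text, language="ja"):
    """Return the JSON 'speak' request as UTF-8 bytes."""
    payload = {
        "method": "speak",
        "params": ["1.1",
                   {"language": language, "text": text,
                    "voiceType": "*", "audioType": "audio/x-wav"}],
    }
    return json.dumps(payload).encode("utf-8")


def extract_wav(response_bytes):
    """Decode the Base64 audio in result.audio (assumed reply layout)."""
    reply = json.loads(response_bytes.decode("utf-8"))
    return base64.b64decode(reply["result"]["audio"])


def synthesize(text, out_path="out.wav"):
    """POST the text and write the returned audio to out_path."""
    req = urllib.request.Request(URL, data=build_speak_request(text))
    with urllib.request.urlopen(req) as resp:
        wav = extract_wav(resp.read())
    with open(out_path, "wb") as f:
        f.write(wav)
```

base64.b64decode is used here because base64.decodestring, which the Python 2 sample originally relied on, is no longer available in current Python 3 releases.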


iPhone App "Kyo no Osusume"

Discover your own Kyoto in a unique way with Kyo-no-Osusume! Just tell it how you feel and/or what you want to experience in Kyoto. It picks recommended destinations for you based on a questionnaire database of 4,000 respondents.
