TataJen的个人空间 https://passport2.21ic.com/?1162358 [收藏] [复制] [RSS]

日志

ROS中PocketSphinx语音识别_安装arbotix simulator仿真环境_turtlebot的仿真语音控制 ...

已有 654 次阅读2017-3-8 23:51 |系统分类:兴趣爱好| 语音识别


文章转自:周学伟



也可以参考:



http://www.pirobot.org/blog/0022/

blog.csdn.net/hanshuning/article/details/53731002

  • 利用PocketSphinx实现语音识别
  • 利用语音命令来控制Turtlebot
  • 实现播放语音

PocketSphinx语音识别

此种方法安装的Pocketsphinx语音识别准确率差,因为Pocketsphinx的版本太低,安装高版本的Pocketsphinx参考:

http://blog.csdn.net/hanshuning/article/details/53729304




  1. 安装PocketSphinx

$ sudo apt-get install gstreamer0.10-pocketsphinx
$ sudo apt-get install ros-indigo-pocketsphinx
$ sudo apt-get install ros-indigo-audio-common
$ sudo apt-get install libasound2
$ sudo apt-get install gstreamer0.10-gconf
(有些书本没有说要安装,但经过在indigo版本测试,必须安装)我的系统是hydro,也装了,没有测试不装行不行。


  1. 测试PocketSphinx 语音识别


  • 设置麦克风设备,系统设置->sound中input设置内置语音音量, 插入麦克风设备,在系统设置里测试麦克风是否有语音输入

    请输入图片描述



  • 启动launch文件:



$ roslaunch pocketsphinx robocup.launch


  • 尝试说一些简单的语句,当然,必须是英语,例如:bring me the glass,come with me,看看能不能识别出来



  • 直接看ROS最后发布的结果消息:



$ rostopic echo /recognizer/output

  • 效果图:

    请输入图片描述


  1. 添加语音库

  • 这个语音识别是属于离线识别,将一些常用的词汇放到一个文件中,作为识别的文本库,然后分段识别语音信号,最后在库中搜索对应的文本信息。如果想看语音识别库中有哪些文本信息,可以通过下面的指令进行查询:

$ roscd pocketsphinx/demo
$ more robocup.corpus



  • 添加语音库。 我们可以自己向语音库中添加其他的文本识别信息《ros by example》自带的例程中是带有语音识别的例程的,而且有添加语音库的例子。



  • 安装《ros by example》例子环境依赖(因为我的系统是hydro,github上并没有rbx1-prereq.sh文件,所以并没有安装这个依赖。可查看https://github.com/pirobot/rbx1/tree/hydro-devel



$cd ~
$wget https://raw.githubusercontent.com/pirobot/rbx1/indigo-devel/rbx1-prereq.sh
$sh rbx1-prereq.sh


  • 安装例子代码

$cd ~/catkin_ws/src工作空间的名字根据个人而不同
$git clone https://github.com/pirobot/rbx1.git
$cd rbx1
$git checkout indigo-devel
这个地方改为hydro-devel
$cd ~/catkin_ws
$catkin_make
$source ~/catkin_ws/devel/setup.bash
$rospack profile


  • 首先看看例子中要添加的文本息:

$ roscd rbx1_speech/config
$ more nav_commands.txt


请输入图片描述



  • 以下是需要添加的文本,我们也可以修改其中的某些文本,改成自己需要的。然后我们要把这个文件在线生成语音信息和库文件,这一步需要登陆网站http://www.speech.cs.cmu.edu/tools/lmtool-new.html,根据网站的提示上传文件,然后在线编译生成库文件。

    请输入图片描述



  • 把下载的文件都解压放在rbx1_speech包的config文件夹下。我们可以给这些文件改个名字



$ roscd rbx1_speech/config  
$ rename -f 's/3026/nav_commands/' *
 (运行了没见反应,不过能理解,config下本身就有转化好的,只是用于测试是可以的

  • 在rbx1_speech/launch文件夹下看看voice_nav_commands.launch这个文件

<launch>  
<node name="recognizer" pkg="pocketsphinx" type="recognizer.py"  
output="screen">  
<param name="lm" value="$(find rbx1_speech)/config/nav_commands.lm"/>  
<param name="dict" value="$(find rbx1_speech)/config/nav_commands.dic"/>  
</node>  
</launch>

  • 这个launch文件在运行recognizer.py节点的时候使用了我们生成的语音识别库和文件参数,这样就可以实用我们自己的语音库来进行语音识别了。通过之前的命令来测试一下效果如何吧:

$ roslaunch rbx1_speech voice_nav_commands.launch  
$ rostopic echo /recognizer/output  


语音控制turtlrbot机器人移动
1. recognizer.py会将最后识别的文本信息通过消息发布, 

 那么我们来编写一个机器人控制节点接收这个消息,进行相应的控制即可。
在pocketsphinx包中本身有一个语音控制发布Twist消息的例程voice_cmd_vel.py,rbx1_speech包对其进行了一些简化修改,在nodes文件夹里可以查看voice_nav.py文件


#!/usr/bin/env python

"""
  voice_nav.py - Version 1.1 2013-12-20
  
  Allows controlling a mobile base using simple speech commands.
  
  Based on the voice_cmd_vel.py script by Michael Ferguson in
  the pocketsphinx ROS package.
  
  See http://www.ros.org/wiki/pocketsphinx
"""

import rospy
from geometry_msgs.msg import Twist
from std_msgs.msg import String
from math import copysign

class VoiceNav:
    def __init__(self):
        rospy.init_node('voice_nav')
        
        rospy.on_shutdown(self.cleanup)
        
        # Set a number of parameters affecting the robot's speed
        self.max_speed = rospy.get_param("~max_speed", 0.4)
        self.max_angular_speed = rospy.get_param("~max_angular_speed", 1.5)
        self.speed = rospy.get_param("~start_speed", 0.1)
        self.angular_speed = rospy.get_param("~start_angular_speed", 0.5)
        self.linear_increment = rospy.get_param("~linear_increment", 0.05)
        self.angular_increment = rospy.get_param("~angular_increment", 0.4)
        
        # We don't have to run the script very fast
        self.rate = rospy.get_param("~rate", 5)
        r = rospy.Rate(self.rate)
        
        # A flag to determine whether or not voice control is paused
        self.paused = False
        
        # Initialize the Twist message we will publish.
        self.cmd_vel = Twist()

        # Publish the Twist message to the cmd_vel topic
        self.cmd_vel_pub = rospy.Publisher('cmd_vel', Twist, queue_size=5)
        
        # Subscribe to the /recognizer/output topic to receive voice commands.
        rospy.Subscriber('/recognizer/output', String, self.speech_callback)
        
        # A mapping from keywords or phrases to commands
        self.keywords_to_command = {'stop': ['stop', 'halt', 'abort', 'kill', 'panic', 'off', 'freeze', 'shut down', 'turn off', 'help', 'help me'],
                                    'slower': ['slow down', 'slower'],
                                    'faster': ['speed up', 'faster'],
                                    'forward': ['forward', 'ahead', 'straight'],
                                    'backward': ['back', 'backward', 'back up'],
                                    'rotate left': ['rotate left'],
                                    'rotate right': ['rotate right'],
                                    'turn left': ['turn left'],
                                    'turn right': ['turn right'],
                                    'quarter': ['quarter speed'],
                                    'half': ['half speed'],
                                    'full': ['full speed'],
                                    'pause': ['pause speech'],
                                    'continue': ['continue speech']}
        
        rospy.loginfo("Ready to receive voice commands")
        
        # We have to keep publishing the cmd_vel message if we want the robot to keep moving.
        while not rospy.is_shutdown():
            self.cmd_vel_pub.publish(self.cmd_vel)
            r.sleep()                       
            
    def get_command(self, data):
        # Attempt to match the recognized word or phrase to the
        # keywords_to_command dictionary and return the appropriate
        # command
        for (command, keywords) in self.keywords_to_command.iteritems():
            for word in keywords:
                if data.find(word) > -1:
                    return command
        
    def speech_callback(self, msg):
        # Get the motion command from the recognized phrase
        command = self.get_command(msg.data)
        
        # Log the command to the screen
        rospy.loginfo("Command: " + str(command))
        
        # If the user has asked to pause/continue voice control,
        # set the flag accordingly
        if command == 'pause':
            self.paused = True
        elif command == 'continue':
            self.paused = False
        
        # If voice control is paused, simply return without
        # performing any action
        if self.paused:
            return      
        
        # The list of if-then statements should be fairly
        # self-explanatory
        if command == 'forward':   
            self.cmd_vel.linear.x = self.speed
            self.cmd_vel.angular.z = 0
            
        elif command == 'rotate left':
            self.cmd_vel.linear.x = 0
            self.cmd_vel.angular.z = self.angular_speed
               
        elif command == 'rotate right':  
            self.cmd_vel.linear.x = 0      
            self.cmd_vel.angular.z = -self.angular_speed
            
        elif command == 'turn left':
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.angular.z += self.angular_increment
            else:        
                self.cmd_vel.angular.z = self.angular_speed
               
        elif command == 'turn right':   
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.angular.z -= self.angular_increment
            else:        
                self.cmd_vel.angular.z = -self.angular_speed
               
        elif command == 'backward':
            self.cmd_vel.linear.x = -self.speed
            self.cmd_vel.angular.z = 0
            
        elif command == 'stop':
            # Stop the robot!  Publish a Twist message consisting of all zeros.         
            self.cmd_vel = Twist()
        
        elif command == 'faster':
            self.speed += self.linear_increment
            self.angular_speed += self.angular_increment
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x += copysign(self.linear_increment, self.cmd_vel.linear.x)
            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z += copysign(self.angular_increment, self.cmd_vel.angular.z)
            
        elif command == 'slower':
            self.speed -= self.linear_increment
            self.angular_speed -= self.angular_increment
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x -= copysign(self.linear_increment, self.cmd_vel.linear.x)
            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z -= copysign(self.angular_increment, self.cmd_vel.angular.z)
               
        elif command in ['quarter', 'half', 'full']:
            if command == 'quarter':
                self.speed = copysign(self.max_speed / 4, self.speed)
        
            elif command == 'half':
                self.speed = copysign(self.max_speed / 2, self.speed)
            
            elif command == 'full':
                self.speed = copysign(self.max_speed, self.speed)
            
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x = copysign(self.speed, self.cmd_vel.linear.x)

            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z = copysign(self.angular_speed, self.cmd_vel.angular.z)
               
        else:
            return

        self.cmd_vel.linear.x = min(self.max_speed, max(-self.max_speed, self.cmd_vel.linear.x))
        self.cmd_vel.angular.z = min(self.max_angular_speed, max(-self.max_angular_speed, self.cmd_vel.angular.z))

    def cleanup(self):
        # When shutting down be sure to stop the robot!
        twist = Twist()
        self.cmd_vel_pub.publish(twist)
        rospy.sleep(1)

if __name__=="__main__":
    try:
        VoiceNav()
        rospy.spin()
    except rospy.ROSInterruptException:
        rospy.loginfo("Voice navigation terminated.")


  1. 仿真测试

$ roslaunch rbx1_bringup fake_turtlebot.launch            首先是运行一个机器人模型:如果没有装arbotix simulator仿真环境,会报错安装参考http://www.cnblogs.com/zxouxuewei/p/5249935.html
$ rosrun rviz rviz -d `rospack find rbx1_nav`/sim.rviz    然后打开rviz:
$ roslaunch rbx1_speech voice_nav_commands.launch         再打开语音识别的节点:
$ roslaunch rbx1_speech turtlebot_voice_nav.launch        最后就是机器人的控制节点了:




  1. 效果图,不过语音的准确度还是有欠缺。

请输入图片描述



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
PS:安装arbotix simulator仿真环境
1.安装simulator
sudo apt-get install ros-indigo-arbotix-*

注意:一定要删除任何早期版本的arbotix依赖。

rospack profile
2. 为了确保一切工作,确保roscore运行,然后启动模拟turtlebot如下
复制代码
roslaunch rbx1_bringup fake_turtlebot.launch 3.使用一个模型的pi-机器人,运行命令:
roslaunch rbx1_bringup fake_pi_robot.launch
4.接下来,把rviz所以我们可以观察模拟机器人在行动:
rosrun rviz rviz -d `rospack find rbx1_nav`/sim.rviz
5.发布消息,使机器人移动。要测试模拟,打开另一个终端窗口,并运行以下要使模拟机器人在逆时针方向移动:
rostopic pub -r 10 /cmd_vel geometry_msgs/Twist '{linear: {x: 0.2, y: 0, z: 0}, angular: {x: 0, y: 0, z: 0.5}}'
6.需要停止机器人 ,在同一个终端窗口式Ctrl-C,或者发布空Twist message:
rostopic pub -1 /cmd_vel geometry_msgs/Twist '{}'
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%



实现播放语音

  1. 可以通过语音控制Turtlebot, 也可以让Turtlebot实现播放相应的文字内容,运行下面的命令:

$ rosrun sound_play soundplay_node.py  
$ rosrun sound_play say.py "Greetings Humans. Take me to your leader."  



  1. ROS通过识别我们输入的文本,让机器人读了出来。发出这个声音的人叫做kal_diphone,我们也可以换一个人来读

$ sudo apt-get install festvox-don  (下文中的talkback.py文件会用到)
$ rosrun sound_play say.py "Welcome to the future" voice_don_diphone



  1. 在rbx1_speech/nodes文件夹中有一个让机器人说话的节点talkback.py

#!/usr/bin/env python  
  
"""
    talkback.py - Version 0.1 2012-01-10
     
    Use the sound_play client to say back what is heard by the pocketsphinx recognizer.
     
    Created for the Pi Robot Project: http://www.pirobot.org
    Copyright (c) 2012 Patrick Goebel.  All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.5
     
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details at:
     
    http://www.gnu.org/licenses/gpl.htmlPoint
"""  
  
import roslib; roslib.load_manifest('rbx1_speech')  
import rospy  
from std_msgs.msg import String  
from sound_play.libsoundplay import SoundClient  
import sys  
  
class TalkBack:  
    def __init__(self, script_path):  
        rospy.init_node('talkback')  
  
        rospy.on_shutdown(self.cleanup)  
         
        # Set the default TTS voice to use  
        self.voice = rospy.get_param("~voice", "voice_don_diphone")  
         
        # Set the wave file path if used  
        self.wavepath = rospy.get_param("~wavepath", script_path + "/../sounds")  
         
        # Create the sound client object  
        self.soundhandle = SoundClient()  
         
        # Wait a moment to let the client connect to the  
        # sound_play server  
        rospy.sleep(1)  
         
        # Make sure any lingering sound_play processes are stopped.  
        self.soundhandle.stopAll()  
         
        # Announce that we are ready for input  
        self.soundhandle.playWave(self.wavepath + "/R2D2a.wav")  
        rospy.sleep(1)  
        self.soundhandle.say("Ready", self.voice)  
         
        rospy.loginfo("Say one of the navigation commands...")  
  
        # Subscribe to the recognizer output and set the callback function  
        rospy.Subscriber('/recognizer/output', String, self.talkback)  
         
    def talkback(self, msg):  
        # Print the recognized words on the screen  
        rospy.loginfo(msg.data)  
         
        # Speak the recognized words in the selected voice  
        self.soundhandle.say(msg.data, self.voice)  
         
        # Uncomment to play one of the built-in sounds  
        #rospy.sleep(2)  
        #self.soundhandle.play(5)  
         
        # Uncomment to play a wave file  
        #rospy.sleep(2)  
        #self.soundhandle.playWave(self.wavepath + "/R2D2a.wav")  
  
    def cleanup(self):  
        self.soundhandle.stopAll()  
        rospy.loginfo("Shutting down talkback node...")  
  
if __name__=="__main__":  
    try:  
        TalkBack(sys.path[0])  
        rospy.spin()  
    except rospy.ROSInterruptException:  
        rospy.loginfo("Talkback node terminated.")


  1. 运行talkback.launch

$ roslaunch rbx1_speech talkback.launch

路过

鸡蛋

鲜花

握手

雷人

评论 (0 个评论)