ListenAI posted on 2024-6-13 15:59

Introduction to the ListenAI (聆思) CSK6 Large-Model Multimodal Voice Interaction Open-Source SDK


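In addition to voice interaction, the ListenAI CSK6 large-model multimodal SDK supports image interaction, covering both image recognition and image generation. Users can use voice and camera capture as entry points for multimodal interaction with the large model. The SDK provides the following main features:

● Voice interaction: talk to the large model by voice, either with push-to-talk recording or after waking the device
● Photo recognition: capture an image with the camera and upload it to the large model for recognition; follow-up questions about the recognized content are supported
● Image generation: describe a scene by voice and have the large model generate an image, which is shown on the kit's screen

Voice interaction modes

Supported voice interaction modes

The multimodal SDK supports three interaction modes, with the following characteristics: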
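Mode | Wake-up method | Interaction method
Button interaction | Press the microphone icon on the screen or the K3 button on the development board | Hold the button while speaking; release to submit
Voice wake-up (single-turn) | Wake word "小美小美" (Xiaomei Xiaomei) | Ask after hearing the prompt "在呢" ("I'm here"); every question requires a new wake-up
Voice wake-up (multi-turn) | Wake word "小美小美" (Xiaomei Xiaomei) | Ask after hearing the prompt "在呢" ("I'm here"); the conversation can continue, and the interaction ends automatically after more than 20 seconds without voice input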
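The difference between the three modes can be modeled as a small session state machine. The following Python sketch is only an illustration of the behavior described in the table above, not code from the SDK: the names `Mode`, `Session`, `wake`, `on_utterance` and `tick` are hypothetical, and the button mode is simplified to "one question per press".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# From the table: a multi-turn session ends after more than 20 s without speech.
MULTI_TURN_IDLE_TIMEOUT_S = 20.0


class Mode(Enum):
    BUTTON = "button"            # hold to talk, release to submit
    WAKE_SINGLE = "wake_single"  # wake word needed before every question
    WAKE_MULTI = "wake_multi"    # wake once, then keep talking


@dataclass
class Session:
    """Hypothetical model of one interaction session (not an SDK type)."""
    mode: Mode
    active: bool = False
    last_speech_at: Optional[float] = None

    def wake(self, now: float) -> None:
        # A button press or the wake word opens the session.
        self.active = True
        self.last_speech_at = now

    def tick(self, now: float) -> None:
        # Only multi-turn sessions stay open, so only they can time out.
        if (self.active and self.mode is Mode.WAKE_MULTI
                and now - self.last_speech_at > MULTI_TURN_IDLE_TIMEOUT_S):
            self.active = False

    def on_utterance(self, now: float) -> bool:
        """Return True if the utterance is accepted as a question."""
        self.tick(now)
        if not self.active:
            return False
        self.last_speech_at = now
        if self.mode is not Mode.WAKE_MULTI:
            # Button and single-turn modes: one question per wake-up.
            self.active = False
        return True
```

For example, a `Session(Mode.WAKE_SINGLE)` accepts one utterance after `wake()` and rejects the next until it is woken again, while a `Session(Mode.WAKE_MULTI)` keeps accepting utterances as long as the gap between them does not exceed 20 seconds.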
Switching the voice interaction mode

On the standby page, swipe down to open the pull-down menu, then tap the settings icon in the menu to open the configuration page. Select the desired mode, then tap the icon in the top-left corner to return to the standby page, at which point the setting takes effect.

Button interaction mode

When the device is set to button interaction (button wake-up) mode, press and hold the microphone button on the screen or the K3 button on the development board to start recording. Release the button to stop recording and submit.

Voice wake-up mode

When the device is set to voice wake-up (single-turn) or voice wake-up (multi-turn), it can be woken with the wake word "小美小美". Once you hear the "在呢" prompt, you can start speaking.

Exiting a conversation

During use, tap the icon in the top-left corner to end the current conversation and return to the standby page. This also clears the context of the current conversation.

Photo recognition

On the standby page, tap the camera button to enter the viewfinder page. Aim at the object you want to photograph and tap the shutter button in the middle of the right side to capture it. After confirming that the capture is good (no shake or blur), tap the √ on the right to submit it for recognition.

Text-to-image

Once the device is in the voice interaction state, you can have the large model draw a picture by using a prompt that expresses drawing intent, for example:
● "画一只熊猫" ("Draw a panda") (see the attached image below for the result)

SDK resource downloads

Voice-vision large-model development board SDK: https://cloud.listenai.com/CSKG962172/duomotai_ap/-/tree/master/
DEMO firmware download: https://docs2.listenai.com/x/UzjbjIAxw

