iT邦幫忙

12th iThome Ironman Contest

DAY 28
Self-Challenge Group

Software Development Notes -- Trying to Solve Problems series, Part 27

[Python] How to Do Speech to Text: SpeechRecognition


https://pypi.org/project/SpeechRecognition/

pip3 install SpeechRecognition
Collecting SpeechRecognition
  Downloading https://files.pythonhosted.org/packages/26/e1/7f5678cd94ec1234269d23756dbdaa4c8cfaed973412f88ae8adf7893a50/SpeechRecognition-3.8.1-py2.py3-none-any.whl (32.8MB)
     |████████████████████████████████| 32.8MB 6.4MB/s
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.8.1

A WAV file is recommended as the audio source; MP3 files cannot be used.
Supported file types:
WAV: must be in PCM/LPCM format
AIFF
AIFF-C
FLAC: must be native FLAC format; OGG-FLAC is not supported
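Before handing a file to the recognizer, a quick extension check can catch an MP3 early. The following is a minimal sketch: the helper name `looks_supported` and the extension set are assumptions made for illustration, not part of SpeechRecognition. An extension only hints at the container, so a `.wav` file that is not PCM/LPCM would still pass this check.

```python
from pathlib import Path

# Extensions for the formats listed above. This set is an assumption made
# for illustration; it is not taken from the SpeechRecognition library.
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".aifc", ".flac"}


def looks_supported(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(looks_supported("OSR_us_000_0061_8k.wav"))
print(looks_supported("good.mp3"))
```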

https://ithelp.ithome.com.tw/upload/images/20201011/2011960841GNqatoQj.jpg

import speech_recognition as sr
r = sr.Recognizer()
WAV = sr.AudioFile('OSR_us_000_0061_8k.wav')
with WAV as source:
    audio = r.record(source)
print(r.recognize_google(audio, show_all=True))

https://ithelp.ithome.com.tw/upload/images/20201011/20119608LRyh5nHwNi.jpg
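With show_all=True the call returns the raw result rather than a single string. The sketch below picks the most confident transcript from such a result. It assumes the result is a dict holding an "alternative" list whose entries carry "transcript" and an optional "confidence", and that an empty list means nothing was recognized; compare this with your own printed output before relying on it. The helper name `best_transcript` and the sample data are hypothetical.

```python
from typing import Any, Optional


def best_transcript(result: Any) -> Optional[str]:
    """Pick the highest-confidence transcript from a show_all=True style result."""
    # Anything that is not a dict (for example an empty list) is treated
    # as "nothing recognized".
    if not isinstance(result, dict):
        return None
    alternatives = result.get("alternative", [])
    if not alternatives:
        return None
    # Entries may lack a "confidence" key; treat a missing value as 0.0.
    best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return best.get("transcript")


# Hypothetical sample data shaped like the assumed response; not real output.
sample = {
    "alternative": [
        {"transcript": "hello world", "confidence": 0.92},
        {"transcript": "hello word"},
    ],
    "final": True,
}
print(best_transcript(sample))
print(best_transcript([]))
```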

Converting MP3 to WAV

pip3 install pydub
Collecting pydub
  Downloading https://files.pythonhosted.org/packages/7b/d1/fbfa79371a8cd9bb15c2e3c480d7e6e340ed5cc55005174e16f48418333a/pydub-0.24.1-py2.py3-none-any.whl
Installing collected packages: pydub
Successfully installed pydub-0.24.1

Download ffmpeg

Go to https://ffmpeg.org/download.html#build-windows
https://ithelp.ithome.com.tw/upload/images/20201013/2011960857qJNYbUmW.jpg
Then go to https://www.gyan.dev/ffmpeg/builds/
https://ithelp.ithome.com.tw/upload/images/20201013/20119608OdMPQ7I5FA.jpg
Download ffmpeg-4.3.1-2020-10-01-full_build.7z
https://ithelp.ithome.com.tw/upload/images/20201013/20119608YCYleeFfjk.jpg
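After extracting, rename the "ffmpeg-4.3.1-2020-10-01-full_build" folder to "ffmpeg", place it under C:\, and update the environment variables.
User variables:
https://ithelp.ithome.com.tw/upload/images/20201013/20119608nMpM64lEzY.jpg
System variables: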
https://ithelp.ithome.com.tw/upload/images/20201013/20119608EDhGkEurIJ.jpg
Reopen cmd so the new environment variables take effect.
https://ithelp.ithome.com.tw/upload/images/20201013/201196081iGpaev9Zl.jpg
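The same check can be made from Python before running the conversion. This is a small sketch using only the standard library; the helper name `find_ffmpeg` is made up for illustration. It only reports whether an executable named ffmpeg is visible on PATH, not whether it works.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable found on PATH, or None."""
    return shutil.which("ffmpeg")


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg was not found on PATH; check the environment variables.")
else:
    print("ffmpeg found at:", ffmpeg_path)
```

If this prints the "not found" message in a freshly opened terminal, the environment variable edit most likely did not take effect.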
Convert mp3 -> wav
pydubTest.py

from os import path
from pydub import AudioSegment
# convert mp3 to wav
sound = AudioSegment.from_mp3("good.mp3")
sound.export("test.wav", format="wav")
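To confirm that the exported file is a WAV the recognizer is likely to accept, its header can be read with Python's standard `wave` module, which handles uncompressed PCM and raises `wave.Error` on formats it cannot read. This is a sketch, not part of the original workflow; the helper name `describe_wav` is hypothetical.

```python
import wave
from pathlib import Path


def describe_wav(path: str) -> dict:
    """Read basic header fields from a PCM WAV file using the standard library."""
    with wave.open(path, "rb") as wav_file:
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "frame_rate_hz": wav_file.getframerate(),
            "frames": wav_file.getnframes(),
        }


# Only inspect the file if the conversion step has already produced it.
if Path("test.wav").exists():
    print(describe_wav("test.wav"))
```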

Recognition
test2.py

import speech_recognition as sr
r = sr.Recognizer()
WAV = sr.AudioFile('test.wav')
with WAV as source:
    audio = r.record(source)
print(r.recognize_google(audio, show_all=True))

https://ithelp.ithome.com.tw/upload/images/20201013/20119608yVonGht1YV.jpg

