12th iThome Ironman Contest

DAY 4

Microsoft Azure

Building a Chat Bot on Azure Services with Python, Part 4 of the series

[Day04] Speech Service


SDDCC

2020-09-19 18:46:34


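Hi everyone! Today I'd like to share my hands-on experience with Speech Service. Ready to let your computer recognize what you say?

What can Speech Service do?

With Speech Service, you can add speech features to your project or product.

It offers roughly five capabilities: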

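Speech-to-text: converts speech to text in real time or in non-real-time mode. You can also train a custom model for a specific scenario to improve the transcription results.
Text-to-speech: in addition to the voices Microsoft has already trained, you can create a custom voice.
Speech Translation: enables real-time, multi-language speech translation in applications, tools, and devices.
Voice Assistant: can be combined with a chatbot so that the chatbot can talk.
Speaker Recognition: determines whose voice is speaking. It is divided into speaker verification and speaker identification.
- Verification: the speaker says a specific phrase, and that phrase is used to confirm whether the speaker is a particular person.
- Identification: for people who have already enrolled, the service can automatically work out who is speaking. It tells you who the speaker is likely to be, but it cannot confirm that the speaker really is that person.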

You can try the official demos on the following three websites.

A simple example in Python

Prerequisites

0.1 Have an Azure account ready

0.2 Set up a Python environment

0.3 Open a terminal (PowerShell or CMD) and run the following command

pip install azure-cognitiveservices-speech
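Before moving on, it can help to confirm that the package is visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library. The helper name `speech_sdk_installed` is made up for this example and is not part of the SDK.

```python
import importlib.util


def speech_sdk_installed():
    """Return True if the azure.cognitiveservices.speech module can be found."""
    try:
        return importlib.util.find_spec("azure.cognitiveservices.speech") is not None
    except ImportError:
        # Raised when a parent package (for example `azure`) is missing entirely
        return False


if __name__ == "__main__":
    print("Speech SDK installed:", speech_sdk_installed())
```

If this prints `False`, the install most likely went into a different Python environment than the one running the script.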

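Create a Speech Service resource in the Azure Portal

1.1 Go to the Azure Portal and search for Speech Service

1.2 Fill in the required information

Name : <any text>
Subscription : <a Subscription you can use>
Location : <any region; East US is recommended>
Pricing tier : <F0 is the free tier; you can also choose S0, but keep an eye on the cost>
Resource group : <any Resource Group; create a new one if you don't have one>

1.3 Once the resource has been created, open it and copy the key and the Location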
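The key is a secret, so instead of pasting it straight into a source file you may prefer to keep it in environment variables. Here is a minimal sketch of that idea. The variable names `SPEECH_KEY` and `SPEECH_REGION` and the helper `load_speech_settings` are assumptions made for this example, not something the service requires.

```python
import os


def load_speech_settings(env=os.environ):
    """Return (key, region) read from environment variables.

    SPEECH_KEY and SPEECH_REGION are hypothetical names chosen for this
    sketch; use whatever names suit your setup.
    """
    key = env.get("SPEECH_KEY", "").strip()
    region = env.get("SPEECH_REGION", "").strip()
    missing = [
        name
        for name, value in (("SPEECH_KEY", key), ("SPEECH_REGION", region))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return key, region
```

In the program, the result could be unpacked as `speech_key, service_region = load_speech_settings()`. The region is the short identifier of the Location you picked, for example `eastus` for East US.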

Write a small program

2.1 Open Visual Studio Code (VS Code)

2.2 Create a new Python file named speech-test.py and paste in the following code

import azure.cognitiveservices.speech as speechsdk

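# Paste in the values you copied from the Azure Portal
speech_key, service_region = "<paste your key>", "<paste your region, e.g. eastus>"  # replace the whole placeholder, including the < >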

def STT(text="Say something"):
    # Build the speech config first
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region, speech_recognition_language='zh-TW')
    # Create the recognizer (with no audio config it listens on the default microphone)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
    result = speech_recognizer.recognize_once()

    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(result.text))
        return result.text
    elif result.reason == speechsdk.ResultReason.NoMatch:
        print("No speech could be recognized: {}".format(result.no_match_details))
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Speech Recognition canceled: {}".format(cancellation_details.reason))
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            print("Error details: {}".format(cancellation_details.error_details))

def TTS(text):
    print(text)
    # Build the speech config first
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

    # The full list of supported languages can be found here:
    # https://docs.microsoft.com/azure/cognitive-services/speech-service/language-support#text-to-speech
    # Set either a language or a specific voice (choose one)
    
    # Specific language
    language = "zh-TW"
    speech_config.speech_synthesis_language = language

    # Specific voice
    # voice = "Microsoft Server Speech Text to Speech Voice (en-US, BenjaminRUS)"
    # speech_config.speech_synthesis_voice_name = voice

    # Creates a speech synthesizer using the default speaker as audio output.
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config = speech_config)


    # Synthesizes the received text to speech.
    # The synthesized speech is expected to be heard on the speaker with this line executed.
    result = speech_synthesizer.speak_text_async(text).get()

    # Checks result.
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        # print("Speech synthesized to speaker for text [{}]".format(text))
        pass
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Speech synthesis canceled: {}".format(cancellation_details.reason))
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            if cancellation_details.error_details:
                print("Error details: {}".format(cancellation_details.error_details))
        print("Did you update the subscription info?")
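The TTS function above picks the language or voice once, on the config. If you would rather choose the voice per request, one option is to send SSML instead of plain text. The sketch below only builds the SSML string with the standard library. The helper name `build_ssml` is made up for this example, and the default voice `zh-TW-HsiaoChenNeural` is an assumption, so check the language support page linked in the code above for the voices available to you.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="zh-TW-HsiaoChenNeural", lang="zh-TW"):
    """Wrap plain text in a minimal SSML document that selects one voice.

    The default voice name is an assumption for this sketch.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        # escape() protects characters such as & and < in the user's text
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )
```

Inside TTS, the resulting string would be passed to `speech_synthesizer.speak_ssml_async(ssml).get()` in place of the `speak_text_async(text)` call.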

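2.3 To test speech-to-text, paste the following code at the end of the file

# Test speech to text
print("Say something")
text = STT()

2.4 Open a terminal, make sure the current directory is the one containing the program file, and run the following command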

python speech-test.py
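STT() only returns text when speech was recognized. In the NoMatch and Canceled branches it falls through and returns None. If you want to give the speaker a few tries, a small wrapper like the following could work. This is a sketch: `listen_until_recognized` is a hypothetical helper, and it takes the recognizing function as a parameter so that it does not depend on the SDK.

```python
def listen_until_recognized(recognize, max_attempts=3):
    """Call `recognize` until it returns non-empty text or attempts run out.

    `recognize` is any zero-argument callable that returns the recognized
    text, or None (or an empty string) when nothing was recognized.
    """
    for attempt in range(1, max_attempts + 1):
        text = recognize()
        if text:
            return text
        print(f"Attempt {attempt} of {max_attempts}: nothing recognized.")
    return None
```

In speech-test.py this would be called as `text = listen_until_recognized(STT)`.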

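2.5 To test text-to-speech, paste the following code at the end of the file

# Test text to speech
print("Type a sentence to test")
text = input()
TTS(text)

2.6 Open a terminal, make sure the current directory is the one containing the program file, and run the following command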

python speech-test.py
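If both test snippets are left in the file, running it will go through the speech-to-text test and then the text-to-speech test every time. One way to choose between them is a command-line argument. This is a minimal sketch using `argparse`. The helper name `parse_mode` and the mode names `stt` and `tts` are made up for this example.

```python
import argparse


def parse_mode(argv=None):
    """Return "stt" or "tts" based on the first command-line argument."""
    parser = argparse.ArgumentParser(description="Run one of the speech tests.")
    parser.add_argument("mode", choices=["stt", "tts"], help="which test to run")
    return parser.parse_args(argv).mode


# Hypothetical wiring at the bottom of speech-test.py, replacing the two
# test snippets (STT and TTS are the functions defined in step 2.2):
#
# mode = parse_mode()
# if mode == "stt":
#     print("Say something")
#     STT()
# else:
#     TTS(input("Type a sentence to test: "))
```

With that wiring in place, the file would be run as `python speech-test.py stt` or `python speech-test.py tts`.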

That's all for Speech Service today. Tomorrow I'll introduce the AI services related to search engines.
