2018 iT 邦幫忙 Ironman Contest
DAY 18
AI & Machine Learning

Building a Personal Smart Assistant with MS Bot Framework and Cognitive Services, Part 18

18. Application: A Personal Image Text Recognition Bot

Introduction

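The previous post introduced the service and the work the code has to do: to get the text out of an image, you call recognizeText first and then textOperation, as shown in the figure below: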
https://d2mxuefqeaa7sj.cloudfront.net/s_C8D520B137E92BEF2298F7A94BC31FCFB451D16176D367CCC83252AC233096B6_1513911174540_image.png

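Building the actual image text recognition bot needs one extra step. The image arrives through a messaging app, so we first have to download it from the messaging app's server as binary data and only then send it to the recognizeText Web API for analysis. There are a few details to watch along the way, and we will go through them one by one.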


Code

Step 1. Create a new project from the Bot Template and open RootDialog.cs. First add the key you recorded after enabling the Computer Vision API in Azure.

private string key = "your_key";

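Step 2. Take the JSON that was returned successfully when we tested with Postman in the previous post, paste it into json2csharp to generate the matching classes, and paste those classes into RootDialog.cs:

Note: This post only shows how to wire things up quickly. In a real implementation, put the DTOs that hold the response in their own file.

JSON file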

{
        "status": "Succeeded",
        "recognitionResult": {
            "lines": [
                {
                    "boundingBox": [
                        13,
                        83,
                        152,
                        82,
                        153,
                        107,
                        14,
                        108
                    ],
                    "text": "HELLO",
                    "words": [
                        {
                            "boundingBox": [
                                21,
                                80,
                                156,
                                83,
                                149,
                                112,
                                13,
                                109
                            ],
                            "text": "HELLO"
                        }
                    ]
                },
                {
                    "boundingBox": [
                        100,
                        146,
                        174,
                        144,
                        175,
                        166,
                        101,
                        168
                    ],
                    "text": "GUYS",
                    "words": [
                        {
                            "boundingBox": [
                                102,
                                146,
                                177,
                                146,
                                178,
                                169,
                                102,
                                169
                            ],
                            "text": "GUYS"
                        }
                    ]
                }
            ]
        }
    }

https://d2mxuefqeaa7sj.cloudfront.net/s_C8D520B137E92BEF2298F7A94BC31FCFB451D16176D367CCC83252AC233096B6_1513963918575_image.png

Your RootDialog.cs should now look like this:

using System;
using System.Threading.Tasks;
using Microsoft.Bot.Builder.Dialogs;
using Microsoft.Bot.Connector;
using RestSharp;
using System.Web;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http.Headers;
using System.Net.Http;

namespace ComputerVisionBotApplication.Dialogs
{
    [Serializable]
    public class RootDialog : IDialog<object>
    {
        public Task StartAsync(IDialogContext context)
        {
            context.Wait(MessageReceivedAsync);

            return Task.CompletedTask;
        }

        private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
        {
            var activity = await result as Activity;

            // calculate something for us to return
            int length = (activity.Text ?? string.Empty).Length;

            // return our reply to the user
            await context.PostAsync($"You sent {activity.Text} which was {length} characters");

            context.Wait(MessageReceivedAsync);
        }

        private string key = "your_key";
    }

    public class Word
    {
        public List<int> boundingBox { get; set; }
        public string text { get; set; }
    }

    public class Line
    {
        public List<int> boundingBox { get; set; }
        public string text { get; set; }
        public List<Word> words { get; set; }
    }

    public class RecognitionResult
    {
        public List<Line> lines { get; set; }
    }

    public class RootObject
    {
        public string status { get; set; }
        public RecognitionResult recognitionResult { get; set; }
    }
}
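To get a feel for the response shape before relying on the generated classes, the sample JSON above can also be walked directly. The following is only a minimal sketch, not part of the original project: it assumes System.Text.Json is available (built into .NET Core 3.0 and later, otherwise a NuGet package), and the names RecognitionJson and ExtractLineTexts are hypothetical. It collects the "text" of each entry under recognitionResult.lines and returns an empty list when the result is not there.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class RecognitionJson
{
    // Hypothetical helper: returns the "text" of every line in recognitionResult.lines.
    // Returns an empty list when recognitionResult or lines is missing.
    public static List<string> ExtractLineTexts(string json)
    {
        var texts = new List<string>();
        using (var doc = JsonDocument.Parse(json))
        {
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return texts;

            if (!root.TryGetProperty("recognitionResult", out var result)
                || result.ValueKind != JsonValueKind.Object) return texts;

            if (!result.TryGetProperty("lines", out var lines)
                || lines.ValueKind != JsonValueKind.Array) return texts;

            foreach (var line in lines.EnumerateArray())
            {
                // Skip entries that are not objects or have no string "text" property
                if (line.ValueKind == JsonValueKind.Object
                    && line.TryGetProperty("text", out var text)
                    && text.ValueKind == JsonValueKind.String)
                {
                    texts.Add(text.GetString());
                }
            }
        }
        return texts;
    }
}
```

For the sample response above, string.Join(" ", RecognitionJson.ExtractLineTexts(json)) yields "HELLO GUYS".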

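Step 3. When a messaging app passes a message to the bot, the data arrives in two forms: Text and Attachments. Attachments is an IList, so it can hold several Attachment objects. The most important member of Attachment is ContentUrl.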

public IList<Attachment> Attachments { get; set; }


public class Attachment : IEquatable<Attachment>
{
    public Attachment();
    public Attachment(string contentType = null, string contentUrl = null, object content = null, string name = null, string thumbnailUrl = null);


    [JsonProperty(PropertyName = "contentType")]
    public string ContentType { get; set; }

    [JsonProperty(PropertyName = "contentUrl")]
    public string ContentUrl { get; set; }

    [JsonProperty(PropertyName = "content")]
    public object Content { get; set; }

    [JsonProperty(PropertyName = "name")]
    public string Name { get; set; }

    [JsonProperty(PropertyName = "thumbnailUrl")]
    public string ThumbnailUrl { get; set; }

    [JsonExtensionData(ReadData = true, WriteData = true)]
    public JObject Properties { get; set; }

    public bool Equals(Attachment other);
    public override bool Equals(object other);
    public override int GetHashCode();
}

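Step 4. Because we need the attachment, we start with a simple check that attachments are present.

Note 1: When an image is pasted straight into Skype and sent, there may be no Text. If your earlier code branches on Text, remember to check whether Text is null.
Note 2: An attachment is not necessarily an image, so a real bot needs a few more defensive checks.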

if (activity.Attachments != null && activity.Attachments.Any())
{

}
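One of the defensive checks mentioned in Note 2 could look like the sketch below. It is an illustration under assumptions rather than the author's code: AttachmentGuards and IsImageContentType are hypothetical names, and the prefix match on "image" assumes that channels report either a full MIME type such as image/png or a bare image value, which you should confirm for the channels you target.

```csharp
using System;

public static class AttachmentGuards
{
    // Hypothetical helper: true when the content type looks like an image.
    // Assumption: the value is either a MIME type ("image/png") or a bare "image".
    public static bool IsImageContentType(string contentType)
    {
        if (string.IsNullOrWhiteSpace(contentType))
        {
            return false;
        }

        var value = contentType.Trim();
        return value.Equals("image", StringComparison.OrdinalIgnoreCase)
            || value.StartsWith("image/", StringComparison.OrdinalIgnoreCase);
    }
}
```

Inside the attachment check, the bot could call AttachmentGuards.IsImageContentType(attachment.ContentType) and reply with a short hint instead of calling the recognition service when it returns false.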

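Step 5. The file has to be downloaded from the Skype/Microsoft Teams server before it can be processed, so add the following code inside the MessageReceivedAsync method. It uses a token to download the image and reads it into a byte[].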

string response = string.Empty;
// Attachments may be null or an empty list, so check both before calling First()
if (activity.Attachments != null && activity.Attachments.Any())
{
    using (HttpClient httpClient = new HttpClient())
    {
        var attachment = activity.Attachments.First();
        // Skype / Microsoft Teams attachments are protected, so send the bot's token with the download request
        if ((activity.ChannelId.Equals("skype", StringComparison.InvariantCultureIgnoreCase) || activity.ChannelId.Equals("msteams", StringComparison.InvariantCultureIgnoreCase))
            && new Uri(attachment.ContentUrl).Host.EndsWith("skype.com"))
        {
            var token = await new MicrosoftAppCredentials().GetTokenAsync();
            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
        }

        // Download the image from the channel server as a byte[]
        var responseMessage = await httpClient.GetAsync(attachment.ContentUrl);
        var data = await responseMessage.Content.ReadAsByteArrayAsync();
    }
}
await context.PostAsync($"{response}");
context.Wait(MessageReceivedAsync);
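The decision about when to attach the bot's token can be pulled into a small, testable function. The sketch below is a stricter variant offered as an assumption, not what the code above does: AttachmentAuth and NeedsBearerToken are hypothetical names, and it additionally requires HTTPS and matches the host exactly (skype.com or a subdomain of it), because a plain EndsWith("skype.com") would also accept a lookalike host such as evilskype.com.

```csharp
using System;

public static class AttachmentAuth
{
    // Hypothetical helper: decides whether the bot token should be sent
    // when downloading an attachment from the given URL.
    public static bool NeedsBearerToken(string channelId, string contentUrl)
    {
        var isSkypeOrTeams =
            string.Equals(channelId, "skype", StringComparison.OrdinalIgnoreCase) ||
            string.Equals(channelId, "msteams", StringComparison.OrdinalIgnoreCase);
        if (!isSkypeOrTeams)
        {
            return false;
        }

        Uri uri;
        if (!Uri.TryCreate(contentUrl, UriKind.Absolute, out uri))
        {
            return false;
        }

        // Assumption: only send the token over HTTPS, and only to skype.com or its subdomains
        var host = uri.Host;
        return uri.Scheme == Uri.UriSchemeHttps
            && (host.Equals("skype.com", StringComparison.OrdinalIgnoreCase)
                || host.EndsWith(".skype.com", StringComparison.OrdinalIgnoreCase));
    }
}
```

With this in place, the condition in the download block would become AttachmentAuth.NeedsBearerToken(activity.ChannelId, attachment.ContentUrl).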

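Step 6. Add two private methods, RecognizeText and TextOperations. RecognizeText takes the byte[] data, calls the RecognizeText Web API, and reads the URL from the "Operation-Location" response header. TextOperations calls the URL obtained by RecognizeText directly and returns the text found in the image.

Note 1: Remember to install RestSharp.
Note 2: Remember to add "Ocp-Apim-Subscription-Key": "key" to the request header.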

private async Task<string> RecognizeText(byte[] file)
{
    var uri = "/vision/v1.0/recognizeText?handwriting=true";

    var result = string.Empty;
    var client = new RestClient("https://southeastasia.api.cognitive.microsoft.com");
    var request = new RestRequest(uri, Method.POST);
    request.AddHeader("Ocp-Apim-Subscription-Key", key);
    request.AddParameter("application/octet-stream", file, ParameterType.RequestBody);
    var response = await client.ExecuteTaskAsync(request);

    if (response.IsSuccessful)
    {
        // Look the header up by name instead of by position, since header order is not guaranteed
        var operationLocation = response.Headers
            .FirstOrDefault(h => string.Equals(h.Name, "Operation-Location", StringComparison.OrdinalIgnoreCase));
        if (operationLocation?.Value != null)
        {
            result = await TextOperations(operationLocation.Value.ToString());
        }
    }
    return result;
}

private async Task<string> TextOperations(string operationsUrl)
{
    var result = string.Empty;
    var client = new RestClient(operationsUrl);
    var request = new RestRequest("", Method.GET);
    request.AddHeader("Ocp-Apim-Subscription-Key", key);
    var response = await client.ExecuteTaskAsync<RootObject>(request);

    // If the operation has not finished yet, recognitionResult is missing; return an empty string
    var lines = response.Data?.recognitionResult?.lines;
    if (lines == null)
    {
        return result;
    }
    return lines.Aggregate(result, (current, item) => current + (" " + item.text));
}
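Because recognition runs as a separate operation, the first call to the Operation-Location URL may arrive before the result is ready. A common way to handle this is to poll until the status reads "Succeeded". The sketch below shows that idea in a generic form and is an assumption layered on top of the post: OperationPolling and PollAsync are hypothetical names, and the attempt count and delay are values you would need to choose yourself.

```csharp
using System;
using System.Threading.Tasks;

public static class OperationPolling
{
    // Hypothetical helper: calls fetchAsync until isDone accepts the result
    // or maxAttempts is reached. Returns null when no attempt succeeded.
    public static async Task<T> PollAsync<T>(
        Func<Task<T>> fetchAsync,
        Func<T, bool> isDone,
        int maxAttempts,
        TimeSpan delay) where T : class
    {
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            var result = await fetchAsync();
            if (result != null && isDone(result))
            {
                return result;
            }

            // Wait before the next attempt, but not after the last one
            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
            }
        }
        return null;
    }
}
```

In TextOperations, fetchAsync could wrap the ExecuteTaskAsync<RootObject> call and return response.Data, while isDone could check that status equals "Succeeded", the value seen in the sample JSON from Step 2.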

Step 7. Go back to the MessageReceivedAsync method (the block that downloads data from Skype / MS Teams) and call RecognizeText(byte[] file), passing in the byte[] you downloaded. Your code should now look like this:

string response = string.Empty;
// Attachments may be null or an empty list, so check both before calling First()
if (activity.Attachments != null && activity.Attachments.Any())
{
    using (HttpClient httpClient = new HttpClient())
    {
        var attachment = activity.Attachments.First();
        // Skype / Microsoft Teams attachments are protected, so send the bot's token with the download request
        if ((activity.ChannelId.Equals("skype", StringComparison.InvariantCultureIgnoreCase) || activity.ChannelId.Equals("msteams", StringComparison.InvariantCultureIgnoreCase))
            && new Uri(attachment.ContentUrl).Host.EndsWith("skype.com"))
        {
            var token = await new MicrosoftAppCredentials().GetTokenAsync();
            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
        }

        // Download the image as a byte[] and send it to the recognition service
        var responseMessage = await httpClient.GetAsync(attachment.ContentUrl);
        var data = await responseMessage.Content.ReadAsByteArrayAsync();
        response = await RecognizeText(data);
    }
}
await context.PostAsync($"{response}");
context.Wait(MessageReceivedAsync);

Step 8. Open the emulator and run a test. Done!
https://d2mxuefqeaa7sj.cloudfront.net/s_C8D520B137E92BEF2298F7A94BC31FCFB451D16176D367CCC83252AC233096B6_1513964400171_image.png


Example

https://github.com/matsurigoto/ComputerVisionBotExample.git


Ugh... writing this code took quite a while...

