18. Application: A Personal Image Text Recognition Bot

2018 Ironman Contest cognitive service botframework computer vision api

Duran Hsieh

2017-12-21 21:35:05

Introduction

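The previous article introduced the API and what the code has to handle: to get the text out of an image, you call recognizeText first and then textOperation, as shown in the figure below: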

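When actually implementing the image text recognition bot, one extra step is required. Because the image arrives through a messaging app, we first have to download it from the messaging app's server as binary data, and only then send it to the recognizeText Web API for analysis. There are a few details to watch along the way, and we will go through them one by one.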

Code

Step 1. Create a new project from the Bot Template and open RootDialog.cs. First add the key you recorded after enabling the Computer Vision API in Azure:

private string key = "your_key";
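
A hard-coded key is fine for a quick demo, but it is easy to commit to source control by accident. One possible alternative, sketched here, reads the key from an environment variable. The variable name COMPUTER_VISION_KEY and the VisionSettings class are assumptions made for illustration; neither comes from the Bot Template.

```csharp
using System;

public static class VisionSettings
{
    // Hypothetical variable name; use whatever your deployment defines.
    public const string KeyVariable = "COMPUTER_VISION_KEY";

    // Returns the subscription key, or throws if it has not been configured.
    public static string GetKey()
    {
        var key = Environment.GetEnvironmentVariable(KeyVariable);
        if (string.IsNullOrWhiteSpace(key))
        {
            throw new InvalidOperationException(
                $"Environment variable {KeyVariable} is not set.");
        }
        return key;
    }
}
```

RootDialog could then call VisionSettings.GetKey() when it builds a request instead of reading the hard-coded field.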

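Step 2. Take the JSON response returned by the successful Postman test in the previous article, paste it into json2csharp to generate the matching classes, and paste those classes into RootDialog.cs: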

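Note: this article focuses on getting the integration working quickly. In a real implementation, consider putting the DTO classes that hold the response in their own file.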

JSON file

{
        "status": "Succeeded",
        "recognitionResult": {
            "lines": [
                {
                    "boundingBox": [
                        13,
                        83,
                        152,
                        82,
                        153,
                        107,
                        14,
                        108
                    ],
                    "text": "HELLO",
                    "words": [
                        {
                            "boundingBox": [
                                21,
                                80,
                                156,
                                83,
                                149,
                                112,
                                13,
                                109
                            ],
                            "text": "HELLO"
                        }
                    ]
                },
                {
                    "boundingBox": [
                        100,
                        146,
                        174,
                        144,
                        175,
                        166,
                        101,
                        168
                    ],
                    "text": "GUYS",
                    "words": [
                        {
                            "boundingBox": [
                                102,
                                146,
                                177,
                                146,
                                178,
                                169,
                                102,
                                169
                            ],
                            "text": "GUYS"
                        }
                    ]
                }
            ]
        }
    }

Your RootDialog.cs should now look like this:

using System;
using System.Threading.Tasks;
using Microsoft.Bot.Builder.Dialogs;
using Microsoft.Bot.Connector;
using RestSharp;
using System.Web;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http.Headers;
using System.Net.Http;

namespace ComputerVisionBotApplication.Dialogs
{
    [Serializable]
    public class RootDialog : IDialog<object>
    {
        public Task StartAsync(IDialogContext context)
        {
            context.Wait(MessageReceivedAsync);

            return Task.CompletedTask;
        }

        private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
        {
           var activity = await result as Activity;

            // calculate something for us to return
            int length = (activity.Text ?? string.Empty).Length;

            // return our reply to the user
            await context.PostAsync($"You sent {activity.Text} which was {length} characters");

            context.Wait(MessageReceivedAsync);
        }

        private string key = "your_key";
    }

    public class Word
    {
        public List<int> boundingBox { get; set; }
        public string text { get; set; }
    }

    public class Line
    {
        public List<int> boundingBox { get; set; }
        public string text { get; set; }
        public List<Word> words { get; set; }
    }

    public class RecognitionResult
    {
        public List<Line> lines { get; set; }
    }

    public class RootObject
    {
        public string status { get; set; }
        public RecognitionResult recognitionResult { get; set; }
    }
}
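
The note in Step 2 suggests keeping the response DTOs separate from the dialog. As a rough sketch of what that could look like, the classes below map only the fields needed to read the text, and parse the JSON with Json.NET (Newtonsoft.Json, which Bot Builder already depends on). The names OcrLine, OcrResult, OcrResponse and OcrResponseParser are hypothetical and are not part of the original project.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical DTOs for the textOperations response; only the fields used here.
public class OcrLine
{
    [JsonProperty("text")]
    public string Text { get; set; }
}

public class OcrResult
{
    [JsonProperty("lines")]
    public List<OcrLine> Lines { get; set; }
}

public class OcrResponse
{
    [JsonProperty("status")]
    public string Status { get; set; }

    [JsonProperty("recognitionResult")]
    public OcrResult RecognitionResult { get; set; }
}

public static class OcrResponseParser
{
    // Joins the text of every recognized line with a single space.
    // Returns an empty string when the result is missing (for example, still running).
    public static string ExtractText(string json)
    {
        var response = JsonConvert.DeserializeObject<OcrResponse>(json);
        var lines = response?.RecognitionResult?.Lines;
        return lines == null
            ? string.Empty
            : string.Join(" ", lines.Select(l => l.Text));
    }
}
```

For the sample JSON shown in Step 2, ExtractText would return "HELLO GUYS"; properties that are not mapped, such as boundingBox and words, are ignored by Json.NET by default.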

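Step 3. When a messaging app sends a message to the bot, the content can arrive as Text or as Attachments. Attachments is an IList, so it can hold multiple Attachment objects. The most important property of an Attachment is ContentUrl.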

public IList<Attachment> Attachments { get; set; }


public class Attachment : IEquatable<Attachment>
{
    public Attachment();
    public Attachment(string contentType = null, string contentUrl = null, object content = null, string name = null, string thumbnailUrl = null);


    [JsonProperty(PropertyName = "contentType")]
    public string ContentType { get; set; }

    [JsonProperty(PropertyName = "contentUrl")]
    public string ContentUrl { get; set; }

    [JsonProperty(PropertyName = "content")]
    public object Content { get; set; }

    [JsonProperty(PropertyName = "name")]
    public string Name { get; set; }

    [JsonProperty(PropertyName = "thumbnailUrl")]
    public string ThumbnailUrl { get; set; }

    [JsonExtensionData(ReadData = true, WriteData = true)]
    public JObject Properties { get; set; }

    public bool Equals(Attachment other);
    public override bool Equals(object other);
    public override int GetHashCode();
}

Step 4. Because we need the attachment content, we start with a simple check that the message actually carries attachments.

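Note 1: When an image is pasted and sent directly in Skype, the message may have no Text. If your existing code branches on Text, remember to check whether Text is null.
Note 2: The attachment may not be an image, so a real bot needs a few more defensive checks.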

if (activity.Attachments != null && activity.Attachments.Any())
{

}
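
Note 2 above points out that an attachment is not necessarily an image. A minimal sketch of one such defensive check is shown here, based on the attachment's ContentType. The AttachmentChecks class is hypothetical, and the rule it applies (the bare value "image" or anything starting with "image/") is an assumption; channels may report other values, so treat it as a starting point.

```csharp
using System;

public static class AttachmentChecks
{
    // True when the content type looks like an image,
    // e.g. "image/png", "image/jpeg", or the bare value "image".
    public static bool IsImageContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            return false;
        }
        return contentType.Equals("image", StringComparison.OrdinalIgnoreCase)
            || contentType.StartsWith("image/", StringComparison.OrdinalIgnoreCase);
    }
}
```

Inside the dialog, the first usable attachment could then be picked with something like activity.Attachments.FirstOrDefault(a => AttachmentChecks.IsImageContentType(a.ContentType)), replying with a short hint when the result is null.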

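Step 5. The file has to be downloaded from the Skype / Microsoft Teams server before it can be processed, so add the following code inside the MessageReceivedAsync method. It requests a token, uses it to download the image, and reads the result into a byte[].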

string response = string.Empty;
if (activity.Attachments != null && activity.Attachments.Any())
{
    using (HttpClient httpClient = new HttpClient())
    {
        var attachment = activity.Attachments.First();
        if ((activity.ChannelId.Equals("skype", StringComparison.InvariantCultureIgnoreCase) || activity.ChannelId.Equals("msteams", StringComparison.InvariantCultureIgnoreCase))
            && new Uri(attachment.ContentUrl).Host.EndsWith("skype.com"))
        {
            var token = await new MicrosoftAppCredentials().GetTokenAsync();
            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
        }

        var responseMessage = await httpClient.GetAsync(attachment.ContentUrl);
        var data = await responseMessage.Content.ReadAsByteArrayAsync();
    }
}
await context.PostAsync($"{response}");
context.Wait(MessageReceivedAsync);
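
The condition that decides whether to attach the bot's token uses Host.EndsWith("skype.com"), which would also match an unrelated host such as notskype.com. The helper below is a sketch of a slightly stricter check that only accepts skype.com itself or one of its subdomains. The AttachmentAuth class and the NeedsBotToken name are hypothetical, and the URL in the usage note is made up for illustration.

```csharp
using System;

public static class AttachmentAuth
{
    // Decides whether the download request should carry the bot's bearer token.
    public static bool NeedsBotToken(string channelId, string contentUrl)
    {
        var isSkypeOrTeams =
            string.Equals(channelId, "skype", StringComparison.OrdinalIgnoreCase) ||
            string.Equals(channelId, "msteams", StringComparison.OrdinalIgnoreCase);

        Uri uri;
        if (!isSkypeOrTeams || !Uri.TryCreate(contentUrl, UriKind.Absolute, out uri))
        {
            return false;
        }

        // Match skype.com itself or a real subdomain, not "notskype.com".
        var host = uri.Host;
        return host.Equals("skype.com", StringComparison.OrdinalIgnoreCase)
            || host.EndsWith(".skype.com", StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, NeedsBotToken("skype", "https://files.skype.com/attachments/1") is true, while the same call with "https://notskype.com/attachments/1" is false, so the token is never sent to a look-alike host.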

Step 6. Add two private methods, RecognizeText and TextOperations. RecognizeText takes the byte[] data, calls the RecognizeText Web API, and reads the operation URL from the "Operation-Location" response header. TextOperations calls that URL directly and returns the text found in the image.

Note 1: Remember to install RestSharp.
Note 2: Remember to add the header "Ocp-Apim-Subscription-Key" : "key" to each request.

private async Task<string> RecognizeText(byte[] file)
{
    var uri = "/vision/v1.0/recognizeText?handwriting=true";

    var result = string.Empty;
    var client = new RestClient("https://southeastasia.api.cognitive.microsoft.com");
    var request = new RestRequest(uri, Method.POST);
    request.AddHeader("Ocp-Apim-Subscription-Key", key);
    // Send the raw image bytes as the request body.
    request.AddParameter("application/octet-stream", file, ParameterType.RequestBody);
    var response = await client.ExecuteTaskAsync(request);

    // Look the header up by name; its position in the list is not guaranteed.
    var operationLocation = response.Headers.FirstOrDefault(h =>
        string.Equals(h.Name, "Operation-Location", StringComparison.OrdinalIgnoreCase));

    if (response.IsSuccessful && operationLocation != null)
    {
        result = await TextOperations(operationLocation.Value.ToString());
    }
    return result;
}

private async Task<string> TextOperations(string operationsUrl)
{
    var client = new RestClient(operationsUrl);
    var request = new RestRequest("", Method.GET);
    request.AddHeader("Ocp-Apim-Subscription-Key", key);
    var response = await client.ExecuteTaskAsync<RootObject>(request);

    // recognitionResult is only filled in once the operation has succeeded;
    // return an empty string if it is not available yet.
    var lines = response.Data?.recognitionResult?.lines;
    if (lines == null)
    {
        return string.Empty;
    }
    return string.Join(" ", lines.Select(line => line.text));
}
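
TextOperations issues a single GET right after the POST. Because recognizeText runs asynchronously, that first GET may come back with a status such as "Running" and no text yet. One way to handle this, sketched here under assumptions, is to poll until the operation reaches a final status. OperationPoller and WaitForCompletionAsync are hypothetical names, and fetchStatus stands in for whatever code performs the GET and returns the status field.

```csharp
using System;
using System.Threading.Tasks;

public static class OperationPoller
{
    // Calls fetchStatus until it returns "Succeeded" or "Failed",
    // waiting `delay` between attempts, for at most maxAttempts calls.
    // Returns the last status seen.
    public static async Task<string> WaitForCompletionAsync(
        Func<Task<string>> fetchStatus, int maxAttempts, TimeSpan delay)
    {
        var status = string.Empty;
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            status = await fetchStatus();
            if (status == "Succeeded" || status == "Failed")
            {
                break;
            }
            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
            }
        }
        return status;
    }
}
```

The attempt limit keeps the bot from waiting forever on a stuck operation; the caller decides what to tell the user when the returned status is anything other than "Succeeded".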

Step 7. Go back to the MessageReceivedAsync method (the block that downloads the file from Skype / MS Teams) and call RecognizeText(byte[] file), passing in the byte[] you downloaded. Your code should look like this:

string response = string.Empty;
if (activity.Attachments != null && activity.Attachments.Any())
{
    using (HttpClient httpClient = new HttpClient())
    {
        var attachment = activity.Attachments.First();
        if ((activity.ChannelId.Equals("skype", StringComparison.InvariantCultureIgnoreCase) || activity.ChannelId.Equals("msteams", StringComparison.InvariantCultureIgnoreCase))
            && new Uri(attachment.ContentUrl).Host.EndsWith("skype.com"))
        {
            var token = await new MicrosoftAppCredentials().GetTokenAsync();
            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
        }

        var responseMessage = await httpClient.GetAsync(attachment.ContentUrl);
        var data = await responseMessage.Content.ReadAsByteArrayAsync();
        response = await RecognizeText(data);  
    }
}
await context.PostAsync($"{response}");
context.Wait(MessageReceivedAsync);

Step 8. Open the emulator and run a test. Done!