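The previous post introduced the service and outlined what the code has to handle. From it we know that recognizeText and textOperation must be called in turn to obtain the text in an image, as shown in the figure below:
When actually building the image text recognition bot, one extra step is needed: because the image arrives through a messaging app, we must first download it from the messaging app's server as binary data and then send it to the recognizeText Web API for analysis. A few details along the way still need attention, and we will go through them one by one.
Step 1. Create a new project from the Bot Template and open RootDialog.cs. First add the key you recorded after enabling the Computer Vision API in Azure: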
private string key = "your_key";
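Step 2. Take the JSON that Postman returned when we successfully tested the API in the previous post, paste it into json2csharp to generate the corresponding classes, and paste those classes into RootDialog.cs:
Note: This post is a quick walkthrough of the integration. In a real implementation, consider keeping the DTO classes that hold the response in a separate file.
JSON file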
{
  "status": "Succeeded",
  "recognitionResult": {
    "lines": [
      {
        "boundingBox": [13, 83, 152, 82, 153, 107, 14, 108],
        "text": "HELLO",
        "words": [
          {
            "boundingBox": [21, 80, 156, 83, 149, 112, 13, 109],
            "text": "HELLO"
          }
        ]
      },
      {
        "boundingBox": [100, 146, 174, 144, 175, 166, 101, 168],
        "text": "GUYS",
        "words": [
          {
            "boundingBox": [102, 146, 177, 146, 178, 169, 102, 169],
            "text": "GUYS"
          }
        ]
      }
    ]
  }
}
Your RootDialog.cs should now look like this:
using System;
using System.Threading.Tasks;
using Microsoft.Bot.Builder.Dialogs;
using Microsoft.Bot.Connector;
using RestSharp;
using System.Web;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http.Headers;
using System.Net.Http;

namespace ComputerVisionBotApplication.Dialogs
{
    [Serializable]
    public class RootDialog : IDialog<object>
    {
        public Task StartAsync(IDialogContext context)
        {
            context.Wait(MessageReceivedAsync);
            return Task.CompletedTask;
        }

        private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
        {
            var activity = await result as Activity;
            // calculate something for us to return
            int length = (activity.Text ?? string.Empty).Length;
            // return our reply to the user
            await context.PostAsync($"You sent {activity.Text} which was {length} characters");
            context.Wait(MessageReceivedAsync);
        }

        private string key = "your_key";
    }

    public class Word
    {
        public List<int> boundingBox { get; set; }
        public string text { get; set; }
    }

    public class Line
    {
        public List<int> boundingBox { get; set; }
        public string text { get; set; }
        public List<Word> words { get; set; }
    }

    public class RecognitionResult
    {
        public List<Line> lines { get; set; }
    }

    public class RootObject
    {
        public string status { get; set; }
        public RecognitionResult recognitionResult { get; set; }
    }
}
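To see how the generated classes line up with the JSON, here is a minimal sketch that deserializes a trimmed copy of the sample response and joins the text of each line. It assumes System.Text.Json is available; in the bot project itself RestSharp performs the deserialization, so this is only an illustration. The names OcrSample and ExtractText are hypothetical and are not part of the sample project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Same shape as the json2csharp classes; property names match the JSON keys exactly,
// which matters because System.Text.Json is case-sensitive by default.
public class Word { public List<int> boundingBox { get; set; } public string text { get; set; } }
public class Line { public List<int> boundingBox { get; set; } public string text { get; set; } public List<Word> words { get; set; } }
public class RecognitionResult { public List<Line> lines { get; set; } }
public class RootObject { public string status { get; set; } public RecognitionResult recognitionResult { get; set; } }

public static class OcrSample
{
    // Trimmed copy of the sample response (the words arrays are emptied for brevity).
    public const string Json = @"{
      ""status"": ""Succeeded"",
      ""recognitionResult"": { ""lines"": [
        { ""boundingBox"": [13,83,152,82,153,107,14,108], ""text"": ""HELLO"", ""words"": [] },
        { ""boundingBox"": [100,146,174,144,175,166,101,168], ""text"": ""GUYS"", ""words"": [] }
      ] }
    }";

    // Hypothetical helper: returns the recognized lines joined by spaces,
    // or an empty string when the response carries no recognitionResult.
    public static string ExtractText(string json)
    {
        var root = JsonSerializer.Deserialize<RootObject>(json);
        var lines = root?.recognitionResult?.lines;
        return lines == null ? string.Empty : string.Join(" ", lines.Select(l => l.text));
    }
}
```

Calling OcrSample.ExtractText(OcrSample.Json) yields "HELLO GUYS". A response that has only a status field and no recognitionResult yields an empty string instead of throwing.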
Step 3. When a messaging app passes information to the bot, the message can carry two kinds of data: Text and Attachments. Attachments is an IList, so it can hold multiple Attachment objects. For our purposes the most important member of Attachment is ContentUrl.
// Declared on Activity
public IList<Attachment> Attachments { get; set; }

// Attachment metadata
public class Attachment : IEquatable<Attachment>
{
    public Attachment();
    public Attachment(string contentType = null, string contentUrl = null, object content = null, string name = null, string thumbnailUrl = null);

    [JsonProperty(PropertyName = "contentType")]
    public string ContentType { get; set; }
    [JsonProperty(PropertyName = "contentUrl")]
    public string ContentUrl { get; set; }
    [JsonProperty(PropertyName = "content")]
    public object Content { get; set; }
    [JsonProperty(PropertyName = "name")]
    public string Name { get; set; }
    [JsonProperty(PropertyName = "thumbnailUrl")]
    public string ThumbnailUrl { get; set; }
    [JsonExtensionData(ReadData = true, WriteData = true)]
    public JObject Properties { get; set; }

    public bool Equals(Attachment other);
    public override bool Equals(object other);
    public override int GetHashCode();
}
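Step 4. Because we need the attachment, we first do a simple check that the message actually carries one.
Note 1: When an image is pasted directly into Skype and sent, the message may have no Text. If your existing code branches on Text, remember to check whether Text is null.
Note 2: The attachment may not be an image, so a real bot needs more defensive checks.
// Any() also covers an empty list, which a null check alone would let through.
if (activity.Attachments != null && activity.Attachments.Any())
{
}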
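Note 2 can be made concrete with a small content-type check. The sketch below assumes the channel fills in ContentType with a MIME type such as "image/png"; some channels may report something more generic, so treat it as a starting point. AttachmentChecks and IsImage are hypothetical names used only for illustration.

```csharp
using System;

public static class AttachmentChecks
{
    // Hypothetical helper: true when a content type such as "image/png" denotes an image.
    // A null or empty content type is treated as "not an image".
    public static bool IsImage(string contentType)
    {
        return !string.IsNullOrEmpty(contentType)
            && contentType.StartsWith("image/", StringComparison.OrdinalIgnoreCase);
    }
}
```

Inside the bot it could be used as activity.Attachments.FirstOrDefault(a => AttachmentChecks.IsImage(a.ContentType)), replying with a short hint when the result is null.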
Step 5. Because the file has to be downloaded from the Skype / Microsoft Teams server before it can be processed, add the following code to the MessageReceivedAsync method. It uses a token to call the download URL and converts the image into a byte[].
string response = string.Empty;
// Any() guards against an empty list, where First() would throw.
if (activity.Attachments != null && activity.Attachments.Any())
{
    using (HttpClient httpClient = new HttpClient())
    {
        var attachment = activity.Attachments.First();
        // Attachments hosted by Skype / Teams are downloaded with the bot's token.
        if ((activity.ChannelId.Equals("skype", StringComparison.InvariantCultureIgnoreCase) || activity.ChannelId.Equals("msteams", StringComparison.InvariantCultureIgnoreCase))
            && new Uri(attachment.ContentUrl).Host.EndsWith("skype.com"))
        {
            var token = await new MicrosoftAppCredentials().GetTokenAsync();
            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
        }
        // Download the attachment and read it as raw bytes.
        var responseMessage = await httpClient.GetAsync(attachment.ContentUrl);
        var data = await responseMessage.Content.ReadAsByteArrayAsync();
    }
}
await context.PostAsync($"{response}");
context.Wait(MessageReceivedAsync);
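The condition that decides whether the token is attached is easier to read, and to test, when pulled out into a pure function. The following is a sketch of the same rule used in the code above, not a replacement for it. AttachmentAuth and NeedsBearerToken are hypothetical names, and the URLs in the usage note are made up for illustration.

```csharp
using System;

public static class AttachmentAuth
{
    // Hypothetical helper mirroring the check above: the token is attached only
    // for the skype / msteams channels when the attachment host ends with skype.com.
    public static bool NeedsBearerToken(string channelId, string contentUrl)
    {
        var isSkypeOrTeams =
            string.Equals(channelId, "skype", StringComparison.OrdinalIgnoreCase) ||
            string.Equals(channelId, "msteams", StringComparison.OrdinalIgnoreCase);
        if (!isSkypeOrTeams)
        {
            return false;
        }

        // A malformed or relative URL never gets the token.
        Uri uri;
        if (!Uri.TryCreate(contentUrl, UriKind.Absolute, out uri))
        {
            return false;
        }

        return uri.Host.EndsWith("skype.com", StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, NeedsBearerToken("emulator", "http://localhost:1234/a.png") is false, so an image sent from the emulator is downloaded without the Authorization header, while NeedsBearerToken("skype", "https://example.skype.com/a.png") is true.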
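Step 6. Add two private methods, RecognizeText and TextOperations. RecognizeText takes the byte[] data, calls the RecognizeText Web API, and reads the URL from the "Operation-Location" response header. TextOperations calls the URL obtained by RecognizeText directly and returns the text recognized in the image.
Note 1: Remember to install RestSharp.
Note 2: Remember to add the header "Ocp-Apim-Subscription-Key": "key".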
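private async Task<string> RecognizeText(byte[] file)
{
    var uri = "/vision/v1.0/recognizeText?handwriting=true";
    var result = string.Empty;
    var client = new RestClient("https://southeastasia.api.cognitive.microsoft.com");
    var request = new RestRequest(uri, Method.POST);
    request.AddHeader("Ocp-Apim-Subscription-Key", key);
    // Send the raw image bytes as the request body.
    request.AddParameter("application/octet-stream", file, ParameterType.RequestBody);
    var response = await client.ExecuteTaskAsync(request);
    // Look the header up by name; its position in the list is not guaranteed.
    var operationLocation = response.Headers.FirstOrDefault(
        h => string.Equals(h.Name, "Operation-Location", StringComparison.OrdinalIgnoreCase));
    if (response.IsSuccessful && operationLocation != null)
    {
        result = await TextOperations(operationLocation.Value.ToString());
    }
    return result;
}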
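private async Task<string> TextOperations(string operationsUrl)
{
    var client = new RestClient(operationsUrl);
    // Recognition runs asynchronously, so the first call may not have a result yet.
    // Poll a few times; 10 attempts, 1 second apart, are arbitrary values chosen here.
    for (var attempt = 0; attempt < 10; attempt++)
    {
        var request = new RestRequest("", Method.GET);
        request.AddHeader("Ocp-Apim-Subscription-Key", key);
        var response = await client.ExecuteTaskAsync<RootObject>(request);
        var data = response.Data;
        if (data?.recognitionResult?.lines != null)
        {
            // Join the text of every recognized line with a single space.
            return string.Join(" ", data.recognitionResult.lines.Select(line => line.text));
        }
        if (data == null || data.status == "Failed")
        {
            break;
        }
        await Task.Delay(1000);
    }
    return string.Empty;
}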
Step 7. Go back to the MessageReceivedAsync method (the block where we download data from Skype / MS Teams) and call RecognizeText(byte[] file), passing in the byte[] you downloaded. Your code should look like this:
string response = string.Empty;
// Any() guards against an empty list, where First() would throw.
if (activity.Attachments != null && activity.Attachments.Any())
{
    using (HttpClient httpClient = new HttpClient())
    {
        var attachment = activity.Attachments.First();
        // Attachments hosted by Skype / Teams are downloaded with the bot's token.
        if ((activity.ChannelId.Equals("skype", StringComparison.InvariantCultureIgnoreCase) || activity.ChannelId.Equals("msteams", StringComparison.InvariantCultureIgnoreCase))
            && new Uri(attachment.ContentUrl).Host.EndsWith("skype.com"))
        {
            var token = await new MicrosoftAppCredentials().GetTokenAsync();
            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
        }
        // Download the attachment and read it as raw bytes.
        var responseMessage = await httpClient.GetAsync(attachment.ContentUrl);
        var data = await responseMessage.Content.ReadAsByteArrayAsync();
        // Send the image bytes to the Computer Vision API and keep the recognized text.
        response = await RecognizeText(data);
    }
}
await context.PostAsync($"{response}");
context.Wait(MessageReceivedAsync);
Step 8. Open the emulator and test it. Done!
https://github.com/matsurigoto/ComputerVisionBotExample.git
Phew... writing the code took quite a while.