iT邦幫忙

12th iThome Ironman Contest

DAY 4

Finally, we move on to the second lesson of Machine Learning APIs!

I know the previous lessons were kind of boring.

Now we will jump into more interesting, practical stuff.

We will start with the Cloud Vision API's text detection method to make use of Optical Character Recognition (OCR) to extract text from images.

Then we will learn how to translate that text with the Translation API and analyze it with the Natural Language API.

Sounds fun and strange, right?!

No more waiting. Let's start!


  1. Open Google Cloud Platform (follow the steps in A Tour of Qwiklabs and Google Cloud)

  2. Activate Cloud Shell
    Cloud Shell is a virtual machine that is loaded with development tools.
    When you are connected, you are already authenticated, and the project is set to your PROJECTID.
    https://ithelp.ithome.com.tw/upload/images/20200917/20130054uts7W3OLz7.png

  3. Create an API Key
    Create an API key under APIs & Services > Credentials in Google Cloud Platform, then copy it.
    Run the following in Cloud Shell, replacing <YOUR_API_KEY> with the key you just copied.

export API_KEY=<YOUR_API_KEY>
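
Every later call depends on this variable, so it can be handy to check it before going on. Here is a minimal Python sketch of such a check. The function name load_api_key is made up for illustration, and treating a value that starts with "<" as an unreplaced placeholder is just an assumption of this sketch.

```python
import os


def load_api_key(env=os.environ):
    """Return the key stored in API_KEY, or raise if it looks unset."""
    key = env.get("API_KEY", "").strip()
    # Assumption: a value starting with "<" is the unreplaced placeholder.
    if not key or key.startswith("<"):
        raise RuntimeError("API_KEY is not set; run 'export API_KEY=...' first")
    return key
```

Calling load_api_key() at the start of a script fails early instead of sending a request with an empty key.
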

  4. Create a Cloud Storage bucket
    Go to Storage in Google Cloud Platform, create a bucket, and give it a globally unique name.

  5. Upload an image to your bucket
    We use this image as a sample. Upload it as sign.jpg so the name matches the request in the next step.
    https://ithelp.ithome.com.tw/upload/images/20200917/20130054K2msZNNBFJ.jpg

Once the image is uploaded, set its permission to public. The file should now show as having public access.

  6. Create your Vision API request
    Create an ocr-request.json file and add the following code, replacing my-bucket-name with the name of your bucket:
{
  "requests": [
      {
        "image": {
          "source": {
              "gcsImageUri": "gs://my-bucket-name/sign.jpg"
          }
        },
        "features": [
          {
            "type": "TEXT_DETECTION",
            "maxResults": 10
          }
        ]
      }
  ]
}

We're going to use the TEXT_DETECTION feature of the Vision API. This will run optical character recognition (OCR) on the image to extract text.
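
If you would rather generate the request than type it by hand, something like the following Python sketch could work. It builds the same JSON structure as above. The helper names build_ocr_request and write_json are hypothetical, and the object name defaults to sign.jpg only because that is what the sample request uses.

```python
import json


def build_ocr_request(bucket, object_name="sign.jpg", max_results=10):
    """Build the body of a TEXT_DETECTION request for one image in a bucket."""
    return {
        "requests": [
            {
                "image": {
                    "source": {"gcsImageUri": f"gs://{bucket}/{object_name}"}
                },
                "features": [
                    {"type": "TEXT_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


def write_json(path, payload):
    """Write a payload to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2)
```

For example, write_json("ocr-request.json", build_ocr_request("my-bucket-name")) should produce a file equivalent to the one above.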

  7. Call the Vision API's text detection method
    Call the Vision API with curl:
curl -s -X POST -H "Content-Type: application/json" --data-binary @ocr-request.json "https://vision.googleapis.com/v1/images:annotate?key=${API_KEY}"
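
For readers who prefer Python to curl, here is a rough equivalent using only the standard library. It is a sketch, not the official client library: build_post and send are names invented for this example, and the endpoint URL is the same one the curl command uses.

```python
import json
import urllib.parse
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_post(url, api_key, payload):
    """Prepare a JSON POST request with the API key as a query parameter."""
    query = urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        f"{url}?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

The same two helpers should also work for the Translation and Natural Language endpoints used later, since those calls differ only in URL and request body.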

The first part of the response should look like the following:
https://ithelp.ithome.com.tw/upload/images/20200917/20130054fFDsDbDRj1.png

The first element of textAnnotations is the entire block of text the API found in the image. It includes the language code (locale, in this case fr for French), a string of the text (description), and a bounding box (boundingPoly) indicating where the text was found in our image.

Run the following curl command to save the response to an ocr-response.json file so it can be referenced later:

curl -s -X POST -H "Content-Type: application/json" --data-binary @ocr-request.json "https://vision.googleapis.com/v1/images:annotate?key=${API_KEY}" -o ocr-response.json
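
Once the response is saved, the fields described above can be pulled out with a few lines of Python. This is only a sketch: the function names are hypothetical, and treating a missing x or y coordinate as 0 is an assumption made here so the code does not fail on partial vertices.

```python
import json


def first_text_annotation(response):
    """Summarize the first textAnnotations entry of a Vision API response."""
    annotations = response["responses"][0].get("textAnnotations", [])
    if not annotations:
        return None  # no text was detected in the image
    first = annotations[0]
    vertices = first.get("boundingPoly", {}).get("vertices", [])
    return {
        "locale": first.get("locale"),
        "text": first["description"],
        # Assumption: a coordinate missing from a vertex is treated as 0.
        "box": [(v.get("x", 0), v.get("y", 0)) for v in vertices],
    }


def load_json(path):
    """Read a JSON file such as ocr-response.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

For example, first_text_annotation(load_json("ocr-response.json")) returns the language code, the full text, and the corner points of the bounding box.
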

  8. Send text from the image to the Translation API
    Create a translation-request.json file and add the following code. The q field holds the string to translate, and target is the language to translate into:
{
  "q": "your_text_here",
  "target": "en"
}

Extract the image text from the previous step and copy it into translation-request.json:

STR=$(jq '.responses[0].textAnnotations[0].description' ocr-response.json) && STR="${STR//\"}" && sed -i "s|your_text_here|$STR|g" translation-request.json
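
The sed substitution above can misbehave if the detected text happens to contain characters such as | or &, which sed treats specially. As an alternative, here is a small Python sketch that builds the request with a JSON library so escaping is handled for you. The function names are made up for this example, and unlike the command above it keeps any double quotes in the text.

```python
import json


def build_translation_request(ocr_response, target="en"):
    """Build a Translation API request body from a Vision API response."""
    text = ocr_response["responses"][0]["textAnnotations"][0]["description"]
    return {"q": text, "target": target}


def save_json(path, payload):
    """Write the payload as JSON, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False, indent=2)
```

After loading ocr-response.json with json.load, save_json("translation-request.json", build_translation_request(ocr)) writes the request file.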

Call the Translation API and save the response to a translation-response.json file:

curl -s -X POST -H "Content-Type: application/json" --data-binary @translation-request.json "https://translation.googleapis.com/language/translate/v2?key=${API_KEY}" -o translation-response.json

Inspect the file with the Translation API response:
https://ithelp.ithome.com.tw/upload/images/20200917/201300541RjWuxvDLq.png

translatedText contains the resulting translation, and detectedSourceLanguage is fr, the ISO language code for French.
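
To read those two fields programmatically, a sketch like the one below may help. The function name read_translations is hypothetical. It also unescapes HTML entities such as &#39;, on the assumption that the request did not ask for plain-text output and the translated string may therefore contain them.

```python
import html


def read_translations(response):
    """Return (translated text, detected source language) pairs."""
    results = []
    for item in response["data"]["translations"]:
        # Assumption: the text may contain HTML entities; decode them.
        text = html.unescape(item["translatedText"])
        results.append((text, item.get("detectedSourceLanguage")))
    return results
```

Pass it the parsed contents of translation-response.json to get a list with one pair per translated string.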

  9. Analyze the image's text with the Natural Language API
    The Natural Language API helps us understand text by extracting entities, analyzing sentiment and syntax, and classifying text into categories. Use the analyzeEntities method to see what entities the Natural Language API can find in the text from your image.

To set up the API request, create an nl-request.json file with the following. The type field supports PLAIN_TEXT or HTML, content is the text to send to the Natural Language API for analysis, and encodingType tells the API which type of text encoding to use when processing the text:

{
  "document": {
    "type": "PLAIN_TEXT",
    "content": "your_text_here"
  },
  "encodingType": "UTF8"
}

Copy the translated text into the content block of the Natural Language API request:

STR=$(jq '.data.translations[0].translatedText' translation-response.json) && STR="${STR//\"}" && sed -i "s|your_text_here|$STR|g" nl-request.json

The nl-request.json file now contains the translated English text from the original image.

Call the analyzeEntities endpoint of the Natural Language API with this curl request:

curl "https://language.googleapis.com/v1/documents:analyzeEntities?key=${API_KEY}" \
  -s -X POST -H "Content-Type: application/json" --data-binary @nl-request.json

In the response, you can see the entities the Natural Language API found:
https://ithelp.ithome.com.tw/upload/images/20200917/2013005492I07JyE1t.png

For entities that have a Wikipedia page, the API provides metadata including the URL of that page along with the entity's mid. The mid is an ID that maps to this entity in Google's Knowledge Graph.

For all entities, the Natural Language API tells us the places it appeared in the text (mentions), the type of entity, and salience (a [0,1] range indicating how important the entity is to the text as a whole).
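
To make the response easier to scan, the entities can be flattened and sorted by salience. The following is a minimal Python sketch that assumes the response has been parsed into a dict. The function name summarize_entities and the output keys are invented for this example.

```python
def summarize_entities(response):
    """Flatten analyzeEntities output, most salient entity first."""
    rows = []
    for entity in response.get("entities", []):
        metadata = entity.get("metadata", {})
        rows.append(
            {
                "name": entity["name"],
                "type": entity["type"],
                "salience": entity.get("salience", 0.0),
                # mid and wikipedia_url are only present for some entities.
                "mid": metadata.get("mid"),
                "wikipedia_url": metadata.get("wikipedia_url"),
                "mentions": [
                    m["text"]["content"] for m in entity.get("mentions", [])
                ],
            }
        )
    return sorted(rows, key=lambda row: row["salience"], reverse=True)
```

Entities without a Wikipedia page simply get None for mid and wikipedia_url.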


Ha, another long article!

But I think it is more interesting than the previous one, since there is more hands-on practice this time.

Hope you have a fun time!

