Kotlin Android Day 29, From 0 to ML - TensorFlow Lite - Artistic Style Transfer

13th Ironman Contest

kevin_chiu

Team: Kotlin 愛台灣 2021

2021-10-04 02:41:52

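Introduction:

(Wikipedia) Neural style transfer (NST) refers to a class of software algorithms that manipulate digital images or videos so that they adopt the appearance or visual style of another image.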

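Overview

The artistic style transfer model consists of two sub-models:
Style prediction model: a MobileNetV2-based neural network that converts an input style image into a 100-dimensional style bottleneck vector.
Style transform model: a neural network that applies the style bottleneck vector to a content image and produces the stylized image.
If your app only needs to support a fixed set of style images, you can compute their style bottleneck vectors ahead of time and leave the style prediction model out of the app binary.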
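The idea of computing each bottleneck vector only once can be sketched in plain Kotlin as follows. This is a minimal sketch, not code from the sample app: `StyleBottleneckCache` and its `predict` parameter are hypothetical names, and `predict` stands in for whatever actually runs the style prediction model (or reads a vector that was computed offline).

```kotlin
// Hypothetical helper: keeps one 100-dimensional bottleneck vector per style image,
// so the style prediction step runs at most once per style.
class StyleBottleneckCache(private val predict: (String) -> FloatArray) {
    private val bottleneckSize = 100 // dimension stated in the overview above
    private val cache = HashMap<String, FloatArray>()

    // Returns the cached vector, calling predict() only on the first request.
    fun bottleneckFor(styleImageName: String): FloatArray =
        cache.getOrPut(styleImageName) {
            val vector = predict(styleImageName)
            require(vector.size == bottleneckSize) {
                "expected $bottleneckSize values, got ${vector.size}"
            }
            vector
        }
}
```

With a cache like this, repeated requests for the same style (for example, on every frame of a video stream) reuse the stored vector instead of running inference again.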

Add the TensorFlow Lite models to the assets folder

//cpu
"style_predict_quantized_256.tflite"
"style_transfer_quantized_384.tflite"
//gpu
"style_predict_f16_256.tflite"
"style_transfer_f16_384.tflite"
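One way to pick between the two pairs of files at runtime is sketched below. `ModelFiles` and `modelFilesFor` are hypothetical names introduced for illustration; only the four file names come from the list above.

```kotlin
// Hypothetical holder for the two model files that make up one pipeline.
data class ModelFiles(val predict: String, val transfer: String)

// Returns the float16 models for the GPU and the quantized models for the CPU.
fun modelFilesFor(useGpu: Boolean): ModelFiles =
    if (useGpu) {
        ModelFiles("style_predict_f16_256.tflite", "style_transfer_f16_384.tflite")
    } else {
        ModelFiles("style_predict_quantized_256.tflite", "style_transfer_quantized_384.tflite")
    }
```

Keeping both names in one value makes it harder to accidentally mix a quantized prediction model with a float16 transform model.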

build.gradle(app)

dependencies {
   implementation 'org.tensorflow:tensorflow-lite:2.5.0'
   implementation 'org.tensorflow:tensorflow-lite-gpu:2.5.0'
 }

Pass the content photo and the style image to the model executor for conversion

class StyleTransferModelExecutor(
…
fun execute(
contentImagePath: String,
styleImageName: String,
context: Context
  ): ModelExecutionResult {
try {
  Log.i(TAG, "running models")

  fullExecutionTime = SystemClock.uptimeMillis()
  preProcessTime = SystemClock.uptimeMillis()

  val contentImage = ImageUtils.decodeBitmap(File(contentImagePath))
  val contentArray =
    ImageUtils.bitmapToByteBuffer(contentImage, CONTENT_IMAGE_SIZE, CONTENT_IMAGE_SIZE)
  val styleBitmap =
    ImageUtils.loadBitmapFromResources(context, "thumbnails/$styleImageName")
  val input = ImageUtils.bitmapToByteBuffer(styleBitmap, STYLE_IMAGE_SIZE, STYLE_IMAGE_SIZE)

  val inputsForPredict = arrayOf<Any>(input)
  val outputsForPredict = HashMap<Int, Any>()
  val styleBottleneck = Array(1) { Array(1) { Array(1) { FloatArray(BOTTLENECK_SIZE) } } }
  outputsForPredict[0] = styleBottleneck
  preProcessTime = SystemClock.uptimeMillis() - preProcessTime

  stylePredictTime = SystemClock.uptimeMillis()
  // The results of this inference could be reused given the style does not change
  // That would be a good practice in case this was applied to a video stream.

  // Convert the input style image into a 100-dimensional style bottleneck vector.
  interpreterPredict.runForMultipleInputsOutputs(inputsForPredict, outputsForPredict)
  stylePredictTime = SystemClock.uptimeMillis() - stylePredictTime
  Log.d(TAG, "Style Predict Time to run: $stylePredictTime")

  val inputsForStyleTransfer = arrayOf(contentArray, styleBottleneck)
  val outputsForStyleTransfer = HashMap<Int, Any>()
  val outputImage =
    Array(1) { Array(CONTENT_IMAGE_SIZE) { Array(CONTENT_IMAGE_SIZE) { FloatArray(3) } } }
  outputsForStyleTransfer[0] = outputImage

  styleTransferTime = SystemClock.uptimeMillis()

  // Apply the predicted style bottleneck vector to the content photo.
  interpreterTransform.runForMultipleInputsOutputs(
    inputsForStyleTransfer,
    outputsForStyleTransfer
  )
  styleTransferTime = SystemClock.uptimeMillis() - styleTransferTime
  Log.d(TAG, "Style apply Time to run: $styleTransferTime")

  postProcessTime = SystemClock.uptimeMillis()
  val styledImage =
    ImageUtils.convertArrayToBitmap(outputImage, CONTENT_IMAGE_SIZE, CONTENT_IMAGE_SIZE)
  postProcessTime = SystemClock.uptimeMillis() - postProcessTime

  fullExecutionTime = SystemClock.uptimeMillis() - fullExecutionTime
  Log.d(TAG, "Time to run everything: $fullExecutionTime")

  // Return the stylized image together with the timing results.
  return ModelExecutionResult(
    styledImage,
    preProcessTime,
    stylePredictTime,
    styleTransferTime,
    postProcessTime,
    fullExecutionTime,
    formatExecutionLog()
  )
} catch (e: Exception) {
  val exceptionLog = "something went wrong: ${e.message}"
  Log.e(TAG, exceptionLog, e)

  val emptyBitmap =
    ImageUtils.createEmptyBitmap(
      CONTENT_IMAGE_SIZE,
      CONTENT_IMAGE_SIZE
    )
  return ModelExecutionResult(
    // e.message can be null, so fall back to the log text instead of forcing it with !!
    emptyBitmap, errorMessage = e.message ?: exceptionLog
  )
 }
}
…..
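The executor relies on `ImageUtils.bitmapToByteBuffer` and `ImageUtils.convertArrayToBitmap`, whose bodies are not shown here. The sketch below illustrates what such conversions typically do, using plain ARGB `Int` pixels instead of an Android `Bitmap` so that it runs without the Android SDK. The function names and the scaling of channels to the range [0, 1] are assumptions, not the actual `ImageUtils` implementation.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Assumed input layout: packs ARGB pixels into a float32 buffer in R, G, B order,
// scaling each channel from 0..255 to 0..1. The alpha channel is dropped.
fun pixelsToFloatBuffer(pixels: IntArray): ByteBuffer {
    val buffer = ByteBuffer.allocateDirect(pixels.size * 3 * 4)
        .order(ByteOrder.nativeOrder())
    for (p in pixels) {
        buffer.putFloat(((p shr 16) and 0xFF) / 255.0f) // red
        buffer.putFloat(((p shr 8) and 0xFF) / 255.0f)  // green
        buffer.putFloat((p and 0xFF) / 255.0f)          // blue
    }
    buffer.rewind()
    return buffer
}

// Assumed output layout: turns a [1][height][width][3] float array with values
// in 0..1 back into opaque ARGB pixels in row-major order.
fun outputToPixels(output: Array<Array<Array<FloatArray>>>): IntArray {
    val rows = output[0]
    val height = rows.size
    val width = rows[0].size
    val pixels = IntArray(height * width)
    for (y in 0 until height) {
        for (x in 0 until width) {
            val rgb = rows[y][x]
            val r = (rgb[0].coerceIn(0f, 1f) * 255f).toInt()
            val g = (rgb[1].coerceIn(0f, 1f) * 255f).toInt()
            val b = (rgb[2].coerceIn(0f, 1f) * 255f).toInt()
            pixels[y * width + x] = (0xFF shl 24) or (r shl 16) or (g shl 8) or b
        }
    }
    return pixels
}
```

On Android, the pixel array would come from `Bitmap.getPixels` on a bitmap already scaled to the model's input size, and the result of `outputToPixels` could be written back with `Bitmap.setPixels`.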

Result: