[Day 21] Function Calling in Practice 3: Combining GPT with 3D Animation

2023 iThome Ironman Contest

DAY 21

Mobile Development

Ionic with ChatGPT - Building an AI English Speaking Tutor App in 30 Days, Part 21 of the series

15th Ironman Contest, ionic, angular, chatgpt, function calling

momochenisme

2023-09-21 09:04:28

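Yesterday we used Function Calling to make the GPT-3.5 model reply in a fixed format that includes the "tone", the "tone emphasis", and the "fields to emphasize". Today is the third day of Function Calling in practice: we will use the tone generated by the GPT model to drive a 3D animation, creating a more dynamic and immersive user experience.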

Managing the tone state

In status.service.ts, we use styleStatusSubject$ to track and manage the tone state, following the same pattern we used earlier to track the other states:

import { BehaviorSubject, Observable } from 'rxjs';
// ...
// Audio playback status
private playingStatusSubject$ = new BehaviorSubject<boolean>(false);
// Currently stored audio file
private currentAudioSubject$ = new BehaviorSubject<Blob | undefined>(undefined);
// Tone status
private styleStatusSubject$ = new BehaviorSubject<AIStyle>('friendly');
// ...
get styleStatus$(): Observable<AIStyle> {
  return this.styleStatusSubject$.asObservable();
}
// ...

Next, add a setStyleStatus() method that lets other components and services change or update the tone state:

public setStyleStatus(aiStyle: AIStyle) {
  this.styleStatusSubject$.next(aiStyle);
}
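To make the state-holding behaviour concrete, here is a minimal sketch of what the tone state relies on: the latest value is stored and replayed to every new subscriber. It does not use RxJS; LatestValueStore is a hypothetical, dependency-free stand-in written only for illustration, and the AIStyle union below is an assumption based on the three tones named in this article.

```typescript
// Assumption: the three tones named in the article; the real AIStyle type may differ.
type AIStyle = 'friendly' | 'excited' | 'cheerful';

type Listener<T> = (value: T) => void;

// Hypothetical stand-in that mimics how an RxJS BehaviorSubject keeps the
// latest value and hands it to each new subscriber right away.
class LatestValueStore<T> {
  private listeners: Listener<T>[] = [];

  constructor(private current: T) {}

  // Deliver the stored value immediately, then every later update.
  // Returns a function that removes the listener again.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.push(listener);
    listener(this.current);
    return () => {
      this.listeners = this.listeners.filter(item => item !== listener);
    };
  }

  // Store the new value and notify all current listeners.
  next(value: T): void {
    this.current = value;
    this.listeners.forEach(listener => listener(value));
  }
}

const styleStore = new LatestValueStore<AIStyle>('friendly');
const received: AIStyle[] = [];

const unsubscribe = styleStore.subscribe(style => {
  received.push(style);
});
styleStore.next('cheerful');
unsubscribe();
styleStore.next('excited'); // not recorded: the listener was removed

console.log(received); // holds 'friendly', then 'cheerful'
```

The point to notice is that the subscriber receives 'friendly' before any update is pushed, which is why a component that subscribes late still knows the current tone.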

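To switch the 3D animation in sync with audio playback, I created a textToSpeech() method in home.page.ts on the Home page that calls the Speech Service on its own. A map operator makes sure the tone is returned together with the audio, and the subscribe callback then updates the tone state through setStyleStatus():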

OnGetRecordingBase64Text(recordingBase64Data: RecordingData) {
  const requestData: AudioConvertRequestModel = {
    aacBase64Data: recordingBase64Data.value.recordDataBase64
  };
  // Start the loading indicator
  this.statusService.startLoading();
  // Audio Convert API
  this.http.post<AudioConvertResponseModel>('<your Web App URL>/AudioConvert/aac2m4a', requestData).pipe(
    // Whisper API
    switchMap(audioAPIResult => this.openAIService.whisperAPI(audioAPIResult.m4aBase64Data)),
    // Chat API
    switchMap(whisperAPIResult => this.openAIService.chatAPI(whisperAPIResult.text)),
    // Speech Service API
    switchMap(chatResult => this.textToSpeech(chatResult)),
    finalize(() => {
      // Stop the loading indicator
      this.statusService.stopLoading();
    })
  ).subscribe(result => {
    // Tone of the current GPT reply
    this.statusService.setStyleStatus(result.gptStyle);
    // Play the audio
    this.statusService.playAudio(result.audioFile);
  });
}

private textToSpeech(conversationData: ConversationDataModel) {
  return this.speechService.textToSpeech(conversationData).pipe(
    map(audioFileResult => ({
      audioFile: audioFileResult,
      gptStyle: conversationData.gptResponseTextStyle
    }))
  );
}

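Adjusting the Robot3D component

Next, I made a few changes to the Robot3D component created on [Day 5]. First, in robot3d.component.ts, add an animationList array that stores the available 3D animations. Then add a convertAnimationName() method that maps each 3D animation name to one of our three tones. Because the animations were downloaded from the internet, we have to pair tones and animations ourselves: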

// Animation list
private animationList: {
  name: AIStyle;
  animationAction: THREE.AnimationAction;
}[] = [];
// ...
private convertAnimationName(animationName: string): AIStyle {
  if (animationName === 'IDLE') {
    return 'friendly';
  } else if (animationName === 'ATTACK') {
    return 'excited';
  } else if (animationName === 'RUN') {
    return 'cheerful';
  }
  return '';
}
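As an alternative way to express the same pairing, the if/else chain can be written as a lookup table. This is only a sketch: CLIP_TO_STYLE and styleForClip are hypothetical names, the AIStyle union is an assumption based on the three tones in this article, and returning undefined for an unknown clip is a design choice of this sketch rather than what the method above does.

```typescript
// Assumption: the three tones named in the article; the real AIStyle type may differ.
type AIStyle = 'friendly' | 'excited' | 'cheerful';

// Clip names come from the downloaded model; the pairing is a manual choice.
const CLIP_TO_STYLE = new Map<string, AIStyle>([
  ['IDLE', 'friendly'],
  ['ATTACK', 'excited'],
  ['RUN', 'cheerful'],
]);

// Returns the tone paired with a clip, or undefined when the clip has no tone.
function styleForClip(clipName: string): AIStyle | undefined {
  return CLIP_TO_STYLE.get(clipName);
}

// 'JUMP' is a made-up clip name used to show the unmatched case.
const clipNames = ['IDLE', 'ATTACK', 'RUN', 'JUMP'];
const matchedClips = clipNames.filter(clipName => styleForClip(clipName) !== undefined);

console.log(matchedClips); // the three clips that have a tone
```

With a table, adding a new animation is a one-line change, and clips without a tone can be skipped explicitly instead of being stored under an empty name.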

When createGLTF3DModel() runs, it reads every animation in the 3D model and stores them, in order, in the animationList array created above:

private createGLTF3DModel() {
  this.gltfLoader = new GLTFLoader();
  // Load the 3D model with the GLTF loader
  this.gltfLoader.load('assets/robot3DModel/scene.gltf',
    (gltf: GLTF) => {
      // Set the model's position
      gltf.scene.position.set(1.5, -5, 0);
      // Set the model's rotation
      gltf.scene.rotation.y = Math.PI;
      // Add the model to the scene
      this.scene.add(gltf.scene);
      // Manage the model's animations
      this.mixer = new THREE.AnimationMixer(gltf.scene);
      gltf.animations.forEach((clip: THREE.AnimationClip) => {
        const action = this.mixer.clipAction(clip);
        // Store every animation
        this.animationList.push({
          name: this.convertAnimationName(clip.name),
          animationAction: action
        });
        // Start by playing the animation named "IDLE"
        if (clip.name === 'IDLE') {
          this.animationAction = action;
          this.animationAction.play();
        }
      });
    },
    (xhr) => {
      console.log((xhr.loaded / xhr.total * 100) + '% loaded');
    },
    (error) => {
      console.error(error);
    }
  );
}

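Finally, we subscribe to styleStatus$ and, based on its value, pick the matching animation from animationList and start it. Before starting a new animation, be sure to stop the one currently playing so that the transition works correctly. Also remember to unsubscribe manually when the component is destroyed, to prevent a memory leak: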

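// Used to tear down the subscription
private destroy$ = new Subject<void>();

constructor(private statusService: StatusService) { }

ngOnInit(): void {
  // Subscribe to the tone state
  this.statusService.styleStatus$.pipe(
    takeUntil(this.destroy$)
  ).subscribe(styleStatusResult => {
    // Look up the matching animation in the animation list
    const nextAnimation = this.animationList.find(item => item.name === styleStatusResult);
    // Do nothing until the model has loaded, or when no animation matches
    if (this.animationAction && nextAnimation) {
      // Stop the current animation
      this.animationAction.stop();
      this.animationAction = nextAnimation.animationAction;
      // Play the new animation
      this.animationAction.play();
    }
  });
}

ngOnDestroy(): void {
  // Unsubscribe
  this.destroy$.next();
  this.destroy$.complete();
}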
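The switching rule itself (find the animation for the new tone, stop the current one, play the new one) can be isolated from Angular and three.js so it is easy to reason about. The sketch below is an assumption-laden illustration: ActionLike, AnimationEntry, and switchAnimation are hypothetical names, ActionLike only mirrors the play() and stop() methods of THREE.AnimationAction, and the fake actions simply record what was called.

```typescript
// Assumption: the three tones named in the article; the real AIStyle type may differ.
type AIStyle = 'friendly' | 'excited' | 'cheerful';

// Minimal shape shared with THREE.AnimationAction (play and stop only).
interface ActionLike {
  play(): unknown;
  stop(): unknown;
}

interface AnimationEntry<A extends ActionLike> {
  name: AIStyle;
  animationAction: A;
}

// Returns the action that should be considered current after the switch.
// When no animation matches the tone, the current one keeps playing.
function switchAnimation<A extends ActionLike>(
  entries: AnimationEntry<A>[],
  current: A | undefined,
  style: AIStyle
): A | undefined {
  const match = entries.find(item => item.name === style);
  if (!match || match.animationAction === current) {
    return current;
  }
  if (current) {
    current.stop();
  }
  match.animationAction.play();
  return match.animationAction;
}

// Fake actions that record calls instead of animating anything.
const callLog: string[] = [];
const fakeAction = (label: string): ActionLike => ({
  play: () => callLog.push('play ' + label),
  stop: () => callLog.push('stop ' + label),
});

const entries: AnimationEntry<ActionLike>[] = [
  { name: 'friendly', animationAction: fakeAction('IDLE') },
  { name: 'cheerful', animationAction: fakeAction('RUN') },
];

let currentAction: ActionLike | undefined = entries[0].animationAction;
currentAction = switchAnimation(entries, currentAction, 'cheerful'); // stops IDLE, plays RUN
currentAction = switchAnimation(entries, currentAction, 'excited');  // no match, RUN stays

console.log(callLog); // 'stop IDLE', then 'play RUN'
```

Looking up the next animation before stopping the current one means an unmatched tone leaves the model animating instead of freezing it.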

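Testing the 3D animation switch

In the final test on a physical device, we can see that as the audio plays, the 3D model switches smoothly from the "standing" animation (mapped to the "friendly" tone) to the "running" animation (mapped to the "cheerful" tone). This shows that, with the help of the GPT model, we can switch to a suitable 3D animation in real time based on the current context.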

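Conclusion

Today we used Function Calling to produce structured data and tied the GPT model's responses to 3D animation, creating a fun and highly interactive user experience. Unfortunately, because the 3D animations I use were downloaded from the internet, their variety and number are limited, so they cannot perfectly match the three speech tones: "friendly", "excited", and "cheerful". If every tone had its own matching animated expression, the overall interaction would feel more realistic and engaging.

GitHub project code: Ionic with ChatGPT - Day21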