The handling after the response comes back from openai here is a bit interesting, so let's take a look.
If no response came back, or choices inside the object is empty, an error is raised. As the OpenAI docs show, the model's reply may contain more than one choice:
if response is None or len(response["choices"]) == 0:
    raise openai.APIError
A list of chat completion choices. Can be more than one if n is greater than 1.
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "\n\nHello there, how may I assist you today?",
},
"logprobs": null,
"finish_reason": "stop"
}]
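As a quick standalone illustration (this uses the plain openai Python SDK directly, not pr-agent's code path, which goes through its own ai_handler), requesting n greater than 1 gives you several choices to iterate over. A minimal sketch, assuming OPENAI_API_KEY is set and the model name is just a placeholder:

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Say hello"}],
    n=3,  # ask for three candidate completions
)
for choice in response.choices:  # one entry per candidate, matching the docs above
    print(choice.index, choice.finish_reason)
    print(choice.message.content)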
Assuming there is a response, it is stored in resp. Next comes finish_reason, which may be stop (the model finished its reply naturally) or length (the output was too long and got cut off). Will it adjust anything based on finish_reason? Let's keep reading.
finish_reason
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool, or function_call (deprecated) if the model called a function.
if response is None or len(response["choices"]) == 0:
    raise openai.APIError
else:
    resp = response["choices"][0]['message']['content']
    finish_reason = response["choices"][0]["finish_reason"]
    get_logger().debug(f"\nAI response:\n{resp}")

    # log the full response for debugging
    response_log = self.prepare_logs(response, system, user, resp, finish_reason)
    get_logger().debug("Full_response", artifact=response_log)

    # for CLI debugging
    if get_settings().config.verbosity_level >= 2:
        get_logger().info(f"\nAI response:\n{resp}")

    return resp, finish_reason
Looks like it doesn't XD. finish_reason is never referenced again; _get_prediction below only returns the response text.
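If one did want to react to finish_reason, a small guard like the following would do. This is purely my own sketch, not something pr-agent implements:

# Hypothetical helper, not part of pr-agent: fail loudly on truncated or
# filtered completions instead of silently passing partial text along.
def check_finish_reason(resp: str, finish_reason: str) -> str:
    if finish_reason == "length":
        raise ValueError("completion truncated; raise max_tokens or shrink the diff")
    if finish_reason == "content_filter":
        raise ValueError("completion blocked by the content filter")
    return resp  # "stop" means the model ended naturally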
async def _get_prediction(self, model: str) -> str:
    """
    Generate an AI prediction for the pull request review.

    Args:
        model: A string representing the AI model to be used for the prediction.

    Returns:
        A string representing the AI prediction for the pull request review.
    """
    variables = copy.deepcopy(self.vars)
    variables["diff"] = self.patches_diff  # update diff

    environment = Environment(undefined=StrictUndefined)
    system_prompt = environment.from_string(get_settings().pr_review_prompt.system).render(variables)
    user_prompt = environment.from_string(get_settings().pr_review_prompt.user).render(variables)

    response, finish_reason = await self.ai_handler.chat_completion(
        model=model,
        temperature=get_settings().config.temperature,
        system=system_prompt,
        user=user_prompt
    )

    return response
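One detail worth noting in _get_prediction is Environment(undefined=StrictUndefined): with StrictUndefined, rendering fails loudly when a template variable is missing, instead of Jinja2's default of silently emitting an empty string. A minimal demo:

from jinja2 import Environment, StrictUndefined, UndefinedError

env = Environment(undefined=StrictUndefined)
template = env.from_string("Review this diff:\n{{ diff }}")

print(template.render({"diff": "+ added line"}))  # renders normally

try:
    template.render({})  # 'diff' missing -> raises instead of rendering ""
except UndefinedError as e:
    print("missing variable:", e)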
We have pretty much traced our way to the result, so let's walk back up the call chain:

_get_prediction: response, finish_reason = await self.ai_handler.chat_completion(...)
→ _prepare_prediction: self.prediction = await self._get_prediction(model)
→ PRReviewer.run: await retry_with_fallback_models(self._prepare_prediction)

At this point we can assume that PRReviewer's self.prediction holds the output from the AI. Next should be formatting it and posting it as a gitlab comment, so let's look at pr_review = self._prepare_pr_review().
await retry_with_fallback_models(self._prepare_prediction)
if not self.prediction:
    self.git_provider.remove_initial_comment()
    return None

pr_review = self._prepare_pr_review()
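As the name suggests, retry_with_fallback_models is essentially a retry wrapper over a list of models. Roughly this shape; a simplified sketch of the pattern, not pr-agent's actual implementation:

# Simplified sketch of the fallback pattern (not pr-agent's real code):
# try each model in order, re-raising only after the last one fails too.
async def retry_with_fallback_models_sketch(f, models):
    last_exc = None
    for model in models:
        try:
            return await f(model)
        except Exception as exc:  # the real code is pickier about what it catches
            last_exc = exc
    raise last_exc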
You can see it uses load_yaml to pull the data it needs out of the prediction we just obtained, specifying the expected keys. The first_key and last_key arguments come into play when parsing fails: try_fix_yaml then tries to repair the malformed YAML the model may have produced. This part feels like the distilled essence of stepping on a lot of landmines.
def _prepare_pr_review(self) -> str:
    """
    Prepare the PR review by processing the AI prediction and generating a markdown-formatted text that summarizes
    the feedback.
    """
    first_key = 'review'
    last_key = 'security_concerns'
    data = load_yaml(self.prediction.strip(),
                     keys_fix_yaml=["estimated_effort_to_review_[1-5]:", "security_concerns:", "key_issues_to_review:",
                                    "relevant_file:", "relevant_line:", "suggestion:"],
                     first_key=first_key, last_key=last_key)
    github_action_output(data, 'review')
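To get a feel for why first_key and last_key help, here is a toy version of the fallback idea; my own simplification, not pr-agent's actual try_fix_yaml. When a direct parse fails, it keeps only the lines between the first and last expected keys and retries:

import yaml

def toy_load_yaml(text: str, first_key: str, last_key: str):
    # Toy illustration only. Models like to wrap YAML in prose or ``` fences,
    # so when a direct parse fails, slice out the lines from first_key through
    # the line that opens last_key's block, then parse again.
    try:
        return yaml.safe_load(text)
    except yaml.YAMLError:
        lines = text.splitlines()
        start = next(i for i, line in enumerate(lines) if line.startswith(first_key))
        end = next(i for i, line in enumerate(lines) if line.startswith(last_key))
        return yaml.safe_load("\n".join(lines[start:end + 1]))

noisy = (
    "Sure! Here is the review:\n"
    "```yaml\n"
    "review:\n"
    "  estimated_effort_to_review_[1-5]: 2\n"
    "security_concerns: false\n"
    "```\n"
)
print(toy_load_yaml(noisy, first_key="review", last_key="security_concerns"))
# {'review': {'estimated_effort_to_review_[1-5]': 2}, 'security_concerns': False}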