With the main feature done, I was browsing the Google Cloud console and noticed that Vertex AI has a RAG Engine. Back when I was using DeepWiki, I had asked an LLM whether something like it could be self-hosted, and the answer involved using RAG to achieve a similar result. So let's take a look at how the RAG Engine in Vertex AI works.
According to the Vertex AI RAG Engine overview, RAG lets you supply the LLM with private data (e.g. internal company documents or non-public code). The data is parsed and stored ahead of time, and the LLM retrieves from it to answer questions. The list of supported regions is still short, though:
Region | Location | Description | Launch stage |
---|---|---|---|
us-central1 | Iowa | v1 and v1beta1 versions are supported. | Allowlist |
us-east4 | Virginia | v1 and v1beta1 versions are supported. | GA |
europe-west3 | Frankfurt, Germany | v1 and v1beta1 versions are supported. | GA |
europe-west4 | Eemshaven, Netherlands | v1 and v1beta1 versions are supported. | GA |
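The retrieval idea described above (chunk the data, embed it, score chunks against the question) can be sketched with a toy example. This is only an illustration of the concept: it uses a bag-of-words vector as a stand-in for a real embedding model, and is not how the RAG Engine is actually implemented.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": bag-of-words token counts.
    # Real systems use dense vectors from an embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks: list[str], question: str, top_k: int = 1) -> list[tuple[float, str]]:
    # Score every chunk against the question and return the best ones.
    q = embed(question)
    scored = sorted(((cosine(embed(c), q), c) for c in chunks), reverse=True)
    return scored[:top_k]

chunks = [
    "How to become a contributor and submit your own code",
    "Listen for termination signal and clean up resources",
]
print(retrieve(chunks, "how do I contribute code?"))
```

A real engine then passes the top-scoring chunks to the LLM as context, which is the second half of what the quickstart example below does.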
Let's try the Python example from the QuickStart. The first step is to set the path to your files; it doesn't seem to support local paths, and instead asks for a Google Docs URL or a Google Cloud Storage bucket URL. Since this project already has a bucket, I looked up the command and uploaded the earlier hello-cloud-run project:
gsutil -m cp -r ./hello-cloud-run/ gs://xxxxx/hello
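One thing worth double-checking is the destination URI: it must use the gs:// scheme with two slashes, which is easy to mistype. A minimal sanity check (the bucket-name pattern here is a simplification of the real GCS naming rules):

```python
import re

# Simplified gs:// URI check; real GCS bucket-name rules are stricter
# (length limits, no uppercase, no leading dots, etc.).
GS_URI = re.compile(r"^gs://[a-z0-9][a-z0-9._-]*(/.*)?$")

def is_gs_uri(uri: str) -> bool:
    return bool(GS_URI.match(uri))

print(is_gs_uri("gs://xxxxx/hello"))  # well-formed destination
print(is_gs_uri("gs:/xxxxx/hello"))   # missing slash: rejected
```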
After updating the path in the example program, I ran it and got:
contexts {
contexts {
source_uri: "gs://xxxxx/hello/CONTRIBUTING.md"
text: "How to become a contributor and submit your own code\n\n\nContributor License Agreements\nWe\'d love to accept your sample apps and patches! Before we can take them, we\nhave to jump a couple of legal hurdles.\nPlease fill out either the individual or corporate Contributor License Agreemen..."
source_display_name: "CONTRIBUTING.md"
score: 0.48237444562617748
chunk {
text: "How to become a contributor and submit your..."
}
}
}
I can provide information on how to become a contributor and submit your own code, including signing a Contributor License Agreement (CLA), submitting an issue describing your proposed change, forking the desired repo, developing and testing your code changes, ensuring your code adheres to the existing style and has an appropriate set of unit tests, and submitting a pull request.
It looks like it only picked up the CONTRIBUTING.md file. Digging through the docs some more (Document types for Vertex AI RAG Engine), it turns out only a handful of text-based formats are supported:
File type | File size limit |
---|---|
Google documents | 10 MB when exported from Google Workspace |
Google drawings | 10 MB when exported from Google Workspace |
Google slides | 10 MB when exported from Google Workspace |
HTML file | 10 MB |
JSON file | 10 MB |
JSONL or NDJSON file | 10 MB |
Markdown file | 10 MB |
Microsoft PowerPoint slides (PPTX file) | 10 MB |
Microsoft Word documents (DOCX file) | 50 MB |
PDF file | 50 MB |
Text file | 10 MB |
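Based on the limits in the table, a small pre-upload check can flag files that would be rejected. A sketch covering only the plain-file rows (within_limit is my own helper name, not part of any SDK):

```python
# Size limits (in MB) from the supported-types table above; covers only
# the plain-file rows, not the Google Workspace export rows.
LIMITS_MB = {
    ".html": 10, ".json": 10, ".jsonl": 10, ".ndjson": 10,
    ".md": 10, ".pptx": 10, ".docx": 50, ".pdf": 50, ".txt": 10,
}

def within_limit(filename: str, size_bytes: int) -> bool:
    # Unsupported extensions (like .js) fail the check outright.
    ext = "." + filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    limit = LIMITS_MB.get(ext)
    if limit is None:
        return False
    return size_bytes <= limit * 1024 * 1024

print(within_limit("CONTRIBUTING.md", 5 * 1024 * 1024))  # small markdown: fine
print(within_limit("index.js", 1024))                    # .js not supported
```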
But code is just a text file too, right? So I changed the upload to include only the js files, renamed them all to .txt, and asked which APIs this service exposes:
contexts {
contexts {
source_uri: "gs://xxxxx/hello/index.txt"
text: "...n/**\n * Listen for termination signal\n */\nprocess.on(\'SIGTERM\', () => {\n // Clean up resources on shutdown\n logger.info(\'Caught SIGTERM.\');\n logger.flush();\n});\n\nmain();"
source_display_name: "index.txt"
score: 0.4867848831211119
chunk {
text: "...\n * Listen for termination signal\n */\nprocess.on(\'SIGTERM\', () => {\n // Clean up resources on shutdown\n logger.info(\'Caught SIGTERM.\');\n logger.flush();\n});\n\nmain();"
}
}
contexts {
source_uri: "gs://xxxxx/hello/app.txt"
text: "...; // Example of structured logging\n // Use request-based logger with log correlation\n req.log.info(\'Child logger with trace Id.\'); // https://cloud.google.com/run/docs/logging#correlate-logs\n res.send(\'AAAAAAAAAAAAAAAAAAAAAA\');\n});\n\nexport default app;"
source_display_name: "app.txt"
score: 0.49215430310890951
chunk {
text: "... // Example of structured logging\n // Use request-based logger with log correlation\n req.log.info(\'Child logger with trace Id.\'); // https://cloud.google.com/run/docs/logging#correlate-logs\n res.send(\'AAAAAAAAAAAAAAAAAAAAAA\');\n});\n\nexport default app;"
}
}
}
The available APIs are:
* `/`
* `/hi`
* `/hihi`
It took a few runs to notice that this example program returns two kinds of output. The contexts block at the top is what the retrieval step judged likely relevant to the question, that is, the data we supplied at the start, with each chunk scored and returned. The part below is the answer generated from those chunks. Just like that, we have a bare-bones DeepWiki!
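Since each returned context carries a score, a thin post-processing step can drop low-confidence chunks before handing them to the model. A sketch, assuming the response has been converted into plain dicts using the field names seen in the output above (the 0.45 threshold is an arbitrary choice for illustration):

```python
def filter_contexts(contexts: list[dict], min_score: float = 0.45) -> list[dict]:
    # Keep only chunks whose retrieval score clears the threshold,
    # best-scoring first.
    kept = [c for c in contexts if c.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda c: c["score"], reverse=True)

contexts = [
    {"source_display_name": "app.txt", "score": 0.492},
    {"source_display_name": "index.txt", "score": 0.487},
    {"source_display_name": "README.txt", "score": 0.21},
]
for c in filter_contexts(contexts):
    print(c["source_display_name"], c["score"])
```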
The one catch is that if some files are too sensitive to upload to Google's cloud, you would have to find another way to build the whole pipeline. Still, Vertex AI RAG Engine offers a simple way to stand up a RAG setup, which is quite convenient.