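Traditional bookkeeping often feels tedious. Whether you write in a notebook or tap and type in a mobile app, every step costs time and energy. In a busy life it is easy to skip an entry when you are in a hurry or juggling too much, so at month-end a few items never reconcile and personal spending becomes hard to track completely.
That is why we built the **AI Voice Accounting Agent**. Our goal is to **remove all friction from the bookkeeping process**, making financial management as simple and natural as speaking. By combining voice input with AI, we condense the tedious bookkeeping steps into a single sentence, so you can record every expense anytime, anywhere, and leave forgotten entries and manual typing behind.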
Please build an AI Voice Accounting Agent with the following core features:
## Core Features & Specifications
1. **Zero-Friction Voice Input:**
* **Activation:** The system must allow users to initiate audio recording instantly with a single tap (or a dedicated widget trigger).
* **Natural Speech Acceptance:** The AI must process free-form, conversational statements about expenses.
* *Example:* "I spent 135 dollars at Starbucks for coffee this morning." or "Paid 15,000 for rent last Tuesday."
2. **Intelligent AI Parsing & Structure:**
* The core AI engine must automatically extract and structure all critical fields from the voice input.
* **Required Fields:** **Amount** (numerical value and currency), **Category** (inferred or explicit), **Date** (resolved), and **Note/Merchant** (remaining descriptive text).
3. **Dynamic Date Interpretation:**
* The AI must accurately interpret and resolve relative or ambiguous time references into a precise `YYYY-MM-DD` format.
* *Supported References:* "Today," "Yesterday," "Last Week," "The 10th of this month," etc.
4. **Customizable Category Mapping:**
* The system uses context clues (e.g., "gas," "latte," "movie ticket") to infer the correct category.
* **User Control:** Users must be able to define, manage, and map their own list of custom expense categories (e.g., mapping "gym" to "Wellness").
5. **Post-Entry Review and Quick Edit:**
* After the AI parses the statement, the system displays the structured data for a final check.
* All parsed fields (Amount, Date, Category, Note) must be instantly editable before saving, allowing users to correct AI inference errors quickly.
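
Features 3 and 4 above can be illustrated with a minimal sketch. It is not the AI engine itself: it assumes the model has already isolated a date phrase and a description, and it only handles the English references listed in the spec. The names `resolve_date`, `infer_category`, and `DEFAULT_KEYWORDS`, the default category labels, and the reading of "Last Week" as exactly seven days ago are all assumptions made for illustration.

```python
import re
from datetime import date, timedelta
from typing import Dict, Optional

# Hypothetical built-in keyword table; the category names are illustrative only.
DEFAULT_KEYWORDS: Dict[str, str] = {
    "gas": "Transport",
    "latte": "Food & Drink",
    "movie ticket": "Entertainment",
}


def resolve_date(phrase: str, today: date) -> str:
    """Resolve a small set of relative phrases to a YYYY-MM-DD string."""
    text = phrase.strip().lower()
    if text == "today":
        resolved = today
    elif text == "yesterday":
        resolved = today - timedelta(days=1)
    elif text == "last week":
        # Assumption: "last week" means exactly seven days before today.
        resolved = today - timedelta(weeks=1)
    else:
        match = re.fullmatch(r"the (\d{1,2})(?:st|nd|rd|th) of this month", text)
        if match is None:
            raise ValueError(f"unsupported date phrase: {phrase!r}")
        # date.replace raises ValueError if the day does not exist in this month.
        resolved = today.replace(day=int(match.group(1)))
    return resolved.isoformat()


def infer_category(description: str, user_mappings: Optional[Dict[str, str]] = None) -> str:
    """Pick a category by keyword; user-defined mappings override the defaults."""
    mappings = {**DEFAULT_KEYWORDS, **(user_mappings or {})}
    lowered = description.lower()
    # Try longer keywords first so a multi-word key beats a shorter overlapping one.
    for keyword in sorted(mappings, key=len, reverse=True):
        if keyword in lowered:
            return mappings[keyword]
    return "Uncategorized"
```

For example, `resolve_date("Yesterday", date(2025, 3, 1))` yields `"2025-02-28"`, and `infer_category("monthly gym fee", {"gym": "Wellness"})` yields `"Wellness"`. Plain substring matching is a deliberate simplification; it would also match "gas" inside an unrelated word, which is one reason the review step in feature 5 matters.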
## User Interface (UI) / User Experience (UX) Flow
- **Ready Screen:** A minimal screen dominated by a large, accessible **Microphone Icon** (the Call-to-Action) for instant recording.
- **Audio Capture:** Clear visual feedback (e.g., a pulsating ring) confirming the system is actively listening.
- **Transcript Display:** The raw voice-to-text transcript is shown in real-time for immediate user verification.
- **Confirmation Modal:** Upon end-of-speech, a modal displays the AI's structured output in easily readable fields (Date: [editable], Amount: [editable], etc.).
- **Action:** The modal features clear **"Confirm & Sync"** and **"Discard"** buttons.
- **Success Feedback:** A brief, non-intrusive notification (e.g., a green banner) confirms **"Sync Complete to Sheets."**
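
The confirm-or-discard step in this flow can be sketched as an immutable draft record that the user edits before it is saved. This is a sketch under stated assumptions: `ParsedExpense`, `apply_edits`, and `confirm_and_sync` are hypothetical names, and a plain Python list stands in for the spreadsheet, since the spec does not describe the sync target beyond the "Sheets" wording in the success banner.

```python
from dataclasses import dataclass, replace
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class ParsedExpense:
    """Structured output shown in the confirmation modal (hypothetical shape)."""
    amount: Decimal
    currency: str
    category: str
    date: str  # YYYY-MM-DD
    note: str


def apply_edits(draft: ParsedExpense, **edits) -> ParsedExpense:
    """Return a corrected copy; unknown field names raise TypeError."""
    return replace(draft, **edits)


def confirm_and_sync(draft: ParsedExpense, sheet_rows: List[List[str]]) -> List[str]:
    """Append the confirmed expense as one row; the list stands in for the sheet."""
    row = [draft.date, f"{draft.amount:.2f}", draft.currency, draft.category, draft.note]
    sheet_rows.append(row)
    return row
```

Keeping the draft frozen means a "Discard" tap needs no cleanup: the edited copy is simply dropped and nothing has been written yet.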
## Technical Requirements
1. The system should be built on a native audio model suited to low-latency, conversational NLU, such as **Gemini 2.5 Flash Native Audio Preview**.
2. Full support for **Traditional Chinese (ZH-TW)** and **English (EN)** in both voice input and UI/confirmation text.
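
One possible way to serve the bilingual UI text in requirement 2 is a simple lookup table with an English fallback. The sketch below is illustrative only: `UI_STRINGS` and `ui_text` are hypothetical names, and the Traditional Chinese strings are assumed translations of the button and banner labels, not wording taken from the product.

```python
from typing import Dict

# Hypothetical string table; the zh-TW entries are assumed translations.
UI_STRINGS: Dict[str, Dict[str, str]] = {
    "en": {
        "confirm": "Confirm & Sync",
        "discard": "Discard",
        "synced": "Sync Complete to Sheets.",
    },
    "zh-TW": {
        "confirm": "確認並同步",
        "discard": "捨棄",
        "synced": "已同步至試算表",
    },
}


def ui_text(key: str, locale: str) -> str:
    """Look up a UI string, falling back to English for unknown locales or keys."""
    table = UI_STRINGS.get(locale, UI_STRINGS["en"])
    return table.get(key, UI_STRINGS["en"][key])
```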
The overall page design is clean and uncluttered, minimizing distractions so that users can complete voice input quickly and intuitively.
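- **Date** (e.g., `27日`): the AI parses the spoken date and converts it to an exact date.
- **Keywords** (e.g., `便當` (boxed lunch), `飲料與餅乾` (drinks and cookies), `打滴`): keywords the AI extracts from the speech.
- **Amount** (e.g., `100.00`, `200.00`, `1620.00`): exact figures with their currency (`TWD`).

The key to this interface is **easy editing** and **instant confirmation**. Users can quickly scan the AI's parsed result and, on spotting an error, tap the individual field to correct it, which keeps the data accurate. This flow delivers an efficient "speak first, edit after" experience, reducing the tedious steps of traditional bookkeeping to a single seamless flow.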