[Day 11] Product 10: AI Presentation Assistant

2025 iThome Ironman Contest

DAY 11

Generative AI

Part 11 of the series "30-Day Challenge: Building 30 Products"

17th Ironman Contest

jackietung

2025-09-25 23:29:27

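Do you often find that making a presentation eats up your time? When the source material is large, pulling the key points out of dense documents and organizing them into logically ordered slides takes a lot of effort. The AI Presentation Assistant was built to address this pain point.

I. What Problem Does It Solve?

In this era of information overload, we deal with large volumes of reports, research documents, and meeting notes every day. Turning that material into a presentation the traditional way is inefficient: reading, organizing, and summarizing by hand, then handling layout and styling, can each become a major time sink. The core goal of the AI Presentation Assistant is to free users from this tedious work so they can spend their time on more creative content or on engaging with the audience, rather than on the repetitive task of building slides.

II. Prompt Design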

Please build an AI-Powered Presentation Assistant, with the following core features:

## Core Features & Specifications
1. Input Formats: The system must accept and process a wide range of document types, including but not limited to:
- PDF (.pdf)
- Microsoft Word (.docx, .doc)
- Text files (.txt)
- Markdown files (.md)

2. AI-Powered Content Extraction: The AI model should intelligently analyze the uploaded document to identify and extract key components:
- Core Concepts: Main ideas and arguments.
- Key Data: Statistics, numbers, and figures.
- Headings & Subheadings: Structural elements to form a presentation outline.
- Images & Graphics: Figures and charts within the document.
- Tables: Structured data within the document.

3. Outline Creation: Automatically generate a logical presentation outline based on the document's structure and identified key concepts. This should be presented to the user for review before slide generation.

4. Slide Generation: Based on the approved outline, the system will automatically create individual slides. Each slide should:
- Contain a clear title derived from the outline.
- Feature bullet points summarizing the key information from the corresponding section of the source document.
- Intelligently place extracted images or tables on relevant slides.

5. Slide Layouts & Design:
- Automatic Layout Application: Apply appropriate slide layouts (e.g., Title Slide, Bullet Point Slide, Image & Text, Two-Column) based on the content.
- Template Selection: Offer a library of professional, pre-designed templates that the user can choose from before generation or apply afterward. The system should apply a default template if none is selected.

6. In-App Editor: Provide an intuitive, user-friendly editor for post-generation adjustments. Users should be able to:
- Edit text on any slide.
- Rearrange, delete, or add new slides.
- Insert new images, text boxes, or shapes.
- Change slide layouts or templates.

7. Content Refining: Implement a “Refine” or “Summarize” function that allows the user to click a text box and have the AI rephrase or shorten the content for better presentation clarity.

8. Export Formats: Allow users to export the final presentation into widely used formats:
- Microsoft PowerPoint (.pptx)
- Google Slides (direct integration or export to a compatible format)
- PDF (.pdf)
- Image files (.jpeg, .png) for individual slides

9. Cloud Integration: Enable seamless saving and access to major cloud storage services like Google Drive and Dropbox.

## User Interface (UI) / User Experience (UX) Flow
- Welcome Screen: A clear call-to-action to “Create a New Presentation.”
- File Upload: A simple drag-and-drop or browse interface for file selection.
- Content Analysis & Outline Preview: Display a progress bar during AI analysis. Once complete, show a preview of the generated outline with the ability to edit it before proceeding.
- Template Selection: A visual gallery of templates for the user to choose from.
- Slide Generation: A final progress screen as the slides are being created.
- Editor View: The main workspace where the user can view, edit, and refine the generated presentation.
- Export/Save: Clearly labeled buttons for exporting or saving the file.

## Technical Requirements
1. LLM Model Integration
All AI agents can connect to the latest Large Language Models (LLMs).
Users only need to input their personal API key to enable and utilize these features.

2. Supported Models include:
- Gemini: gemini-2.5-flash, gemini-2.5-pro
- ChatGPT: GPT-5, GPT-4o, GPT-4o mini
- Grok: Grok 4, Grok 3
- Claude: Claude 4 Sonnet, Claude 3.7 Sonnet

3. Language setting can be switched between ZH-TW and EN

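III. Product Prototype

1. Setup and Document Upload

This screen is the product's start page. Its main job is to let the user configure settings and upload a document.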

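AI model settings:
- The user can choose among AI providers and models, such as Gemini and Gemini-2.5-flash in the screenshot. This flexibility lets the user pick a model that suits the presentation they need to generate.
- The user must enter their own Gemini API key. The product therefore does not serve models itself; it acts as an application layer that calls an external AI service with the user's key. This helps keep operating costs down and gives the user more direct control over API usage.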
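
The provider and model picker described above could be backed by a small lookup table. What follows is a minimal sketch rather than the prototype's actual code: the table follows the "Supported Models" list in the prompt, while `ModelConfig` and `build_model_config` are hypothetical names introduced only for illustration. Nothing here contacts a provider; it only validates the user's choices and keeps the key in memory.

```python
from dataclasses import dataclass

# Provider -> model names, following the prompt's "Supported Models" list.
SUPPORTED_MODELS = {
    "Gemini": ["gemini-2.5-flash", "gemini-2.5-pro"],
    "ChatGPT": ["GPT-5", "GPT-4o", "GPT-4o mini"],
    "Grok": ["Grok 4", "Grok 3"],
    "Claude": ["Claude 4 Sonnet", "Claude 3.7 Sonnet"],
}


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the user's choices on the setup screen."""
    provider: str
    model: str
    api_key: str

    def masked_key(self) -> str:
        # Show only the last four characters so the key is safe to display.
        return "*" * max(len(self.api_key) - 4, 0) + self.api_key[-4:]


def build_model_config(provider: str, model: str, api_key: str) -> ModelConfig:
    """Validate the selection before any request is made with the user's key."""
    if provider not in SUPPORTED_MODELS:
        raise ValueError(f"Unknown provider: {provider}")
    if model not in SUPPORTED_MODELS[provider]:
        raise ValueError(f"{model} is not offered by {provider}")
    if not api_key.strip():
        raise ValueError("An API key is required")
    return ModelConfig(provider, model, api_key.strip())
```

Rejecting a mismatched provider and model up front means a bad selection fails in the app instead of costing the user a billed request.
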
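File upload area:
- The cloud icon in the center and the text "Drag files here" indicate an intuitive file upload interface.
- It supports several common document formats, such as .txt, .md, .pdf, and .docx, which cover most source material for presentations.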
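
A drop zone like this usually checks the extension before doing anything else. Here is a small sketch of that check, assuming the four extensions listed on this screen; `ACCEPTED_EXTENSIONS`, `is_accepted`, and `split_uploads` are illustrative names, not taken from the prototype.

```python
from pathlib import Path

# Extensions named in the upload area of this screen (an assumption about
# what the prototype enforces; the prompt additionally mentions ".doc").
ACCEPTED_EXTENSIONS = {".txt", ".md", ".pdf", ".docx"}


def is_accepted(filename: str) -> bool:
    """Return True when the file's extension is one the uploader accepts."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


def split_uploads(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Partition dropped files into (accepted, rejected), keeping order."""
    accepted = [name for name in filenames if is_accepted(name)]
    rejected = [name for name in filenames if not is_accepted(name)]
    return accepted, rejected
```

Lower-casing the suffix lets `report.PDF` through, and returning the rejected names separately gives the interface something concrete to show the user.
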
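Topic and system settings:
- The user can enter the presentation topic, which helps the AI understand the core content more precisely.
- The system drop-down menu may let the user set the presentation language, style, or specific format requirements, such as "formal", "casual", or "technical".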
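
To make the role of the topic field concrete, here is one way these settings might be folded into the instruction sent to the model. This is a sketch under assumptions: the wording of the instruction, the JSON shape it asks for, and the name `build_outline_prompt` are all made up for illustration, and the prototype may do this quite differently.

```python
def build_outline_prompt(document_text: str, topic: str = "",
                         language: str = "ZH-TW", tone: str = "formal") -> str:
    """Assemble a (hypothetical) instruction for outline generation."""
    lines = [
        "Create a presentation outline from the document below.",
        f"Write the outline in {language} using a {tone} tone.",
    ]
    if topic.strip():
        # The optional topic narrows what the model treats as the core message.
        lines.append(f"Focus on this topic: {topic.strip()}")
    # Asking for a fixed JSON shape keeps the reply machine-readable.
    lines.append('Return JSON: [{"title": str, "bullets": [str, ...]}, ...]')
    lines.append("--- DOCUMENT ---")
    lines.append(document_text)
    return "\n".join(lines)
```

Leaving the topic line out when the field is empty avoids steering the model with a blank instruction.
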

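2. Reviewing and Editing the Outline

After the file is uploaded, the product moves to the second stage, where the user reviews and adjusts the outline the AI generated.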

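Automatic outline generation:
- Based on the uploaded file, the AI distills slide titles and content bullet points. This addresses the pain point of extracting the essentials from a long document.
- The screenshot shows generated titles and content such as "AI Agents", "Introduction", and "What is an agent?", all produced from the source document.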
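Editing:
- Every slide title and its content are editable, so the user can check accuracy and fine-tune before the slides are generated, for example by adjusting a title, deleting an unneeded passage, or adding a note.
- This intermediate step matters: it ensures the final presentation matches the user's expectations while leaving room for human intervention and customization.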
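
The review step above implies an outline that stays plain, editable data until the user approves it. The sketch below shows one possible shape, assuming the model replies with a JSON array of objects holding a `title` and a `bullets` list; `SlideOutline`, `parse_outline`, `rename_slide`, and `remove_bullet` are hypothetical names, not the prototype's.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SlideOutline:
    """One outline entry: a slide title plus its bullet points."""
    title: str
    bullets: list[str] = field(default_factory=list)


def parse_outline(model_reply: str) -> list[SlideOutline]:
    """Turn the model's JSON reply into editable outline entries.

    Assumes the reply is a JSON array of {"title": ..., "bullets": [...]}.
    """
    entries = json.loads(model_reply)
    return [SlideOutline(e["title"], list(e.get("bullets", []))) for e in entries]


def rename_slide(outline: list[SlideOutline], index: int, new_title: str) -> None:
    """Apply a user's title edit in place."""
    outline[index].title = new_title


def remove_bullet(outline: list[SlideOutline], index: int, bullet_index: int) -> None:
    """Delete one bullet the user considers unnecessary."""
    del outline[index].bullets[bullet_index]
```

Because slide generation only starts from the approved list, every correction made here is cheaper than fixing the same thing later in the visual editor.
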

3. Visual Presentation Editor

Once the outline is confirmed, the product converts it into a visual presentation and provides editing tools.

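Presentation preview:
- The left side shows thumbnail previews of all slides so the user can quickly scan the structure of the whole deck.
- The center shows the slide being edited, such as the first page in the screenshot, with the title "AI Agents" and the author information.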
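Editing and adjustment tools:
- The top navigation bar offers several editing options: "Theme" switches the presentation template, and "Layout" adjusts the layout of a single slide.
- The "Play" button in the upper right lets the user preview how the presentation will finally look.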
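Navigation:
- The user can switch slides quickly with the left-hand thumbnails or the keyboard arrow keys.
- The "Add slide" button at the bottom lets the user add a new page manually.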
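
The navigation behaviour described above comes down to a list of slides plus a selected index. Here is a minimal sketch of that state, with assumptions called out: `EditorState` and its methods are hypothetical, the key names follow the browser convention (`ArrowLeft`, `ArrowRight`, and so on), and appending new slides at the end is a guess about how the "Add slide" button behaves.

```python
from dataclasses import dataclass, field


@dataclass
class EditorState:
    """Hypothetical editor state: slide titles plus the selected index."""
    slides: list[str] = field(default_factory=lambda: ["Untitled"])
    current: int = 0

    def select(self, index: int) -> None:
        # Clicking a thumbnail jumps straight to that slide.
        if not 0 <= index < len(self.slides):
            raise IndexError(f"No slide at position {index}")
        self.current = index

    def on_key(self, key: str) -> None:
        # Arrow keys move one slide at a time and stop at either end.
        step = {"ArrowLeft": -1, "ArrowUp": -1,
                "ArrowRight": 1, "ArrowDown": 1}.get(key, 0)
        self.current = min(max(self.current + step, 0), len(self.slides) - 1)

    def add_slide(self, title: str = "New slide") -> None:
        # Assumption: "Add slide" appends a page at the end and selects it.
        self.slides.append(title)
        self.current = len(self.slides) - 1
```

Clamping the index in `on_key` keeps the selection valid at the first and last slide, so the thumbnail strip and the main canvas never disagree about which page is open.
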