2025 iThome 鐵人賽, DAY 29

Enterprise Internal Knowledge Base Copilot: A Full-Stack LLM Project Scaffold (FastAPI + pgvector + RAG + Frontend Upload/Chat)

A ready-to-run enterprise knowledge base project: document upload, chunking and indexing (pgvector), RAG Q&A with source citations, API key authentication, simple permissions (role/department), and one-command Docker startup.

Project layout

kb-copilot/
├─ backend/
│  ├─ app/
│  │  ├─ main.py
│  │  ├─ config.py
│  │  ├─ auth.py
│  │  ├─ db.py
│  │  ├─ models.py
│  │  ├─ schemas.py
│  │  ├─ services/
│  │  │  ├─ llm.py
│  │  │  ├─ embed.py
│  │  │  ├─ rag.py
│  │  │  ├─ chunker.py
│  │  │  └─ files.py
│  │  └─ routers/
│  │     ├─ chat.py
│  │     ├─ upload.py
│  │     └─ admin.py
│  ├─ tests/
│  │  └─ test_chat.py
│  ├─ requirements.txt
│  └─ Dockerfile
├─ frontend/
│  ├─ package.json
│  ├─ vite.config.ts
│  ├─ index.html
│  └─ src/
│     ├─ main.tsx
│     ├─ App.tsx
│     └─ Uploader.tsx
├─ postgres/
│  └─ init.sql
├─ data/
│  └─ uploads/           # raw files uploaded by users
├─ .env.example
├─ docker-compose.yml
└─ README.md

Backend (FastAPI)

backend/app/config.py

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict  # BaseSettings moved to pydantic-settings in Pydantic v2
from functools import lru_cache

class Settings(BaseSettings):
    APP_NAME: str = "KB Copilot API"
    APP_VERSION: str = "0.1.0"
    ENV: str = "dev"

    # Security
    API_KEY: str = Field("changeme", description="Simple API key")
    ALLOW_ORIGINS: str = "*"

    # DB
    DATABASE_URL: str = "postgresql+psycopg://postgres:postgres@db:5432/kb"

    # LLM/Embedding
    LLM_PROVIDER: str = "ollama"  # ollama | openai
    OPENAI_API_KEY: str | None = None
    OPENAI_MODEL: str = "gpt-4o-mini"

    OLLAMA_HOST: str = "http://ollama:11434"
    OLLAMA_MODEL: str = "qwen2.5:7b-instruct"

    EMBEDDING_MODEL: str = "BAAI/bge-small-zh-v1.5"  # sentence-transformers
    CHUNK_SIZE: int = 700
    CHUNK_OVERLAP: int = 80

    UPLOAD_DIR: str = "/app/data/uploads"

    model_config = SettingsConfigDict(env_file=".env")

@lru_cache
def get_settings():
    return Settings()

backend/app/auth.py

from fastapi import Header, HTTPException, status
from .config import get_settings

settings = get_settings()

async def api_key_auth(x_api_key: str = Header(default="")):
    if x_api_key != settings.API_KEY:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid API key")
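
A protected route expects the key in an x-api-key header. A quick smoke test (a sketch; assumes the default API_KEY from .env.example and the default compose port mapping):

import requests

# Any protected endpoint returns 401 without a valid x-api-key header.
r = requests.get("http://localhost:8000/admin/stats", headers={"x-api-key": "changeme"})
print(r.status_code, r.json())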

backend/app/db.py

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, declarative_base
from .config import get_settings

settings = get_settings()
engine = create_engine(settings.DATABASE_URL, pool_pre_ping=True)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()

# A plain generator: FastAPI's Depends() drives the yield/cleanup itself.
# Wrapping it in @contextmanager would make Depends inject the context-manager
# object instead of the Session.
def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

backend/app/models.py

from sqlalchemy import Column, Integer, String, Text, ForeignKey, DateTime, JSON
from pgvector.sqlalchemy import Vector  # sqlalchemy.dialects.postgresql has no VECTOR type
from sqlalchemy.sql import func
from sqlalchemy.orm import relationship
from .db import Base

class Document(Base):
    __tablename__ = "documents"
    id = Column(Integer, primary_key=True)
    filename = Column(String, index=True)
    mime = Column(String)
    uploader = Column(String)              # user account / email
    department = Column(String, index=True) # department (used for access control)
    created_at = Column(DateTime, server_default=func.now())

class Chunk(Base):
    __tablename__ = "chunks"
    id = Column(Integer, primary_key=True)
    document_id = Column(Integer, ForeignKey("documents.id", ondelete="CASCADE"), index=True)
    text = Column(Text)
    meta = Column(JSON)
    embedding = Column(Vector(512))        # BAAI/bge-small-zh-v1.5 produces 512-dim embeddings
    document = relationship("Document")

backend/app/schemas.py

from pydantic import BaseModel
from typing import List

class ChatMessage(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    messages: List[ChatMessage]
    department: str = "public"  # 查詢者的部門(做行級過濾)

class Citation(BaseModel):
    source: str
    score: float
    snippet: str

class ChatResponse(BaseModel):
    answer: str
    citations: List[Citation]

backend/app/services/embed.py

from sentence_transformers import SentenceTransformer
from functools import lru_cache
from ..config import get_settings

settings = get_settings()

@lru_cache
def get_embed_model():
    return SentenceTransformer(settings.EMBEDDING_MODEL)

def embed_texts(texts: list[str]):
    model = get_embed_model()
    return model.encode(texts, normalize_embeddings=True).tolist()
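
Since the vector(512) columns must match the embedding width exactly, a one-off sanity check is worth running (a sketch; run from backend/, and the model downloads on first use):

from app.services.embed import embed_texts

vecs = embed_texts(["加班申請流程"])
print(len(vecs), len(vecs[0]))  # expect: 1 512 (must match vector(512) in the schema)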

backend/app/services/chunker.py

from ..config import get_settings

settings = get_settings()

# Splitting drops the separators themselves, and this simple version does not
# apply CHUNK_OVERLAP; both are acceptable trade-offs for a demo chunker.
SEPS = ["\n\n", "\n", "。", "!", "?"]

def chunk_text(text: str) -> list[str]:
    parts = [text]
    for sep in SEPS:
        new_parts = []
        for t in parts:
            new_parts.extend([s.strip() for s in t.split(sep) if s.strip()])
        parts = new_parts
        if len(parts) > 3000:
            break
    chunks, buf = [], ""
    for s in parts:
        if len(buf) + len(s) + 1 <= settings.CHUNK_SIZE:
            buf = (buf + " " + s).strip()
        else:
            if buf:
                chunks.append(buf)
            buf = s
    if buf:
        chunks.append(buf)
    return chunks
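
What the chunker produces, roughly (a sketch; exact boundaries depend on CHUNK_SIZE):

from app.services.chunker import chunk_text

sample = "第一章 請假規定。\n\n員工請假須事先申請。加班須經主管核准!"
for i, c in enumerate(chunk_text(sample)):
    print(i, len(c), c)  # a short input merges into a single chunk under CHUNK_SIZE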

backend/app/services/llm.py

from ..config import get_settings
import requests

settings = get_settings()

class LLMClient:
    def chat(self, messages: list[dict]) -> str:
        if settings.LLM_PROVIDER == "openai":
            import openai
            client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
            resp = client.chat.completions.create(
                model=settings.OPENAI_MODEL,
                messages=messages,
                temperature=0.2,
            )
            return resp.choices[0].message.content
        # ollama
        r = requests.post(
            f"{settings.OLLAMA_HOST}/api/chat",
            json={"model": settings.OLLAMA_MODEL, "messages": messages, "stream": False},
            timeout=120,
        )
        r.raise_for_status()
        return r.json().get("message", {}).get("content", "")

backend/app/services/rag.py

from sqlalchemy import text
from sqlalchemy.orm import Session
from typing import Dict
from ..config import get_settings
from .llm import LLMClient

settings = get_settings()

# The system prompt stays in Chinese on purpose: the corpus and its users are Chinese.
SYSTEM = (
    "你是企業知識庫助理。回答必須來自引用的文件;若缺乏足夠證據則說明查無並提出下一步建議。"
)

class KBSearch:
    def __init__(self, db: Session):
        self.db = db

    def search(self, query: str, department: str, top_k: int = 5):
        # Vector-similarity retrieval via pgvector, filtered by department
        sql = text(
            """
            WITH q AS (
              SELECT CAST(:emb AS vector(512)) AS e
            )
            SELECT c.id, c.text, d.filename, d.department,
                   1 - (c.embedding <=> (SELECT e FROM q)) AS score
            FROM chunks c
            JOIN documents d ON d.id = c.document_id
            WHERE d.department = :dept OR d.department = 'public'
            ORDER BY c.embedding <=> (SELECT e FROM q)
            LIMIT :k
            """
        )
        from .embed import embed_texts
        emb = embed_texts([query])[0]
        # psycopg has no adapter registered for a plain Python list, so pass
        # the pgvector text form '[x1, x2, ...]' and CAST it in the SQL above.
        rows = self.db.execute(sql, {"emb": str(emb), "k": top_k, "dept": department}).mappings().all()
        return rows


def answer_with_rag(db: Session, llm: LLMClient, user_query: str, department: str) -> Dict:
    kb = KBSearch(db)
    hits = kb.search(user_query, department, top_k=6)
    context = "\n\n".join([f"[來源:{h['filename']}]\n{h['text']}" for h in hits])
    citations = [
        {"source": h["filename"], "score": float(h["score"]), "snippet": h["text"][:140]} for h in hits
    ]
    messages = [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": f"問題:{user_query}\n\n參考內容:\n{context}\n\n請用繁體中文,條列清楚並附上來源檔名。"}
    ]
    out = llm.chat(messages)
    return {"answer": out, "citations": citations}

backend/app/services/files.py

from pathlib import Path
from sqlalchemy.orm import Session
from ..models import Document, Chunk
from .embed import embed_texts
from .chunker import chunk_text
from ..config import get_settings

settings = get_settings()

try:
    import fitz  # PyMuPDF
except Exception:
    fitz = None

def read_text(fp: Path) -> str:
    if fp.suffix.lower() in [".txt", ".md"]:
        return fp.read_text(encoding="utf-8", errors="ignore")
    if fp.suffix.lower() == ".pdf" and fitz:
        doc = fitz.open(str(fp))
        return "\n".join(page.get_text() for page in doc)
    return ""


def ingest_file(db: Session, file_path: Path, uploader: str, department: str):
    text = read_text(file_path)
    if not text:
        return 0
    doc = Document(filename=file_path.name, mime=file_path.suffix, uploader=uploader, department=department)
    db.add(doc)
    db.flush()
    chunks = chunk_text(text)
    embs = embed_texts(chunks)
    rows = []
    for t, e in zip(chunks, embs):
        rows.append(Chunk(document_id=doc.id, text=t, meta={"filename": file_path.name}, embedding=e))
    db.add_all(rows)
    db.commit()
    return len(rows)

backend/app/routers/upload.py

from fastapi import APIRouter, UploadFile, File, Form, Depends
from pathlib import Path
from ..auth import api_key_auth
from ..db import get_db
from sqlalchemy.orm import Session
from ..services.files import ingest_file
from ..config import get_settings

router = APIRouter(prefix="/upload", tags=["upload"], dependencies=[Depends(api_key_auth)])
settings = get_settings()

@router.post("")
async def upload(file: UploadFile = File(...), uploader: str = Form("anonymous"), department: str = Form("public"), db: Session = Depends(get_db)):
    dst = Path(settings.UPLOAD_DIR) / Path(file.filename).name  # drop any client-supplied directory parts
    dst.parent.mkdir(parents=True, exist_ok=True)
    content = await file.read()
    dst.write_bytes(content)
    n = ingest_file(db, dst, uploader, department)
    return {"ok": True, "chunks": n}

backend/app/routers/chat.py

from fastapi import APIRouter, Depends
from sqlalchemy.orm import Session
from ..auth import api_key_auth
from ..db import get_db
from ..schemas import ChatRequest, ChatResponse, Citation
from ..services.llm import LLMClient
from ..services.rag import answer_with_rag

router = APIRouter(prefix="/chat", tags=["chat"], dependencies=[Depends(api_key_auth)])
_llm = LLMClient()

@router.post("/completion", response_model=ChatResponse)
def completion(payload: ChatRequest, db: Session = Depends(get_db)) -> ChatResponse:
    # Answer the most recent user message, not the first one in the history
    user_msg = next((m.content for m in reversed(payload.messages) if m.role == "user"), "")
    out = answer_with_rag(db, _llm, user_msg, payload.department)
    return ChatResponse(answer=out["answer"], citations=[Citation(**c) for c in out["citations"]])

backend/app/routers/admin.py

from fastapi import APIRouter, Depends
from sqlalchemy.orm import Session
from ..auth import api_key_auth
from ..db import get_db
from ..models import Document, Chunk

router = APIRouter(prefix="/admin", tags=["admin"], dependencies=[Depends(api_key_auth)])

@router.get("/stats")
def stats(db: Session = Depends(get_db)):
    docs = db.query(Document).count()
    chunks = db.query(Chunk).count()
    return {"documents": docs, "chunks": chunks}

backend/app/main.py

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from .config import get_settings
from .db import Base, engine
from . import models
from .routers import chat, upload, admin

settings = get_settings()

app = FastAPI(title=settings.APP_NAME, version=settings.APP_VERSION)
app.add_middleware(
    CORSMiddleware,
    allow_origins=[o.strip() for o in settings.ALLOW_ORIGINS.split(",")],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

Base.metadata.create_all(bind=engine)

app.include_router(upload.router)
app.include_router(chat.router)
app.include_router(admin.router)

@app.get("/")
def root():
    return {"ok": True, "service": settings.APP_NAME}

backend/requirements.txt

fastapi>=0.111
uvicorn[standard]>=0.30
pydantic>=2.7
pydantic-settings>=2.2
requests>=2.32
SQLAlchemy>=2.0
psycopg[binary,pool]>=3.2
pgvector>=0.2.5
sentence-transformers>=3.0
PyMuPDF>=1.24
openai>=1.35

backend/Dockerfile

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Postgres (pgvector)

postgres/init.sql

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
  id SERIAL PRIMARY KEY,
  filename TEXT,
  mime TEXT,
  uploader TEXT,
  department TEXT DEFAULT 'public',
  created_at TIMESTAMP DEFAULT now()
);

CREATE TABLE IF NOT EXISTS chunks (
  id SERIAL PRIMARY KEY,
  document_id INT REFERENCES documents(id) ON DELETE CASCADE,
  text TEXT,
  meta JSONB,
  embedding vector(512)
);

-- HNSW index (needs pgvector >= 0.5; vector_cosine_ops matches the <=> operator used by the search query)
CREATE INDEX IF NOT EXISTS idx_chunks_emb_hnsw ON chunks USING hnsw (embedding vector_cosine_ops);
CREATE INDEX IF NOT EXISTS idx_docs_dept ON documents(department);
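
A quick way to confirm the extension is active and new enough for HNSW (a sketch; uses the 5433 host port mapped in docker-compose.yml):

from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg://postgres:postgres@localhost:5433/kb")
with engine.connect() as conn:
    ver = conn.execute(text("SELECT extversion FROM pg_extension WHERE extname = 'vector'")).scalar()
    print(ver)  # HNSW requires pgvector >= 0.5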

Frontend (React + Vite)

frontend/package.json

{
  "name": "kb-copilot-ui",
  "version": "0.1.0",
  "private": true,
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vite build",
    "preview": "vite preview --port 5173"
  },
  "dependencies": {
    "react": "^18.2.0",
    "react-dom": "^18.2.0"
  },
  "devDependencies": {
    "@types/react": "^18.2.66",
    "@types/react-dom": "^18.2.22",
    "typescript": "^5.5.4",
    "vite": "^5.3.4"
  }
}

frontend/src/Uploader.tsx

import React, { useState } from 'react'

const API_BASE = import.meta.env.VITE_API_BASE || 'http://localhost:8000'
const API_KEY = import.meta.env.VITE_API_KEY || 'changeme'

export default function Uploader(){
  const [file, setFile] = useState<File|null>(null)
  const [dept, setDept] = useState('public')
  const [msg, setMsg] = useState('')

  const upload = async () => {
    if(!file) return
    const fd = new FormData()
    fd.append('file', file)
    fd.append('uploader', 'demo@company.com')
    fd.append('department', dept)
    const r = await fetch(`${API_BASE}/upload`, { method:'POST', headers:{ 'x-api-key': API_KEY }, body: fd})
    const data = await r.json()
    setMsg(JSON.stringify(data))
  }

  return (
    <div style={{border:'1px solid #ddd', padding:12, borderRadius:8}}>
      <h3>上傳文件並建索引</h3>
      <input type="file" onChange={e=>setFile(e.target.files?.[0]||null)} />
      <select value={dept} onChange={e=>setDept(e.target.value)} style={{marginLeft:8}}>
        <option value="public">public</option>
        <option value="sales">sales</option>
        <option value="ops">ops</option>
        <option value="hr">hr</option>
      </select>
      <button onClick={upload} style={{marginLeft:8}}>上傳</button>
      <div style={{marginTop:8, whiteSpace:'pre-wrap'}}>{msg}</div>
    </div>
  )
}

frontend/src/App.tsx

import React, { useState } from 'react'
import Uploader from './Uploader'

const API_BASE = import.meta.env.VITE_API_BASE || 'http://localhost:8000'
const API_KEY = import.meta.env.VITE_API_KEY || 'changeme'

type Msg = { role: 'user'|'assistant'|'system'; content: string }

type Citation = { source: string; score: number; snippet: string }

type ChatResp = { answer: string; citations: Citation[] }

export default function App(){
  const [messages, setMessages] = useState<Msg[]>([])
  const [department, setDepartment] = useState('public')
  const [input, setInput] = useState('')
  const [loading, setLoading] = useState(false)

  const send = async () => {
    if(!input.trim()) return
    const newMsgs: Msg[] = [...messages, { role: 'user', content: input }]
    setMessages(newMsgs)
    setInput('')
    setLoading(true)
    try{
      const r = await fetch(`${API_BASE}/chat/completion`,{
        method:'POST', headers:{'Content-Type':'application/json','x-api-key':API_KEY},
        body: JSON.stringify({messages: newMsgs, department})
      })
      const data: ChatResp = await r.json()
      setMessages([...newMsgs, { role: 'assistant', content: data.answer + '\n\n來源:\n' + data.citations.map(c=>`- ${c.source} (score=${c.score.toFixed(3)})`).join('\n') }])
    } finally { setLoading(false) }
  }

  return (
    <div style={{maxWidth: 900, margin:'40px auto', fontFamily:'system-ui'}}>
      <h1>企業知識庫 Copilot</h1>
      <div style={{display:'grid', gridTemplateColumns:'1fr', gap:16}}>
        <Uploader/>
        <div style={{border:'1px solid #ddd', borderRadius:12, padding:16, minHeight:380}}>
          {messages.length===0 && <p>輸入問題,例如:『依 HR 手冊,請說明加班申請流程與審批時限』</p>}
          {messages.map((m,i)=> (
            <div key={i} style={{margin:'12px 0'}}>
              <b>{m.role === 'user' ? '你' : 'AI'}</b>
              <div style={{whiteSpace:'pre-wrap'}}>{m.content}</div>
            </div>
          ))}
        </div>
        <div style={{display:'flex', gap:8}}>
          <input value={input} onChange={e=>setInput(e.target.value)} placeholder="輸入訊息..." style={{flex:1, padding:12, borderRadius:8, border:'1px solid #ccc'}}/>
          <select value={department} onChange={e=>setDepartment(e.target.value)}>
            <option value="public">public</option>
            <option value="sales">sales</option>
            <option value="ops">ops</option>
            <option value="hr">hr</option>
          </select>
          <button onClick={send} disabled={loading} style={{padding:'12px 16px', borderRadius:8}}>{loading? '思考中...' : '送出'}</button>
        </div>
      </div>
    </div>
  )
}

Other frontend files

frontend/index.html, frontend/src/main.tsx, and frontend/vite.config.ts are identical to the previous project's and can be reused as-is.

Docker and environment

.env.example

API_KEY=changeme
ALLOW_ORIGINS=*
DATABASE_URL=postgresql+psycopg://postgres:postgres@db:5432/kb
LLM_PROVIDER=ollama
OPENAI_API_KEY=
OPENAI_MODEL=gpt-4o-mini
OLLAMA_HOST=http://ollama:11434
OLLAMA_MODEL=qwen2.5:7b-instruct
EMBEDDING_MODEL=BAAI/bge-small-zh-v1.5
UPLOAD_DIR=/app/data/uploads

docker-compose.yml

version: "3.9"
services:
  db:
    image: pgvector/pgvector:pg16
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=kb
    ports: ["5433:5432"]
    volumes:
      - dbdata:/var/lib/postgresql/data
      - ./postgres/init.sql:/docker-entrypoint-initdb.d/init.sql
  api:
    build: ./backend
    env_file: .env
    volumes:
      - ./data:/app/data
    depends_on: [db]
    ports: ["8000:8000"]
  ui:
    image: node:20-alpine
    working_dir: /app
    volumes:
      - ./frontend:/app
    command: sh -c "npm install && npm run dev -- --host"
    environment:
      - VITE_API_BASE=http://localhost:8000
      - VITE_API_KEY=${API_KEY}
    ports: ["5173:5173"]
  ollama:
    image: ollama/ollama:0.3.14
    ports: ["11434:11434"]
    volumes:
      - ollama:/root/.ollama
volumes:
  dbdata: {}
  ollama: {}

README (condensed)

Enterprise Internal Knowledge Base Copilot (Demo)

Getting started

  1. Prepare the environment
cp .env.example .env
# If using Ollama: start the service first, then pull a Chinese-friendly model
docker compose up -d ollama
curl http://localhost:11434/api/pull -d '{"name":"qwen2.5:7b-instruct"}'
  2. Start all services
docker compose up -d db api ui
  3. Upload a document to build the index
curl -X POST -H "x-api-key: ${API_KEY}" -F "file=@docs/HR_手冊.pdf" -F "uploader=alice@corp" -F "department=hr" http://localhost:8000/upload
  4. Ask questions (RAG Q&A)
  • Via the frontend chat, or the API (double quotes so ${API_KEY} expands):
curl -X POST -H "x-api-key: ${API_KEY}" -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"依 HR 手冊,加班申請流程是什麼?"}], "department":"hr"}' \
  http://localhost:8000/chat/completion

Evaluation suggestions (offline)

  • Collect 50–100 internal QA pairs as a test set (annotate gold answers and sources)
  • Metrics: answer accuracy, citation accuracy, appropriate-refusal rate
  • Tools: Ragas (Python), or assertion tests written with pytest (see the sketch below)
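
backend/tests/test_chat.py appears in the project tree but is not shown above; a minimal sketch of what it could assert (an integration-style test: it assumes a running stack, indexed documents, and the default API key):

import os
import requests

API = os.environ.get("API_BASE", "http://localhost:8000")
KEY = os.environ.get("API_KEY", "changeme")

def test_chat_returns_answer_with_citations():
    payload = {
        "messages": [{"role": "user", "content": "依 HR 手冊,加班申請流程是什麼?"}],
        "department": "hr",
    }
    r = requests.post(f"{API}/chat/completion", json=payload,
                      headers={"x-api-key": KEY}, timeout=120)
    assert r.status_code == 200
    data = r.json()
    assert data["answer"].strip()       # non-empty answer
    assert len(data["citations"]) > 0   # at least one cited source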

Hardening roadmap

  • ✅ P0: document upload → chunking → embedding → RAG Q&A (with citations)
  • ⏩ P1: OAuth/SSO and role-based permissions (department/group), audit logging
  • ⏩ P2: incremental indexing (versioning), cleaning up vectors when documents are deleted
  • ⏩ P3: observability (Langfuse/OTel) and prompt A/B testing
  • ⏩ P4: SQL/BI tool calling for hybrid "policy + data" answers
