iT邦幫忙

1

[Day 7]在 SQLite 做關鍵字搜尋(支援分頁與匯出)

  • 分享至 

  • xImage
  •  

在 SQLite 做關鍵字搜尋會得到什麼
用關鍵字查 text 與 url
支援分頁:--page、--size
排序:--order id_desc/id_asc
可輸出查詢結果到 CSV:--out result.csv

程式碼(存成 search_db.py)
把下面整段貼到 project/search_db.py:

# search_db.py
import sqlite3, argparse, csv, math, os
from typing import Tuple, List

def connect(dbname: str):
    if not os.path.exists(dbname):
        raise SystemExit(f"❌ 找不到資料庫:{dbname},請先執行 Day 6 匯入。")
    return sqlite3.connect(dbname)

def build_where(keyword: str) -> Tuple[str, tuple]:
    kw = keyword.strip()
    if not kw:
        return "1=1", tuple()
    like = f"%{kw}%"
    return "(text LIKE ? OR url LIKE ?)", (like, like)

def search(dbname: str, keyword: str, page: int, size: int, order: str):
    con = connect(dbname)
    cur = con.cursor()

    where_sql, params = build_where(keyword)

    order_sql = "ORDER BY id DESC" if order == "id_desc" else "ORDER BY id ASC"

    # 總數
    cur.execute(f"SELECT COUNT(*) FROM links WHERE {where_sql}", params)
    total = cur.fetchone()[0]

    # 分頁
    offset = (page - 1) * size
    cur.execute(
        f"SELECT id, text, url FROM links WHERE {where_sql} {order_sql} LIMIT ? OFFSET ?",
        (*params, size, offset),
    )
    rows = cur.fetchall()
    con.close()
    return total, rows

def to_csv(rows: List[tuple], out: str):
    with open(out, "w", newline="", encoding="utf-8") as f:
        w = csv.writer(f)
        w.writerow(["id", "text", "url"])
        for r in rows:
            w.writerow(r)
    print(f"✅ 已輸出 CSV:{out}({len(rows)} 筆)")

def main():
    ap = argparse.ArgumentParser(description="在 SQLite 中搜尋爬蟲結果(關鍵字/分頁/匯出)")
    ap.add_argument("--db", default="crawler.db", help="SQLite 資料庫(預設 crawler.db)")
    ap.add_argument("--q", default="", help="關鍵字,會同時比對 text 與 url(LIKE 查詢)")
    ap.add_argument("--page", type=int, default=1, help="第幾頁(從 1 開始)")
    ap.add_argument("--size", type=int, default=10, help="每頁筆數")
    ap.add_argument("--order", choices=["id_desc", "id_asc"], default="id_desc", help="排序方式")
    ap.add_argument("--out", help="輸出 CSV 檔名(可選)")
    args = ap.parse_args()

    total, rows = search(args.db, args.q, args.page, args.size, args.order)
    pages = max(1, math.ceil(total / args.size))

    # 螢幕輸出
    print(f"🔎 查詢:「{args.q or '(全部)'}」 | 總筆數:{total} | 第 {args.page}/{pages} 頁(每頁 {args.size} 筆)")
    for (id_, text, url) in rows:
        short = (text or "").strip()
        if len(short) > 60:
            short = short[:60] + "..."
        print(f"[{id_:>4}] {short} - {url}")

    # 匯出
    if args.out:
        to_csv(rows, args.out)

if __name__ == "__main__":
    main()

怎麼用

  1. 查全部(預設第 1 頁、每頁 10 筆)
python search_db.py

  1. 用關鍵字查標題或網址(模糊比對)
python search_db.py --q python

  1. 分頁(第 2 頁、每頁 5 筆)
python search_db.py --q python --page 2 --size 5

  1. 改排序(id 由小到大)
python search_db.py --q ithelp --order id_asc

  1. 把查詢結果匯出成 CSV
python search_db.py --q job --out search_job.csv

實作:
https://ithelp.ithome.com.tw/upload/images/20250923/20169368aFvrqjr1aY.png
https://ithelp.ithome.com.tw/upload/images/20250923/20169368iRI7biH8he.png
小貼士
關鍵字查詢是 LIKE,會比對 text 與 url。
如果你昨天 save_to_db.py 沒有跑,先匯入再來查:

python save_to_db.py --csv clean_links.csv


圖片
  熱門推薦
圖片
{{ item.channelVendor }} | {{ item.webinarstarted }} |
{{ formatDate(item.duration) }}
直播中

尚未有邦友留言

立即登入留言