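Last week was mostly environment setup and parameter testing. This week I'm starting on some common application exercises, with topics closer to everyday use. Today begins with one of the most practical features: text summarization.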
Why do text summarization?
In daily life and at work, we often run into situations where we need to grasp the key points quickly.
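Hands-on
Hugging Face already provides a ready-made tool for this: pipeline("summarization") can be used directly to produce a summary.
I found an AI-related article online and took one passage from it for the model to summarize.
Source: https://www.ibm.com/think/insights/ai-generated-content
The code: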
from transformers import pipeline
# Build the summarization pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
# A short passage to test with
text = """
AI-generated content is any type of content, such as text, image, video or audio, which is created by artificial intelligence models.
These models are the result of algorithms trained on large datasets that enable them to produce new content that mimics the characteristics of the training data.
Popular generative AI models—such as ChatGPT, DALL-E, LLaMA and IBM Granite—apply deep learning techniques to generate text, images, audio and video that simulate human creativity.
"""
# Generate the summary (do_sample=False turns off sampling, so the output is deterministic)
summary = summarizer(text, max_length=60, min_length=20, do_sample=False)
print(summary[0]['summary_text'])
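The original passage gets condensed into a two-sentence summary. The model does capture the main points, but the result is also a little too terse and leaves out some details. That is both the benefit and the risk of using AI for this.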
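One rough way to see how much was condensed, and which details were dropped, is to compare the summary against the original text. The sketch below is only an illustration and not part of the Hugging Face API: the helper names word_count, compression_ratio, and missing_terms are made up here, the word count is a whitespace split rather than the model's token count, and the term check is a plain case-insensitive substring match.

```python
from typing import List


def word_count(s: str) -> int:
    """Count whitespace-separated words (a rough proxy, not model tokens)."""
    return len(s.split())


def compression_ratio(original: str, summary: str) -> float:
    """Summary length as a fraction of the original length, by word count."""
    original_words = word_count(original)
    if original_words == 0:
        raise ValueError("original text is empty")
    return word_count(summary) / original_words


def missing_terms(summary: str, terms: List[str]) -> List[str]:
    """Return the terms that do not appear in the summary (case-insensitive)."""
    lowered = summary.lower()
    return [t for t in terms if t.lower() not in lowered]


# Possible usage with the variables from the script above (assumes `text`
# and `summary` already exist; the term list is picked by hand from the
# source passage):
#
# result = summary[0]["summary_text"]
# print(compression_ratio(text, result))
# print(missing_terms(result, ["ChatGPT", "DALL-E", "LLaMA", "IBM Granite"]))
```

If the list of missing terms contains something that matters to you, raising max_length or summarizing a shorter passage are two things to try.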