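Today we'll cover an important concept in natural language processing: Named Entity Recognition (NER). In Chinese it is usually translated as 命名實體辨識 or 命名實體識別, and some people render it as 專有名詞辨識 (proper noun recognition). In NLP, things that refer to real-world objects, such as products, places, and people, are called named entities, and extracting them from text is called named entity recognition.
Let's see how Hugging Face lets us do NER quickly and easily!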
sample_text = """
Alistair Darling has been forced to consider a second bailout for banks as the lending drought worsens.
The Chancellor will decide within weeks whether to pump billions more into the economy as evidence mounts that the 37 billion part-nationalisation last year has failed to keep credit flowing.
Mr Darling, the former Liberal Democrat chancellor, admitted that the situation had become critical but insisted that there was still time to turn things around.
He told the BBC that the crisis in the banking sector was the most serious problem facing the economy but also highlighted other issues, such as the falling value of sterling and the threat of inflation.
"The worst fears about the banking crisis seem not to be panning out," he said, adding that there had not been a single banker arrested or charged over the crash.
"The economy, the economy"
Mr Darling said "there's been a very, very strong recovery" since the autumn of 2008.
"There are very big problems ahead of us, not least of which is inflation. It is likely to be a very high inflation rate. "
The economy is expected to grow by 0.3% in the quarter to the end of this year.
"""
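from transformers import pipeline
import pandas as pd

# No model is specified, so the pipeline falls back to the library's default English NER checkpoint.
ner = pipeline("ner")
# Returns a list of dicts, one per token that was tagged as part of an entity.
outputs = ner(sample_text)
# Show the predictions as a table (displayed automatically in a notebook cell).
pd.DataFrame(outputs)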
This gives the result shown in the image below.
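Called this way, the pipeline reports one row per token, so a name such as "Alistair Darling" can be spread over several word pieces. The library can merge them for you (for example `pipeline("ner", aggregation_strategy="simple")`), but to make the idea concrete, here is a minimal sketch of the merging step in plain Python. The function `group_entities` is a hypothetical helper written for this post, not the library's own algorithm, and the token list is hand-written to mimic the pipeline's output format (`entity`, `score`, `word`, `start`, `end`) rather than copied from a real model run.

```python
def group_entities(tokens, text):
    """Merge adjacent token-level predictions that share an entity type.

    Simplified illustration only; not the transformers implementation.
    """
    spans = []
    for tok in tokens:
        # Drop the B-/I- prefix so "B-PER" and "I-PER" count as the same type.
        label = tok["entity"].split("-", 1)[-1]
        prev = spans[-1] if spans else None
        # Extend the previous span when the type matches, the token does not
        # open a new entity, and it starts right after the previous token
        # (directly adjacent or separated by a single character such as a space).
        if (
            prev is not None
            and prev["entity_group"] == label
            and not tok["entity"].startswith("B-")
            and tok["start"] - prev["end"] <= 1
        ):
            prev["end"] = tok["end"]
            prev["scores"].append(tok["score"])
        else:
            spans.append(
                {
                    "entity_group": label,
                    "start": tok["start"],
                    "end": tok["end"],
                    "scores": [tok["score"]],
                }
            )
    # Recover the surface text from the character offsets and average the scores.
    return [
        {
            "entity_group": s["entity_group"],
            "word": text[s["start"]:s["end"]],
            "score": sum(s["scores"]) / len(s["scores"]),
            "start": s["start"],
            "end": s["end"],
        }
        for s in spans
    ]


# Hand-written example in the pipeline's token-level format (assumed values).
example_text = "Alistair Darling told the BBC."
example_tokens = [
    {"entity": "I-PER", "score": 0.99, "word": "Al", "start": 0, "end": 2},
    {"entity": "I-PER", "score": 0.98, "word": "##ista", "start": 2, "end": 6},
    {"entity": "I-PER", "score": 0.97, "word": "##ir", "start": 6, "end": 8},
    {"entity": "I-PER", "score": 0.99, "word": "Darling", "start": 9, "end": 16},
    {"entity": "I-ORG", "score": 0.99, "word": "BBC", "start": 26, "end": 29},
]

entities = group_entities(example_tokens, example_text)
print([(e["entity_group"], e["word"]) for e in entities])
# → [('PER', 'Alistair Darling'), ('ORG', 'BBC')]
```

With real pipeline output you would pass `outputs` and `sample_text` instead of the hand-written example, and `pd.DataFrame(entities)` would then show one row per entity instead of one row per token.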
That wraps up this supplement on NER, an important concept in NLP. Tomorrow we'll tackle Q&A!