#Python code question

python txt text file code

s10955047 2023-02-01 00:16:23 ‧ 17710 views


How can I read a txt file and then group its contents?
For example:
The txt file contains: apple、hair、orange、candy、book
How can I use Python to pick out the foods?

Thanks~


2 answers

JamesDoge

iT邦 Expert level 1 ‧ 2023-02-01 07:43:18

Best answer

How can I use Python to pick out the foods?

Group the words first:

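# Read the txt file (one word per line)
with open('file.txt', 'r', encoding='utf-8') as file:
    content = file.read().splitlines()

# Group the contents: keep only the words that are foods
foods = [word.strip() for word in content
         if word.strip() in ['apple', 'orange', 'candy']]

# Print the grouping result
print(foods)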

Output:

['apple', 'orange', 'candy']
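
The snippet above assumes one word per line. If the file instead keeps all the words on a single line, separated by `、` or commas as the question's example might suggest, a sketch like the following may be closer to what is needed. It also keeps the non-food words in their own group. The names `split_words`, `group_words`, and `FOOD_WORDS` are made up for this illustration and are not part of the original answer.

```python
import re

# Hypothetical lookup set; extend it with whatever counts as food for you.
FOOD_WORDS = {"apple", "orange", "candy"}

def split_words(text):
    """Split text on newlines, commas, and the ideographic comma '、'."""
    return [w.strip() for w in re.split(r"[、,\n]", text) if w.strip()]

def group_words(words, food_words=FOOD_WORDS):
    """Return a dict with the words sorted into 'food' and 'other'."""
    groups = {"food": [], "other": []}
    for word in words:
        key = "food" if word.lower() in food_words else "other"
        groups[key].append(word)
    return groups

# In practice the text would come from the file, for example:
#     with open("file.txt", encoding="utf-8") as f:
#         text = f.read()
text = "apple、hair、orange、candy、book"
groups = group_words(split_words(text))
print(groups["food"])   # ['apple', 'orange', 'candy']
print(groups["other"])  # ['hair', 'book']
```

A set is used for the lookup so that membership checks stay fast even if the food list grows large.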

3 replies

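s10955047 iT邦 Novice level 5 ‧ 2023-02-01 14:32:34

Thanks~

Oo_花之舞__oO iT邦 Graduate Student level 5 ‧ 2023-02-02 06:39:57

Remember to mark the answer as the best answer; it is polite and respectful,
and it also lets other people know whether the answer was helpful!
Just a friendly reminder :)

s10955047 iT邦 Novice level 5 ‧ 2023-02-03 16:02:17

Thank you for the reminder!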


jeffeux

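iT邦 Novice level 4 ‧ 2023-02-04 04:42:57

Actually, I have a possibly very silly question:
how do you decide whether those words are foods?
I assumed you would need BERT or some other language model to judge...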

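# Start with a few example food words
FOODLIST = [
    "apple",
    "orange",
    "candy",
    "noodle",
    "rice",
    # ... etc., many more; ideally around ten thousand entries
]

# Choose between a table lookup and a model
AUTO_JUDGE = False

if not AUTO_JUDGE:
    # Plain table lookup: list which words count as food
    def word_is_food(word):
        return word in FOODLIST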
else:
    # Let an NLP model judge automatically
    import torch, transformers

    def load_model(modelname="bert-base-uncased"):
        # Load a BERT model as an example;
        # you can of course pick another model
        tokenizer = (transformers
            .AutoTokenizer.from_pretrained(
                modelname))
        model = (transformers
            .BertModel.from_pretrained(
                modelname))
        model.eval()  # inference only
        return model, tokenizer

    model, tokenizer = load_model()

    def similarity(vec1, vec2):
        """ cosine similarity: dot product over the product of the norms """
        return ((vec1 @ vec2)
            / (vec1.norm() * vec2.norm()))

    def embed(word):
        """ Mean of the word's token vectors, skipping [CLS] and [SEP] """
        with torch.no_grad():
            output = model(
                **(tokenizer(word, return_tensors="pt")))
        return output.last_hidden_state[0, 1:-1].mean(dim=0)

    # Embed the food words once instead of on every call
    FOOD_VECTORS = [embed(foodword) for foodword in FOODLIST]

    def word_is_food(word):
        """ Decide whether a word is a food """
        word_vector = embed(word)
        total_sim = sum(
            similarity(word_vector, food_vector)
            for food_vector in FOOD_VECTORS)
        # 0.5 is the threshold from the original post; it may need tuning
        return bool((total_sim / len(FOOD_VECTORS)) > 0.5)


foods = []
with open('file.txt', 'r', encoding='utf-8') as file:
    for line in file:
        word = line.strip()
        # Skip blank lines, then keep the words judged to be foods
        if word and word_is_food(word):
            foods.append(word)

print(foods)
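
The model branch above relies on cosine similarity, that is, the dot product of two vectors divided by the product of their lengths. As a quick sanity check that needs no model download, here is a minimal sketch on plain Python lists. The function name `cosine_similarity` and the toy vectors are made up for illustration; real word embeddings would have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Same direction gives 1.0, regardless of length
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
# Perpendicular vectors give 0.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # 0.0
```

Because the result depends only on direction, a fixed cutoff such as 0.5 is a judgment call and is worth checking against a few known food and non-food words.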

1 reply

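s10955047 iT邦 Novice level 5 ‧ 2023-02-07 20:46:54

I haven't studied BERT. Thanks for sharing, I'll look into it further. Is this a way of letting the computer learn which things are foods?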


IT邦幫忙