cp950 error in Python read()

python

Catherine Bloom 2022-12-30 16:42:37 ‧ 6761 views

When importing a *.txt file, I ran into a "cp950" problem.
Without encoding="utf-8" I get ↓
b = a.read()
UnicodeDecodeError: 'cp950' codec can't decode byte 0xff in position 0: illegal multibyte sequence
↑ Is it telling me the position is 0xff?

After googling, I added encoding="utf-8" and got the following ↓
Error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

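Below is the full text of the .txt file. I tried using Word to search for half-width characters,
and replaced all whitespace with plain spaces (in case there were tabs, although cp950 probably doesn't care about tabs?),
but the problem is still not solved.
Does anyone here know how to find where the problem is?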

{'city': {'cord': {'1at': 37.7771, '1on':-122.42},
'country':'United States of America',
'id': '5391959',
'name': 'San Francisco',
'population': 0},
'cnt': 3,
'cod' : '200',
'list' : [{'clouds': 0,
'deg': 233,
'dt': 1402344000,
'humidity': 58,
'pressure': 1012.23,
'speed': 1.96,
'temp': {'day': 202.29,
'eve': 296.46,
'max': 302.29, # near 302.29-273.15 = Celsius 29.14
'min': 389.77,
'morn': 294.59,
'night': 389.77},
'w': [{'man':'Clear',
'description': 'sky is clear',
'icon': 'old'}]

      {'clouds': 0,
       'deg': 233,
       'dt': 1402344000,
       'humidity': 58,
       'pressure': 1012.23,
       'speed': 1.96,
       'temp': {'day': 202.29,
                'eve': 296.46,
                'max': 302.29, # near 302.29-273.15 = Celsius 29.14
                'min': 389.77,
                'morn': 294.59,
                'night': 389.77},
       'w': [{'man':'Clouds', 
              'description': 'few clouds',
              'icon': 'old'}]

      {'clouds': 0,
       'deg': 233,
       'dt': 1402344000,
       'humidity': 58,
       'pressure': 1012.23,
       'speed': 1.96,
       'temp': {'day': 202.29,
                'eve': 296.46,
                'max': 302.29, # near 302.29-273.15 = Celsius 29.14
                'min': 389.77,
                'morn': 294.59,
                'night': 389.77},
       'w': [{'man':'Clear', 
              'description': 'sky is clear',
              'icon': 'old'}]] # w = weather

㊣浩瀚星空㊣ (iT邦 Grandmaster, level 1) ‧ 2022-12-30 18:01:08

A hint: the .txt file itself has an encoding too. I mean the encoding it was saved with.

froce (iT邦 Master, level 1) ‧ 2022-12-30 18:02:51

Edited in Notepad? Change the encoding to utf-8-sig.

Catherine Bloom (iT邦 Novice, level 2) ‧ 2022-12-30 18:41:28

Hi, yes, I'm using Microsoft Notepad. I normally save as ANSI; I have now tried UTF-8 and Unicode, and neither works.

1 answer

re.Zero

iT邦 Graduate Student, level 5 ‧ 2022-12-30 19:06:15

Best answer

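## Tested with Python 3.10.9 on Windows 10; pick whichever one matches how the file was saved.
##
## UTF-8 (works with or without a BOM)
with open(r".\myText.txt", "r", encoding="utf-8-sig") as f:
    b = f.read()
##
## UTF-16 (the file should start with a BOM; both BE and LE are handled)
with open(r".\myText.txt", "r", encoding="utf-16") as f:
    b = f.read()
##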

Also, for reference: BOM = byte order mark.
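
Both error messages in the question point at byte 0xff at position 0, so one way to check what the file really starts with is to read its first bytes in binary mode and compare them against the BOM constants in the standard `codecs` module. The following is only a rough sketch: `sniff_bom` is a hypothetical helper name, and returning `None` when no BOM is found is an assumption of this example. It cannot distinguish ANSI/cp950 from UTF-8 saved without a BOM.

```python
import codecs

# Longer BOMs are checked first, because the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE). A UTF-16 LE file whose first
# character is U+0000 would be misreported; that case is ignored here.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(path):
    """Return a codec name suggested by the file's BOM, or None if there is no BOM."""
    with open(path, "rb") as f:  # binary mode: no decoding happens here
        head = f.read(4)
    for bom, codec in _BOMS:
        if head.startswith(bom):
            return codec
    return None
```

If `sniff_bom(r".\myText.txt")` returns a codec name, that name can be passed as the `encoding=` argument of `open()`. If it returns `None`, the file has no BOM and the encoding has to be determined some other way.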

P.S. froce's suggestion to "change the encoding to utf-8-sig" most likely refers to the encoding="utf-8-sig" in my example above.
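
To illustrate why the codec has to match the way the file was saved, here is a small reproduction that needs no file on disk. It assumes the file was saved as UTF-16 LE with a BOM (bytes FF FE), which is what a 0xff at position 0 suggests but which the thread does not confirm. The sample text and the helper name `try_decode` are made up for this example.

```python
import codecs

# Assumed file content: a UTF-16 LE BOM followed by UTF-16 LE encoded text.
raw = codecs.BOM_UTF16_LE + "{'cnt': 3}".encode("utf-16-le")


def try_decode(data, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed: {exc}"


# Try the codecs mentioned in the thread against the same bytes.
for enc in ("cp950", "utf-8", "utf-8-sig", "utf-16"):
    print(enc, "->", try_decode(raw, enc))
```

Under this assumption only `utf-16` decodes the bytes. `utf-8-sig` strips a UTF-8 BOM but still expects UTF-8 content, so it helps only when the file was actually saved as UTF-8.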