DAY25 Using dropna in Pandas to Remove NaN, Part 1 (Dropping Rows)

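2024 iThome 鐵人賽 (iThome Ironman Contest, 16th edition)

佛心分享-IT 人自學之術 (Generous Sharing: Self-Study for IT People)

Part 25 of the series 走在Pandas資料操縱與分析的路上持續前進 (Moving Steadily Forward on the Road of Pandas Data Manipulation and Analysis)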

٩۹(๑•̀ω•́ ๑)۶

2024-08-30 00:01:53

Over the past two days we covered the basic usage of drop in Pandas.
Today's topic is another very practical method: dropna.

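When a dataset is large enough and some of its values are NaN (Not a Number),
you can consider deleting the affected rows entirely.
If the goal is to remove every row that contains a NaN,
the Pandas dropna method does this quickly.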

Example

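First, build some data as a DataFrame,
or convert imported data into a DataFrame.
To make comparison easier, print the complete data first.

P.S. A NaN value and an empty string are deliberately included in this data.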

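import numpy as np
import pandas as pd

# Sample data: Height contains a NaN and City contains an empty string
studentsData = {
    'studentId': ['001', '002', '003'],
    'Name': ['A', 'B', 'C'],
    'Height': [175, np.nan, 164],  # use np.nan; the np.NaN alias was removed in NumPy 2.0
    'Weight': [80, 45, 75],
    'City': ['New York', 'Los Angeles', '']
}
students = pd.DataFrame(studentsData)
print(students)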

The printed data looks like this:

  studentId Name  Height  Weight         City
0       001    A   175.0      80     New York
1       002    B     NaN      45  Los Angeles
2       003    C   164.0      75
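
Before deciding whether to drop rows, it can help to see how many NaN values the data actually contains. The following is a minimal sketch, assuming the same sample data as above; the variable names `nan_per_column` and `rows_with_nan` are chosen here for illustration and are not part of the original article.

```python
import numpy as np
import pandas as pd

# Same sample data as in the article
students = pd.DataFrame({
    'studentId': ['001', '002', '003'],
    'Name': ['A', 'B', 'C'],
    'Height': [175, np.nan, 164],
    'Weight': [80, 45, 75],
    'City': ['New York', 'Los Angeles', ''],
})

# Count the NaN values in each column
nan_per_column = students.isna().sum()
print(nan_per_column)

# Boolean mask marking rows that contain at least one NaN
rows_with_nan = students.isna().any(axis=1)
print(students[rows_with_nan])
```

Only the Height column reports a missing value here; the empty string in City is not counted by isna.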

Dropping rows that contain NaN

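The syntax is very simple:
just append .dropna() to the DataFrame.
By default dropna returns a new DataFrame and leaves students itself unchanged.
It is used like this: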

print(students.dropna())

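The output is shown below.
Note that the row containing NaN (row index 1) has been removed,
while the row containing an empty string (row index 2) is still present.
This shows that dropna only acts on NaN (missing) values, not on empty strings.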

  studentId Name  Height  Weight      City
0       001    A   175.0      80  New York
2       003    C   164.0      75
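
If empty strings should also count as missing, one possible approach is to convert them to NaN before calling dropna. This is a sketch under the assumption that an empty string in this dataset means "no value recorded"; the names `with_nan` and `cleaned` are illustrative and do not come from the original article.

```python
import numpy as np
import pandas as pd

# Same sample data as in the article
students = pd.DataFrame({
    'studentId': ['001', '002', '003'],
    'Name': ['A', 'B', 'C'],
    'Height': [175, np.nan, 164],
    'Weight': [80, 45, 75],
    'City': ['New York', 'Los Angeles', ''],
})

# Assumption: '' means "missing", so turn empty strings into NaN first
with_nan = students.replace('', np.nan)

# Now dropna removes both the NaN row and the former empty-string row
cleaned = with_nan.dropna()
print(cleaned)

# dropna returned a new DataFrame; the original still has all its rows
print(len(students), len(cleaned))
```

Whether this conversion is appropriate depends on the data: in some datasets an empty string is a legitimate value and should be kept.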

Closing thoughts for today

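Today's dropna method is very practical when tidying up large amounts of data.
Cleaning a huge dataset is never easy,
and dropna gives you a quick first pass for filtering out rows you do not need,
so be sure to learn it.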
