1

## dataFrame技巧九式- 讀、寫

dataFrame技巧總介紹-
python之pandas模組dataFrame九大技巧總覽- 增刪查改讀寫切併排

csv, excel是常見的檔案類型，

# 資料準備- 隨機創建資料表

``````import random
import pandas as pd
import numpy as np

def random_str(length=3, miss_rate = 0):
assert 0 <= miss_rate <= 1, 'probability must in [0,1]'
miss = random.choices(range(0,2), weights= [1-miss_rate, miss_rate])[0]
return None if miss else ''.join([chr(ord('a')+random.randrange(26)) for _ in range(length)])

def random_num(low, high, miss_rate = 0):
# miss_rate between 0,1
assert 0 <= miss_rate <= 1, 'probability must in [0,1]'
miss = random.choices(range(0,2), weights= [1-miss_rate, miss_rate])[0]
return np.nan if miss else random.randint(low,high)

print(miss)

def random_df(num):
department = ['A','B','C', None]
data = {"student_id":[random_num(10000,20000) for _ in range(num)],
"student_name":[random_str() for _ in range(num)],
"department":[random.choice(department) for _ in range(num)]}
return pd.DataFrame.from_dict(data)

random.seed(30) #為了確保每次執行程式生成的資料相同，此設固定的random seed
df1 = random_df(5)
print("原始資料:")
print(df1)
``````

``````原始資料:
student_id student_name department
0       14738          qha          B
1       13440          ftq          C
2       10794          ulr          A
3       12196          mah          C
4       17561          viy          B
``````

# 寫csv檔、excel檔範例

``````df1.to_excel('table.xls')
df1.to_csv('data.csv', encoding = 'big5')
df1.to_csv('data_no_index.csv', encoding = 'utf-8', index = False)
``````

`encoding`為編碼方式，

`index`預設是`True`

# 讀csv檔、excel檔範例

``````df2 = pd.read_excel('table.xls')
print(df2)
df3 = pd.read_csv('data.csv', encoding='big5')
print(df3)
df4 = pd.read_csv('data_no_index.csv', encoding='utf-8')
print(df4)
``````

``````   student_id student_name department
0       14738          qha          B
1       13440          ftq          C
2       10794          ulr          A
3       12196          mah          C
4       17561          viy          B

Unnamed: 0  student_id student_name department
0           0       14738          qha          B
1           1       13440          ftq          C
2           2       10794          ulr          A
3           3       12196          mah          C
4           4       17561          viy          B

student_id student_name department
0       14738          qha          B
1       13440          ftq          C
2       10794          ulr          A
3       12196          mah          C
4       17561          viy          B
``````