{DAY 28} Matplotlib Plotting 2

2021 iThome Ironman Contest


AI & Data

Part 28 of the series "From Databases to Data Analysis and Visualization"

13th Ironman Contest

yywoli

Team: Obit0 Studio

2021-10-10 16:11:36


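Introduction

This article builds on what we learned yesterday: we change the parameters we pass and draw more kinds of charts.

The article covers:
3. Line charts, scatter plots, and histograms
4. Bar charts
5. Drawing on subplots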

Line charts, scatter plots, and histograms

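Today we revisit the charts drawn yesterday and change their parameters to adjust how they are displayed.

figsize sets the figure size, style changes the line style, color changes the line colors, and legend controls whether the legend is shown (the legend entries take their names from the columns).

group_b.plot(y=["math score","reading score"],figsize=(20,5), style="--", color=["purple","pink"], legend=True)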
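Since group_b comes from yesterday's article, here is a minimal self-contained sketch of the same call. The DataFrame group_b_demo and its numbers are made up for illustration; only the plotting parameters mirror the line above.

```python
import pandas as pd

# Hypothetical stand-in for group_b: five students, two score columns.
group_b_demo = pd.DataFrame({
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
})

# figsize sets the canvas size, style the line style, color one color per column.
# legend=True shows a legend whose entries are the column names.
ax = group_b_demo.plot(
    y=["math score", "reading score"],
    figsize=(20, 5),
    style="--",
    color=["purple", "pink"],
    legend=True,
)

# DataFrame.plot returns a matplotlib Axes, so the result can be inspected.
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

Because the call returns an Axes object, further tweaks such as ax.set_title() can be applied afterwards.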

Next, put the same two columns on a scatter plot to see how each student's scores are distributed.

plt.scatter(group_b.index ,group_b["math score"],color="purple")
plt.scatter(group_b.index, group_b["reading score"],color="pink")
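With two point clouds on one chart it can help to say which color is which. The sketch below adds a label to each scatter call and then shows a legend; group_b_demo and the axis label text are assumptions for illustration, not data from the article.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for group_b.
group_b_demo = pd.DataFrame({
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
})

plt.figure()
# label= names each series so plt.legend() can tell them apart.
plt.scatter(group_b_demo.index, group_b_demo["math score"],
            color="purple", label="math score")
plt.scatter(group_b_demo.index, group_b_demo["reading score"],
            color="pink", label="reading score")
plt.xlabel("student index")
plt.ylabel("score")
plt.legend()

# Grab the current Axes to inspect what was drawn.
ax = plt.gca()
scatter_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```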

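If we now want to see how the scores are distributed from low to high, we can use a histogram.

alpha adjusts the transparency and color sets the color, which makes overlapping histograms easier to compare.

plt.title() sets the title.

plt.xlabel() and plt.ylabel() set the x-axis and y-axis labels.

plt.legend() sets the legend.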

plt.hist(group_b["math score"],color="g",alpha=0.3)
plt.hist(group_b["reading score"],alpha=0.4)
plt.hist(group_b["writing score"],alpha=0.5,color="pink")
plt.title('score distribution')
plt.xlabel("score")
plt.ylabel("numbers")
plt.legend(["math score","reading score","writing score"])
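One detail that may matter when overlaying histograms: by default each plt.hist() call picks its own bin edges from its own data, so the bars of different subjects may not line up. A possible remedy, sketched here with randomly generated scores (not the article's data), is to pass the same bins to every call.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical scores: 200 random integers from 30 to 100 per subject.
rng = np.random.default_rng(0)
math_scores = rng.integers(30, 101, size=200)
reading_scores = rng.integers(30, 101, size=200)

# Shared bin edges 0, 10, ..., 100 so both histograms use identical bars.
bins = np.arange(0, 101, 10)

plt.figure()
math_counts, math_edges, _ = plt.hist(
    math_scores, bins=bins, color="g", alpha=0.3, label="math score")
reading_counts, reading_edges, _ = plt.hist(
    reading_scores, bins=bins, alpha=0.4, label="reading score")
plt.title("score distribution")
plt.xlabel("score")
plt.ylabel("numbers")
plt.legend()
```

plt.hist() returns the counts and the bin edges, which is a convenient way to check that both calls really used the same bins.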

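Bar charts

Now suppose we want to compare the average score of each group in each subject.

A bar chart is a good fit for this.

As before, start with .groupby() and then call .mean() to compute the averages.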

race_ethnicity = df.groupby("race/ethnicity").mean(numeric_only=True)  # average only the numeric score columns
race_ethnicity
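To show what the aggregation produces, here is a small sketch on a made-up DataFrame (df_demo and its values are hypothetical). It passes numeric_only=True because recent pandas versions raise an error when .mean() meets text columns such as gender.

```python
import pandas as pd

# Hypothetical miniature of the dataset: one text column plus two score columns.
df_demo = pd.DataFrame({
    "gender": ["female", "male", "female", "male"],
    "race/ethnicity": ["group A", "group A", "group B", "group B"],
    "math score": [60, 70, 80, 90],
    "reading score": [65, 75, 85, 95],
})

# Group by race/ethnicity and average only the numeric columns.
race_ethnicity_demo = df_demo.groupby("race/ethnicity").mean(numeric_only=True)
print(race_ethnicity_demo)
```

The result has one row per group and one column per score, which is the shape a grouped bar chart expects.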

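Two ways to draw it are introduced below.

Call .plot.bar() directly on the aggregated DataFrame.

Or call .plot() and set the parameters, including kind, inside the parentheses.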

race_ethnicity.plot(kind='bar',  # chart type
                    title='scores in different group',  # title
                    xlabel='group',  # x-axis label
                    ylabel='score',  # y-axis label
                    legend=True,  # show the legend
                    figsize=(10, 5))  # figure size
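The other way mentioned above, .plot.bar(), takes the same keyword arguments except kind. A minimal sketch, assuming a hypothetical table of averages (race_ethnicity_demo and its numbers are invented for illustration); rot=0 is an extra option that keeps the group names horizontal.

```python
import pandas as pd

# Hypothetical averages per group; the real values come from df.groupby(...).mean().
race_ethnicity_demo = pd.DataFrame(
    {"math score": [61.6, 63.5, 64.5], "reading score": [64.7, 67.4, 69.1]},
    index=pd.Index(["group A", "group B", "group C"], name="race/ethnicity"),
)

# .plot.bar() is shorthand for .plot(kind="bar").
ax = race_ethnicity_demo.plot.bar(
    title="scores in different group",
    xlabel="group",
    ylabel="score",
    legend=True,
    figsize=(10, 5),
    rot=0,  # keep the x tick labels horizontal
)
```

Each group gets one bar per score column, so three groups and two columns give six bars.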

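Drawing on subplots

Now let's practice taking the charts drawn above and placing them on a single canvas by arranging them as subplots.

First create four subplots.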

fig = plt.figure(figsize=(20,10))
axe1 = fig.add_subplot(2, 2, 1) 
axe2 = fig.add_subplot(2, 2, 2) 
axe3 = fig.add_subplot(2, 2, 3) 
axe4 = fig.add_subplot(2, 2, 4)
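As an alternative to four add_subplot() calls, plt.subplots() can create the figure and the whole 2x2 grid at once. This is a sketch of that variant, not the approach used in the rest of this article; the variable names simply mirror the ones above.

```python
import matplotlib.pyplot as plt

# Create the figure and a 2x2 grid of Axes in one call.
fig, axes = plt.subplots(2, 2, figsize=(20, 10))

# axes is a 2x2 NumPy array; flatten it to unpack row by row.
axe1, axe2, axe3, axe4 = axes.flatten()
```

The order matches add_subplot(2, 2, n): left to right, then top to bottom.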

Then put the four charts onto them, one per subplot.

There are two ways to add a title to a subplot:

ax.title.set_text(" ")
ax.set_title(" ")
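Both forms end up changing the same title text on the Axes. A quick sketch to confirm this, using placeholder title strings:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Way 1: modify the title Text object directly.
ax.title.set_text("first title")
title_after_set_text = ax.get_title()

# Way 2: use the Axes method, which also accepts styling options such as fontsize.
ax.set_title("second title")
title_after_set_title = ax.get_title()
```

set_title() is usually the more flexible choice, since it also takes arguments like loc and fontsize.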

# Subplot 1: share of each gender in the whole dataset
axe1.pie(numbers_of_gender, labels=type_of_gender,autopct="%0.2f%%")
axe1.title.set_text("portion of gender")

# Subplot 2: share of each group in the whole dataset
axe2.pie(amounts, labels=category,autopct="%0.2f%%")
axe2.title.set_text("portion of groups")

# Subplot 3: average score of each group in each subject
axe3.set_title("scores on different group")
race_ethnicity.plot.bar(ax=axe3)

# Subplot 4: distribution of scores across the whole dataset
axe4.hist(df["math score"],color="g",alpha=0.3) # remember to adjust the transparency
axe4.hist(df["reading score"],alpha=0.4)
axe4.hist(df["writing score"],alpha=0.6,color="pink")
score_labels=["math score","reading score","writing score"]
axe4.legend(labels=score_labels)
axe4.set_title("scores distribution of all")