import pandas as pd
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
print("Setup Complete")
Setup Complete
# Path of the file to read
iris_filepath = "./iris.csv"
# Read the file into a variable iris_data
iris_data = pd.read_csv(iris_filepath, index_col="Id")
# Print the first 5 rows of the data
iris_data.head()
We can use sns.distplot to draw a histogram.
# Histogram
sns.distplot(a=iris_data['Petal Length (cm)'], kde=False)
/opt/conda/lib/python3.6/site-packages/seaborn/distributions.py:2551: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
warnings.warn(msg, FutureWarning)
<AxesSubplot:xlabel='Petal Length (cm)'>
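sns.distplot takes two arguments here:
a= selects the column to plot (in this case, Petal Length (cm)).
kde=False turns off the density curve that distplot would otherwise draw on top of the histogram; we do not need it yet.
The next chart is a kernel density estimate (KDE) plot, which can be thought of as a smoothed histogram.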
# KDE plot
sns.kdeplot(data=iris_data['Petal Length (cm)'], shade=True)
<AxesSubplot:xlabel='Petal Length (cm)', ylabel='Density'>
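To make the idea of a "smoothed histogram" concrete: a KDE places a small bell curve on every observation and averages them. The sketch below does this by hand in plain Python, assuming a Gaussian kernel and a hand-picked bandwidth. seaborn chooses its bandwidth automatically, so its curve will not match these numbers; the sample values are made up.

```python
import math

def gaussian_kde_at(x, samples, bandwidth):
    """Estimate the density at x as the average of Gaussian bumps centred on each sample."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return norm * total / len(samples)

# Made-up petal lengths, for illustration only.
petal_lengths = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9]
bandwidth = 0.5  # hypothetical value; larger values give a smoother curve

# Density is high near a cluster of observations and low in the gap between clusters.
density_near_cluster = gaussian_kde_at(1.4, petal_lengths, bandwidth)
density_in_gap = gaussian_kde_at(3.0, petal_lengths, bandwidth)
```

Evaluating gaussian_kde_at on a grid of x values and connecting the points gives the kind of curve that kdeplot draws.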
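data= specifies the data to plot.
shade= controls whether the area under the curve is filled with color (newer seaborn releases recommend fill= for this).
We are not limited to the density of a single column; we can also create a two-dimensional KDE plot.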
# 2D KDE plot
sns.jointplot(x=iris_data['Petal Length (cm)'], y=iris_data['Sepal Width (cm)'], kind="kde", shade=True)
<seaborn.axisgrid.JointGrid at 0x7f405a194710>
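In this section we combine the plots for the different species in one chart.
First, we read the data in three parts, one file per species.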
# Paths of the files to read
iris_set_filepath = "./iris_setosa.csv"
iris_ver_filepath = "./iris_versicolor.csv"
iris_vir_filepath = "./iris_virginica.csv"
# Read the files into variables
iris_set_data = pd.read_csv(iris_set_filepath, index_col="Id")
iris_ver_data = pd.read_csv(iris_ver_filepath, index_col="Id")
iris_vir_data = pd.read_csv(iris_vir_filepath, index_col="Id")
# Print the first 5 rows of the Iris versicolor data
iris_ver_data.head()
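If the three per-species files are not at hand, a similar split can be derived from a single table. The sketch below assumes the table has a "Species" column (an assumption, not something shown above) and uses a tiny made-up DataFrame, iris_demo, so that it runs on its own.

```python
import pandas as pd

# Hypothetical miniature of iris.csv; assumes a "Species" column, values made up.
iris_demo = pd.DataFrame(
    {
        "Id": [1, 2, 3, 4, 5, 6],
        "Petal Length (cm)": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
        "Species": [
            "Iris-setosa", "Iris-setosa",
            "Iris-versicolor", "Iris-versicolor",
            "Iris-virginica", "Iris-virginica",
        ],
    }
).set_index("Id")

# Build one sub-DataFrame per species, keyed by the species name.
by_species = {name: group for name, group in iris_demo.groupby("Species")}

# Each entry plays the role of one per-species file.
iris_set_demo = by_species["Iris-setosa"]
```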
We call sns.distplot three times, once per dataset, and use label= to show each dataset's name in the legend.
# Histograms for each species
sns.distplot(a=iris_set_data['Petal Length (cm)'], label="Iris-setosa", kde=False)
sns.distplot(a=iris_ver_data['Petal Length (cm)'], label="Iris-versicolor", kde=False)
sns.distplot(a=iris_vir_data['Petal Length (cm)'], label="Iris-virginica", kde=False)
# Add title
plt.title("Histogram of Petal Lengths, by Species")
# Force legend to appear
plt.legend()
<matplotlib.legend.Legend at 0x7f405a5ab9b0>
To make the legend appear, we need to add plt.legend().
We can also use kdeplot to draw several distributions on one chart.
# KDE plots for each species
sns.kdeplot(data=iris_set_data['Petal Length (cm)'], label="Iris-setosa", shade=True)
sns.kdeplot(data=iris_ver_data['Petal Length (cm)'], label="Iris-versicolor", shade=True)
sns.kdeplot(data=iris_vir_data['Petal Length (cm)'], label="Iris-virginica", shade=True)
# Add title
plt.title("Distribution of Petal Lengths, by Species")
Text(0.5, 1.0, 'Distribution of Petal Lengths, by Species')