import pandas as pd
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
print("Setup Complete")
Setup Complete
We use pd.read_csv to read the file.
# Path of the file to read
flight_filepath = "./flight_delays.csv"
# Read the file into a variable flight_data
flight_data = pd.read_csv(flight_filepath, index_col="Month")
The Month column is not stored in a date format, so parse_dates=True is not needed when reading the file.
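To make the parse_dates point concrete, here is a minimal sketch that reads two tiny in-memory CSVs. Both tables, their values, and the names numeric_months, dated_rows, by_month and by_date are hypothetical stand-ins, and the sketch assumes Month holds plain numbers rather than dates.

```python
import io

import pandas as pd

# Hypothetical two-row stand-ins for real CSV files (values are made up).
numeric_months = io.StringIO("Month,NK\n1,10.5\n2,7.25\n")
dated_rows = io.StringIO("Date,NK\n2015-01-01,10.5\n2015-02-01,7.25\n")

# Month holds plain integers, so no date parsing is requested.
by_month = pd.read_csv(numeric_months, index_col="Month")

# A column of real dates becomes a DatetimeIndex when parse_dates=True.
by_date = pd.read_csv(dated_rows, index_col="Date", parse_dates=True)

print(by_month.index.dtype)
print(type(by_date.index).__name__)
```

With parse_dates=True and an index column, pandas tries to parse the index as dates, which only helps when the index really contains dates.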
Because the dataset is small, a single line is enough to print all of it.
# Print the data
flight_data
Draw a bar chart for the flights with airline code NK (Spirit Airlines).
# Set the width and height of the figure
plt.figure(figsize=(10,6))
# Add title
plt.title("Average Arrival Delay for Spirit Airlines Flights, by Month")
# Bar chart showing average arrival delay for Spirit Airlines Flights by month
sns.barplot(x=flight_data.index, y=flight_data['NK'])
# Add label for vertical axis
plt.ylabel("Arrival delay (minutes)")
Text(0, 0.5, 'Arrival delay (minutes)')
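The bar chart command has a few main components:
- sns.barplot: tells the notebook to create a bar chart.
- x=flight_data.index: uses the index as the data for the x-axis.
- y=flight_data['NK']: selects the data that sets the height of each bar.
Month is the index rather than a regular column, so it is selected with flight_data.index and not with flight_data['Month'].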
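The pieces listed above can be tried on a tiny frame. This is only a sketch: the three-row table toy and its values are made up, and the Agg backend is selected so that it also runs outside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for flight_data: Month is the index, NK is a column.
toy = pd.DataFrame(
    {"NK": [10.0, 7.5, 3.0]},
    index=pd.Index([1, 2, 3], name="Month"),
)

# x takes the index values, y takes the column that sets the bar heights.
ax = sns.barplot(x=toy.index, y=toy["NK"])
ax.set_ylabel("Arrival delay (minutes)")

# One bar is drawn per index value.
print(len(ax.patches))

# Month lives in the index, so it is not among the columns.
print("Month" in toy.columns)
```

Because Month is not a column here, toy["Month"] would raise a KeyError, which is why the index is passed as x.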
The code below turns the data in flight_data into a heatmap.
# Set the width and height of the figure
plt.figure(figsize=(14,7))
# Add title
plt.title("Average Arrival Delay for Each Airline, by Month")
# Heatmap showing average arrival delay for each airline by month
sns.heatmap(data=flight_data, annot=True)
# Add label for horizontal axis
plt.xlabel("Airline")
Text(0.5, 42.0, 'Airline')
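The heatmap command also has three components:
- sns.heatmap: tells the notebook to create a heatmap.
- data=flight_data: tells the heatmap which data to use.
- annot=True: writes each value inside its cell. Without it, only the colors are visible and the numbers are not shown.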
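The effect of annot can be checked by drawing the same data twice, side by side. The sketch below uses a made-up two-by-two table named toy (the AA column and all values are hypothetical) and counts the text labels seaborn places in each plot.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for flight_data with two airlines and two months.
toy = pd.DataFrame(
    {"AA": [6.0, 1.0], "NK": [10.0, 7.5]},
    index=pd.Index([1, 2], name="Month"),
)

fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))

# Left: colors only. Right: each cell also shows its value.
sns.heatmap(data=toy, annot=False, ax=left)
sns.heatmap(data=toy, annot=True, ax=right)

# Count the text annotations drawn inside each heatmap.
print(len(left.texts), len(right.texts))
```

Only the annotated heatmap carries text objects, one per cell, which is what makes the numbers readable on top of the colors.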