I currently have a table that looks like this:
https://ithelp.ithome.com.tw/upload/images/20221204/201538675f5EmnVka5.jpg
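I'd also like to ask: if I want to compute the monthly average of each pollutant, grouped by monitoring station (SiteName), ItemName, and month, how should I write it? The result I'm hoping for looks like this: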
https://ithelp.ithome.com.tw/upload/images/20221204/20153867OprvHqiVkc.jpg
This is what I have so far, and I'm not sure how to fix it:
https://ithelp.ithome.com.tw/upload/images/20221204/201538675tVHqjAxjt.jpg
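I originally thought that if the OP had the raw data, I would write a program to test against the OP's requirements.
Then I searched Google for air quality monitoring data and found that an API is available, along with a guide and examples for it: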
https://data.epa.gov.tw/guide
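I also applied online for API access, but this API returns at most 1,000 rows per request.
So you need to write a loop to download all the data you need.
That said, for the data the OP wants, there is no need to compute monthly averages yourself; the Environmental Protection Administration (環保署) already publishes "Air Quality Monitoring Monthly Values" (空氣品質監測月值):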
https://data.epa.gov.tw/dataset/detail/aqx_p_08
Download it and load it in code:
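import io

import pandas as pd
import requests

api_key = "YOUR_API_KEY"  # replace with your own API key
offset = 0    # index of the first row to fetch
limit = 1000  # rows per request (the API returns at most 1000)

# The example below downloads the first 1000 rows of the 2021 data.
# The filter string is kept as originally posted; check the API guide for the exact operator semantics.
url = f'https://data.epa.gov.tw/api/v2/aqx_p_08?format=csv&offset={offset}&limit={limit}&api_key={api_key}&filters=monitormonth,GR,202101|monitormonth,LE,202201'
url_data = requests.get(url).content
temp = pd.read_csv(io.StringIO(url_data.decode('utf-8')))
temp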
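Since each request returns at most 1,000 rows, the loop mentioned above could look roughly like the sketch below. This is only one possible way to page through the results, not an official client: `download_all`, `fetch_page`, and `max_pages` are names made up for illustration, and the behaviour of the API once `offset` runs past the last row is an assumption (the sketch stops on an empty or short page).

```python
import io
from typing import Callable

import pandas as pd
import requests

API_URL = "https://data.epa.gov.tw/api/v2/aqx_p_08"  # endpoint used earlier in this thread


def fetch_page(offset: int, limit: int, api_key: str, filters: str) -> pd.DataFrame:
    """Fetch one page of CSV rows (hypothetical helper; performs a network request)."""
    url = (
        f"{API_URL}?format=csv&offset={offset}&limit={limit}"
        f"&api_key={api_key}&filters={filters}"
    )
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    try:
        return pd.read_csv(io.StringIO(response.content.decode("utf-8")))
    except pd.errors.EmptyDataError:
        # Assumption: an empty body means there are no more rows.
        return pd.DataFrame()


def download_all(
    fetch: Callable[[int, int], pd.DataFrame],
    page_size: int = 1000,
    max_pages: int = 100,
) -> pd.DataFrame:
    """Call fetch(offset, limit) repeatedly until a page comes back empty or short."""
    pages = []
    for page in range(max_pages):  # max_pages guards against an endless loop
        chunk = fetch(page * page_size, page_size)
        if chunk.empty:
            break
        pages.append(chunk)
        if len(chunk) < page_size:  # a short page is treated as the last one
            break
    if not pages:
        return pd.DataFrame()
    return pd.concat(pages, ignore_index=True)


# Example call (needs a real API key, so it is left commented out):
# filters = "monitormonth,GR,202101|monitormonth,LE,202201"
# data = download_all(lambda off, lim: fetch_page(off, lim, "YOUR_API_KEY", filters))
```

Passing the fetch function in as an argument keeps the paging logic separate from the network call, so the loop can be tried out on a local DataFrame before spending API requests.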
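Hello, judging from your code, it looks like you want to use groupby to directly take the mean of each column you're interested in. Here is a mock example of the result you're after, for your reference =)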
import pandas as pd
df = pd.DataFrame({"Site_ID":["1","1","3","1","1","1","2","2","2","2"],"Site_Name":["TP","TP","TY","TP","TP","TP","KH","KH","KH","KH"],"Month":["1","1","5","7","2","8","4","3","4","9"],"ItemName":["CO","CO3","CO","CO2","CO3","CO","CO3","CO","CO","CO3"],"Concentration":[1,5,57,8,9,4,4,14,97,12]})
df.head(5)
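# Group by station, pollutant, and month, then average only the numeric Concentration column
df.groupby(["Site_ID","Site_Name","ItemName","Month"])["Concentration"].mean()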
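If the raw table has one row per measurement with a date column instead of a ready-made Month column, the month can be derived first. The sketch below assumes a hypothetical `MonitorDate` column and made-up readings; the real column names in the OP's table may differ, and the wide layout at the end is only a guess at what the desired result looks like.

```python
import pandas as pd

# Hypothetical daily readings; column names and values are invented for illustration.
raw = pd.DataFrame({
    "SiteName": ["TP", "TP", "TP", "TP", "KH", "KH"],
    "ItemName": ["CO", "CO", "CO", "SO2", "CO", "CO"],
    "MonitorDate": ["2021-01-05", "2021-01-20", "2021-02-03",
                    "2021-01-05", "2021-01-07", "2021-01-09"],
    "Concentration": [1.0, 3.0, 5.0, 8.0, 4.0, 6.0],
})

# Derive a year-month period from the date so rows can be grouped by month.
raw["Month"] = pd.to_datetime(raw["MonitorDate"]).dt.to_period("M")

# One row per (station, pollutant, month) holding the mean concentration.
monthly = (
    raw.groupby(["SiteName", "ItemName", "Month"], as_index=False)["Concentration"]
    .mean()
)

# Optional: one column per pollutant, if a wide layout is easier to read.
wide = monthly.pivot_table(
    index=["SiteName", "Month"], columns="ItemName", values="Concentration"
)
print(monthly)
print(wide)
```

Using a year-month period rather than a bare month number keeps January 2021 and January 2022 apart when the data spans more than one year.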