iT邦幫忙

11th iThome Ironman Contest

DAY 24
AI & Data

Hands on Data Cleaning and Scraping series, part 24

Day24 Airbnb in Berlin 5/5: the ring zone summary


Today we will use the visualizations from the past few days to select the listings that fit my needs.

# Import the packages we need
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly as py

import warnings
warnings.filterwarnings("ignore")  # suppress warning messages

Read in the file to analyze.

drop = pd.read_csv('drop.csv')  # read in the file we saved yesterday
print('There are', drop.id.nunique(), 'listings in the drop data.')
drop.info()  # show column types and non-null counts
drop.head(3)  # show the first three rows

https://ithelp.ithome.com.tw/upload/images/20190925/20119709P9TyRQ11yn.jpg
https://ithelp.ithome.com.tw/upload/images/20190925/20119709Rl99cyeVEL.jpg

I want a listing with exactly one bed, and it has to be a real bed.

drop_beds = drop.loc[(drop.beds<2)&(drop.beds>0)]
drop_beds.head(3)

https://ithelp.ithome.com.tw/upload/images/20190925/20119709dzkMPIr3Cr.jpg

real_bed = drop_beds["bed_type"] == 'Real Bed'
one_real_bed = drop_beds[real_bed]
one_real_bed.head(3)

https://ithelp.ithome.com.tw/upload/images/20190925/20119709C4nrpjaWnQ.jpg
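The two steps above can also be written as a single boolean mask. This is only a sketch on a made-up toy table (the `toy` DataFrame and its values are assumptions, not the real data), and it assumes `beds` only holds whole numbers, so that `beds == 1` selects the same rows as `beds < 2` and `beds > 0` together.

```python
import pandas as pd

# Toy stand-in for the `drop` DataFrame; the values are invented for illustration.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "beds": [1.0, 2.0, 1.0, 0.0],
    "bed_type": ["Real Bed", "Real Bed", "Futon", "Real Bed"],
})

# Exactly one bed AND a real bed, combined into one mask with &.
mask = (toy["beds"] == 1) & (toy["bed_type"] == "Real Bed")
one_real_bed_toy = toy.loc[mask]

print(one_real_bed_toy["id"].tolist())
```

Each condition needs its own parentheses, because `&` binds more tightly than `==` in Python.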

I need at least one bathroom, and I want the host to be a verified superhost.

one_bathroom_super = one_real_bed.loc[(one_real_bed.bathrooms>=1)&(one_real_bed.host_is_superhost=='t')]
one_bathroom_super.head(3)

https://ithelp.ithome.com.tw/upload/images/20190925/20119709VsW2llj3t9.jpg
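In this dataset `host_is_superhost` is stored as the strings 't' and 'f' rather than as booleans, which is why the filter compares against `'t'`. As a small sketch on invented toy data (the `toy` table is an assumption), the flag can be turned into a real boolean Series first, which makes the combined condition a little easier to read:

```python
import pandas as pd

# Toy stand-in for `one_real_bed`; the values are invented for illustration.
toy = pd.DataFrame({
    "id": [10, 11, 12, 13],
    "bathrooms": [1.0, 0.5, 2.0, 1.0],
    "host_is_superhost": ["t", "t", "f", None],
})

# .eq('t') yields True only for 't'; 'f' and missing values both become False.
is_super = toy["host_is_superhost"].eq("t")

selected = toy.loc[(toy["bathrooms"] >= 1) & is_super]
print(selected["id"].tolist())
```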

I only want houses and apartments.

house_apartment = one_bathroom_super.loc[(one_bathroom_super.property_type=='Apartment') | (one_bathroom_super.property_type=='House')]
house_apartment.head(3)

https://ithelp.ithome.com.tw/upload/images/20190925/20119709KBpdmGyadI.jpg
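When a column has to match one of several values, `Series.isin` is an alternative to chaining `==` comparisons with `|`. A minimal sketch on a made-up table (the `toy` data and the `wanted` list are assumptions for illustration):

```python
import pandas as pd

# Toy stand-in for `one_bathroom_super`; the values are invented for illustration.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "property_type": ["Apartment", "Loft", "House", "Hostel"],
})

wanted = ["Apartment", "House"]

# isin() returns True for every row whose property_type is in the list.
house_apartment_toy = toy.loc[toy["property_type"].isin(wanted)]
print(house_apartment_toy["id"].tolist())
```

Adding another property type later then only means extending the list.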

Plot the prices of the listings that meet these criteria, grouped by zipcode.

sort_price = house_apartment.groupby('zipcode')['price'].median().sort_values(ascending=False).index

plt.figure(figsize=(12,6))
plt.title('zipcode', fontsize=16)
sns.boxplot(y='price', x='zipcode', data=house_apartment, order=sort_price)
plt.xticks(rotation=45)

https://ithelp.ithome.com.tw/upload/images/20190925/20119709wpIbs6QtYK.png
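The `order` argument is what sorts the boxes from the highest median price to the lowest. To see what `sort_price` contains, here is a sketch of the same groupby on a small invented table (the zipcodes and prices are assumptions, not values from the dataset):

```python
import pandas as pd

# Toy data; zipcodes are kept as strings and the prices are invented.
toy = pd.DataFrame({
    "zipcode": ["10115", "10115", "10119", "10119", "10435"],
    "price": [40, 60, 80, 100, 30],
})

# Median price per zipcode, highest first.
median_by_zip = toy.groupby("zipcode")["price"].median().sort_values(ascending=False)

# The index of that Series is the list of zipcodes in plotting order.
order = median_by_zip.index.tolist()
print(order)
```

Passing this list as `order=` to `sns.boxplot` places the zipcode with the highest median on the left.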

Keep the listings priced between 25 and 100 and save them as a new csv file.

cleaned = house_apartment.loc[(house_apartment.price>=25) & (house_apartment.price<=100)]
cleaned = cleaned.iloc[:, 1:]  # drop the first column
cleaned = cleaned.reset_index(drop=True)  # reset_index returns a new DataFrame, so assign it back
cleaned.head(3)

https://ithelp.ithome.com.tw/upload/images/20190925/201197097j2Dk47Vuo.jpg
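Two details of this step are easy to miss: the price range includes both 25 and 100, and `reset_index` does not modify the DataFrame in place unless its result is assigned. A sketch on invented toy data (the `toy` table is an assumption), using `Series.between` as a shorter way to write the inclusive range:

```python
import pandas as pd

# Toy data with a non-default index, as left behind by earlier filtering.
toy = pd.DataFrame({"price": [20, 25, 60, 100, 150]}, index=[7, 8, 9, 10, 11])

# between() is inclusive on both ends by default, like >= 25 and <= 100.
in_range = toy.loc[toy["price"].between(25, 100)]

# Calling reset_index without assigning the result leaves in_range unchanged.
in_range.reset_index(drop=True)
print(in_range.index.tolist())

# Assigning the result gives a frame renumbered from 0.
renumbered = in_range.reset_index(drop=True)
print(renumbered.index.tolist())
```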

cleaned.to_csv('cleaned.csv')
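By default `to_csv` also writes the index as an unnamed first column, which comes back as `Unnamed: 0` when the file is read again. That is probably why an extra first column had to be dropped earlier, although this is a guess about how `drop.csv` was saved. A sketch of the difference, using an in-memory buffer and invented data instead of a real file:

```python
import io

import pandas as pd

# Toy data; the values are invented for illustration.
toy = pd.DataFrame({"id": [1, 2], "price": [40, 75]})

# Default behaviour: the index is written as an extra, unnamed column.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
cols_with = pd.read_csv(with_index).columns.tolist()

# index=False writes only the data columns.
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
cols_without = pd.read_csv(without_index).columns.tolist()

print(cols_with)
print(cols_without)
```

If the index carries no information, `cleaned.to_csv('cleaned.csv', index=False)` would avoid the extra column in the saved file.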

The code is available on GitHub.

Please let me know if there is any mistake in this article. Thanks for reading.

