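Yesterday's exercise was reading data from a CSV file, so today's exercise goes the other way and writes data into one. Let's reuse the exercise from [Day24] Web Scraping in Practice - Yahoo Movies (奇摩電影)~
Here is the code from last time: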
import requests
from bs4 import BeautifulSoup

url = "https://movies.yahoo.com.tw/movie_thisweek.html"
response = requests.get(url)
html = BeautifulSoup(response.text, "html.parser")
# Each movie on the page sits in its own <div class="release_info"> block
info_items = html.find_all("div", {"class":"release_info"})

with open("本週新片.txt", "w", encoding="utf-8") as f:
    for item in info_items:
        name = item.find("div", {"class":"release_movie_name"}).a.text.strip()
        en = item.find("div", {"class":"en"}).a.text.strip()
        # Keep only the part after the last colon, i.e. the date itself
        release_time = item.find("div", {"class":"release_movie_time"}).text.split(':')[-1].strip()
        level = item.find("div", {"class":"leveltext"}).span.text

        # Write one labelled line per field, then a separator line
        f.write("電影名稱: "+name+'\n')
        f.write("英文名稱: "+en+'\n')
        f.write("上映時間: "+release_time+'\n')
        f.write("期待度: "+level+'\n')
        f.write("-"*30+'\n')

        print("電影名稱:", name)
        print("英文名稱:", en)
        print("上映時間:", release_time)
        print("期待度:", level)
        print("-"*30)
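Last time we wrote to a .txt file; writing to a .csv file this time is very similar. Since this is about CSV, start by importing the csv module. When opening the file, don't forget to change the file name, especially the extension! Then create a writer object with csv.writer() (named write here) and use its writerow() method to write to the file: first write the column headers, then use a for loop to write each movie's data, one row per movie.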
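import csv

# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("本週新片.csv", "w", encoding="utf-8", newline="") as f:
    write = csv.writer(f)
    write.writerow(["電影名稱", "英文名稱", "上映時間", "期待度"])  # header row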
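One reason to go through csv.writer rather than joining the values with commas by hand is that it quotes a field when the value itself contains a comma. Below is a minimal sketch of that behaviour; it writes to an in-memory io.StringIO buffer instead of a real file, and the helper name rows_to_csv_text and the sample movie row are made up for illustration, not taken from the Yahoo page.

```python
import csv
import io


def rows_to_csv_text(header, rows):
    """Write a header and data rows with csv.writer and return the CSV text."""
    # newline="" keeps the line endings exactly as csv.writer emits them
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical sample row; note the comma inside the English title
sample_rows = [["範例電影", "Sample Movie, Part 2", "2019-10-09", "85%"]]
text = rows_to_csv_text(["電影名稱", "英文名稱", "上映時間", "期待度"], sample_rows)
print(text)
```

The English title comes out wrapped in double quotes, so a CSV reader still sees four columns instead of five.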
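The body of the loop looks like this. Unlike the .txt version, which needed a separate write call for every field, the CSV version puts all the fields into one list and writes them with a single line. Short and sweet.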
    # This loop sits inside the with block shown above
    for item in info_items:
        name = item.find("div", {"class":"release_movie_name"}).a.text.strip()
        en = item.find("div", {"class":"en"}).a.text.strip()
        release_time = item.find("div", {"class":"release_movie_time"}).text.split(':')[-1].strip()
        level = item.find("div", {"class":"leveltext"}).span.text

        write.writerow([name, en, release_time, level])  # the only line that changes

        print("電影名稱:", name)
        print("英文名稱:", en)
        print("上映時間:", release_time)
        print("期待度:", level)
        print("-"*30)
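If the rows are collected into a list first, the writer can also write them all in one call with writerows(). This is only a sketch of that alternative, not what the exercise itself does; the two movie rows are invented placeholders standing in for the scraped values, and the output goes to an in-memory buffer so nothing is written to disk.

```python
import csv
import io

header = ["電影名稱", "英文名稱", "上映時間", "期待度"]
# Made-up rows standing in for the values scraped in the loop
movies = [
    ["範例電影一", "Sample Movie One", "2019-10-09", "85%"],
    ["範例電影二", "Sample Movie Two", "2019-10-10", "72%"],
]

buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(header)   # a single row: the column titles
writer.writerows(movies)  # every remaining row in one call

lines = buffer.getvalue().splitlines()
print(len(lines))
```

Writing row by row inside the loop, as the exercise does, works just as well; writerows() is mainly convenient when the data is already in a list.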
And that's it, done. Super duper easy! Here is the complete program:
import requests
from bs4 import BeautifulSoup
import csv

url = "https://movies.yahoo.com.tw/movie_thisweek.html"
response = requests.get(url)
html = BeautifulSoup(response.text, "html.parser")
info_items = html.find_all("div", {"class":"release_info"})

# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("本週新片.csv", "w", encoding="utf-8", newline="") as f:
    write = csv.writer(f)
    write.writerow(["電影名稱", "英文名稱", "上映時間", "期待度"])  # header row

    for item in info_items:
        name = item.find("div", {"class":"release_movie_name"}).a.text.strip()
        en = item.find("div", {"class":"en"}).a.text.strip()
        release_time = item.find("div", {"class":"release_movie_time"}).text.split(':')[-1].strip()
        level = item.find("div", {"class":"leveltext"}).span.text

        write.writerow([name, en, release_time, level])  # one CSV row per movie

        print("電影名稱:", name)
        print("英文名稱:", en)
        print("上映時間:", release_time)
        print("期待度:", level)
        print("-"*30)
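To check the result, the file can be read back with csv.DictReader, which ties in with yesterday's reading exercise. The sketch below assumes a made-up sample row and a temporary file rather than the real 本週新片.csv, and the helper name write_and_read_back is hypothetical. It also uses the "utf-8-sig" encoding, which is an assumption on my part: it prepends a byte order mark, which tends to help spreadsheet programs such as Excel detect UTF-8 and display Chinese text correctly.

```python
import csv
import os
import tempfile


def write_and_read_back(path, header, rows):
    """Write rows to a CSV file, then read them back as dictionaries."""
    # "utf-8-sig" adds a BOM on write; newline="" is recommended for csv files
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    # Reading with "utf-8-sig" strips the BOM again, so the first key is clean
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        return list(csv.DictReader(f))


header = ["電影名稱", "英文名稱", "上映時間", "期待度"]
rows = [["範例電影", "Sample Movie", "2019-10-09", "85%"]]  # invented sample data

with tempfile.TemporaryDirectory() as tmp:
    records = write_and_read_back(os.path.join(tmp, "sample.csv"), header, rows)

print(records[0]["電影名稱"], records[0]["期待度"])
```

Each record is keyed by the header row, so a column can be looked up by its title instead of by position.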