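I'd appreciate some help from the experts here.
Question 1: How can I use Python to write the scraped text into Excel? Thanks.
Question 2: How can I use Python to automatically click "See more" on Facebook posts so the full text is expanded? Thanks.
My code is below: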
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import time

# Block notification pop-ups. Options only take effect when passed to webdriver.Chrome().
options = Options()
options.add_argument("--disable-notifications")
prefs = {'profile.default_content_setting_values': {'notifications': 2}}
options.add_experimental_option('prefs', prefs)
chrome = webdriver.Chrome(options=options)

# Log in to Facebook
chrome.get("https://www.facebook.com/")
# find_element_by_id was removed in Selenium 4; use find_element(By.ID, ...) instead
email = chrome.find_element(By.ID, "email")
password = chrome.find_element(By.ID, "pass")
email.send_keys('xxxxx')
password.send_keys('xxxxx')
password.submit()
time.sleep(3)

# Open the search results page
chrome.get('https://www.facebook.com/search/top?q=%E7%AB%B9%E8%BC%AA%E9%9B%BB%E5%8B%95%E8%BB%8A')
# Scroll to the bottom three times so that more posts load
for x in range(1, 4):
    chrome.execute_script("window.scrollTo(0,document.body.scrollHeight)")
    time.sleep(5)

# Parse the loaded page and print the text of each post
soup = BeautifulSoup(chrome.page_source, 'html.parser')
print("------------------------separator------------------------------")
titles = soup.find_all(
    "span", class_="d2edcug0 hpfvmrgz qv66sw1b c1et5uql lr9zc1uh a8c37x1j fe6kdd0r mau55g9w c8b282yb keod5gw0 nxhoafnm aigsh9s9 d3f4x2em iv3no6db jq4qci2q a3bd9o3v b1v8xokw oo9gr5id hzawbc8m")
for title in titles:
    posts = title.find_all("div", dir="auto")
    if len(posts):
        for post in posts:
            print(post.text)
    print("------------------------separator------------------------------")
# Create the output folder
import os
import requests
if not os.path.exists("images"):
    os.mkdir("images")

# Download the images
images = soup.find_all(
    "img", class_=["i09qtzwb n7fi1qx3 datstx6m pmk7jnqg j9ispegn kr520xx4 k4urcfbm bixrwtb6", "i09qtzwb n7fi1qx3 datstx6m pmk7jnqg j9ispegn kr520xx4 k4urcfbm"])
if len(images) != 0:
    for index, image in enumerate(images):
        img = requests.get(image["src"])
        with open(f"images/img{index+1}.jpg", "wb") as file:
            file.write(img.content)
        print(f"Image {index+1} downloaded!")

# Wait 5 seconds
time.sleep(5)
# Close the browser
chrome.quit()
Hope this helps!
Q1
Selenium Python - Openpyxl and read data from excel
https://www.selenium-tutorial.com/blog/184444/selenium-python-tutorial-excel-reading-pyxl
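For Q1, here is a minimal sketch of writing scraped text to an .xlsx file. It assumes openpyxl is installed (pip install openpyxl) and that the post texts have been collected into a list instead of only being printed. The function name save_posts_to_excel, the file name posts.xlsx and the column headers are placeholders I made up for illustration, not something from your code.

```python
from openpyxl import Workbook


def save_posts_to_excel(posts, path="posts.xlsx"):
    """Write one post per row into a new workbook and return the file path."""
    wb = Workbook()
    ws = wb.active
    ws.title = "posts"
    # Header row (hypothetical column names)
    ws.append(["no", "text"])
    # One row per scraped post, numbered from 1
    for number, text in enumerate(posts, start=1):
        ws.append([number, text])
    wb.save(path)
    return path
```

In the scraping loop you would append post.text to a list (for example all_posts.append(post.text)) instead of printing it, then call save_posts_to_excel(all_posts) once after the loop.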
Q2
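Loading more comments: even after clicking "Comments", only some of the comments are shown. You have to click "More" repeatedly to keep loading data, and the problem is that we don't know in advance how many clicks are needed.
Solution: use a while loop that checks whether the page still has a clickable "More comments" option, and stop the loop once no "More comments" option remains.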
https://tlyu0419.github.io/2019/05/01/Crawl-Facebook/
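The while-loop idea above could be sketched roughly like this, and the same pattern should work for "See more" on posts. This is only a sketch under assumptions: expand_all is a hypothetical helper name, the XPath you pass in has to be worked out by inspecting the page (Facebook's markup and button text change often and depend on the interface language), and max_clicks is a safety cap I added so the loop cannot run forever if the button never disappears.

```python
import time

# Same value as selenium.webdriver.common.by.By.XPATH
XPATH = "xpath"


def expand_all(driver, button_xpath, pause=1.0, max_clicks=50):
    """Click the first matching button until none is left; return the click count."""
    clicks = 0
    while clicks < max_clicks:
        buttons = driver.find_elements(XPATH, button_xpath)
        if not buttons:
            # No "more" button left on the page, so we are done
            break
        try:
            # A JavaScript click also works when the button is covered by an overlay
            driver.execute_script("arguments[0].click();", buttons[0])
        except Exception:
            # The element went stale or could not be clicked; stop rather than loop
            break
        clicks += 1
        # Give the page time to load the newly expanded content
        time.sleep(pause)
    return clicks
```

With the chrome driver from the question you might call it as expand_all(chrome, "//div[@role='button'][contains(., 'See more')]") before reading chrome.page_source. That XPath is a guess for an English interface and will most likely need adjusting.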