Python: values cannot be scraped once headless mode is enabled

python

zyxa9527 2020-01-07 10:09:28 ‧ 1759 views


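I ran into a problem while scraping data from a web page.

In windowed mode the full row is returned,
but as soon as I enable headless mode some of the values are missing:
['1#镍', '107950—113550', '110750', '-2,600']

It looks as if the scraped row is being cut short.
Is there a way to fix this?

Expected (windowed mode):
['1#镍', '107950—113550', '110750', '-2,600', '元/吨', '01-06']
Site: https://www.ccmn.cn/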

Code

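from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
driver = webdriver.Chrome(options=chrome_options)

def scrp_ccmn01(driver):
    driver.get("https://www.ccmn.cn/")

    # Read the quotation table in the page body
    table = driver.find_element(By.XPATH, '//*[@id="quoPrice_box"]/table')
    # './/tr' keeps the search inside the table; '//tr' would scan the whole page
    for tr in table.find_elements(By.XPATH, './/tr'):
        ls = [td.text for td in tr.find_elements(By.XPATH, 'td')
              if len(td.text.strip()) > 0]
        # print(ls)
        if (len(ls) > 0) and ('1#镍' in ls[0]):
            return ls
    # No row starting with '1#镍' was found
    return []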
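
Selenium's `td.text` returns only the text that is actually rendered, so a cell that exists in the DOM but is not displayed comes back as an empty string and is filtered out. A minimal sketch of a fallback, assuming (not confirmed in this thread) that the missing '元/吨' and '01-06' cells are present in the page but not shown in the headless window. The helper names `cell_text` and `find_row` are hypothetical; `textContent` is the standard DOM property, read here through `get_attribute`.

```python
def cell_text(td):
    """Return a cell's text, falling back to DOM textContent when .text is empty."""
    visible = (td.text or "").strip()
    if visible:
        return visible
    # get_attribute may return None, so guard before stripping.
    return (td.get_attribute("textContent") or "").strip()


def find_row(rows, keyword):
    """Return the non-empty cell texts of the first row whose first cell contains keyword.

    `rows` is any iterable of lists of cell elements (one list per table row).
    """
    for cells in rows:
        texts = [t for t in (cell_text(td) for td in cells) if t]
        if texts and keyword in texts[0]:
            return texts
    return []
```

With Selenium, `rows` could be built as `(tr.find_elements(By.XPATH, 'td') for tr in table.find_elements(By.XPATH, './/tr'))`. The helpers only need objects with a `.text` attribute and a `get_attribute` method, so they can be checked without a browser.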

1 answer

marlin12

iT邦 Graduate, Level 5 ‧ 2020-01-07 22:23:20

Best answer

When running Chrome in headless mode, you need to set the window size explicitly.
Try adding this line:

chrome_options.add_argument('window-size=1024x768')
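
For context, a minimal sketch of how the window-size argument might sit alongside the other headless options. The function names `headless_arguments` and `make_headless_driver` are hypothetical, and the 1024 by 768 default simply mirrors the answer. The sketch uses the comma-separated `--window-size=WIDTH,HEIGHT` form listed in Chrome's command-line switches; the `1024x768` spelling above is what the answer gave and what the asker reported as working.

```python
def headless_arguments(width=1024, height=768):
    """Build Chrome command-line arguments for a headless run with a fixed window size."""
    return [
        "--headless",
        "--disable-gpu",
        # Comma-separated WIDTH,HEIGHT, as listed in Chrome's switch documentation.
        f"--window-size={width},{height}",
    ]


def make_headless_driver(width=1024, height=768):
    """Create a headless Chrome driver; Selenium is imported lazily here."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in headless_arguments(width, height):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Keeping the argument list in its own function makes it easy to try a larger size, for example `make_headless_driver(1920, 1080)`, if some columns still come back empty.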


zyxa9527 iT邦 Novice, Level 5 ‧ 2020-01-13 08:03:05

That worked, thanks!
