Hello everyone ><
I'd like to scrape the data behind the "daily stock price chart" (股價日線圖) on goodinfo.
#-------------Link-------------#
https://goodinfo.tw/StockInfo/ShowK_Chart.asp?STOCK_ID=1101&CHT_CAT2=DATE
#-------------------------------#
I found that the chart is drawn on a canvas,
and that the data comes from a JavaScript function.
I used the get method of requests
and added headers.
Code:
import requests

url = "https://goodinfo.tw/StockInfo/ShowK_ChartData.asp?STOCK_ID=1101&STOCK_NM=%E5%8F%B0%E6%B3%A5&MARKET_CAT=%E4%B8%8A%E5%B8%82&CHT_NM=%E5%80%8B%E8%82%A1K%E7%B7%9A%E5%9C%96&CHT_CAT=DATE&SESSION_VAL=050%2E5607373"

# Request headers only. The response headers (cache-control, content-encoding,
# content-type, date, server, status, vary) and the HTTP/2 pseudo-headers
# (authority, method, path, scheme) copied from DevTools are not request
# headers and have been dropped. "br" is omitted from accept-encoding because
# requests can only decode Brotli when an optional Brotli package is installed.
# The cookie and SESSION_VAL were captured from one browser session and may
# have expired.
header = {
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.70 Safari/537.36",
    "accept": "*/*",
    "accept-encoding": "gzip, deflate",
    "accept-language": "zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7",
    "cookie": "CLIENT%5FID=20191013001631468%5F36%2E231%2E68%2E131; SCREEN_SIZE=WIDTH=1440&HEIGHT=900; _ga=GA1.2.93133782.1570897053; __gads=ID=3bf659387a560e46:T=1570897053:S=ALNI_Maxxp2EzIK3i9_a-5kLoTGzp6HiJw; _gid=GA1.2.284279775.1572778526; GOOD%5FINFO%5FSTOCK%5FBROWSE%5FLIST=20%7C2886%7C2227%7C2207%7C2308%7C6220%7C2732%7C8424%7C3557%7C2514%7C6119%7C2233%7C1418%7C3579%7C8464%7C9915%7C2463%7C6579%7C1704%7C2315%7C1101; SL_GWPT_Show_Hide_tmp=1; SL_wptGlobTipTmp=1",
    "referer": "https://goodinfo.tw/StockInfo/ShowK_Chart.asp?STOCK_ID=1101&CHT_CAT2=DATE",
    "sec-fetch-mode": "no-cors",
    "sec-fetch-site": "same-origin",
}
res = requests.get(url, headers=header)
res.encoding = "UTF-8"
print(res.text)
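Output: ""
The output is completely blank!
What else do I need to add
to get the data behind this dynamic chart?
Also, when I used Selenium's chromedriver and looked the element up by id, the result was these two sentences
If the solution is to use Selenium, how should I go about it?!
Thanks in advance for any pointers!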
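A note on the header dict in the question: several of its entries are not request headers at all. As a rough sketch, the helper below (the name `request_headers_only` and the two sets are made up for illustration, not part of requests) filters a dict copied from the browser's network panel down to what a body-less GET would normally send. Whether this alone makes the endpoint return data is not something this thread establishes.

```python
# Illustrative helper; the names here are assumptions, not a requests API.
# Headers that describe a response (or a request body) and do not belong
# on a plain GET copied from the browser.
DROP_FOR_GET = {
    "cache-control", "content-encoding", "content-type",
    "date", "server", "status", "vary",
}
# HTTP/2 pseudo-headers, shown by DevTools as ":authority", ":method", ...
PSEUDO_HEADERS = {"authority", "method", "path", "scheme"}


def request_headers_only(captured):
    """Return a copy of `captured` without response-side and pseudo headers."""
    cleaned = {}
    for name, value in captured.items():
        key = name.lower().lstrip(":")
        if key in DROP_FOR_GET or key in PSEUDO_HEADERS:
            continue
        cleaned[name] = value
    return cleaned


if __name__ == "__main__":
    captured = {
        "user-agent": "Mozilla/5.0",
        "server": "Microsoft-IIS/10.0",
        "status": "200",
        "method": "GET",
        "referer": "https://goodinfo.tw/StockInfo/ShowK_Chart.asp?STOCK_ID=1101&CHT_CAT2=DATE",
    }
    print(sorted(request_headers_only(captured)))
```

The result can be passed as `headers=` to `requests.get`; the HTTP library fills in the method, path and host on its own.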
Code:
import base64
from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By


def crawler(StockID):
    options = Options()
    options.add_argument("--headless")
    url = 'https://goodinfo.tw/StockInfo/ShowK_Chart.asp?STOCK_ID={}&CHT_CAT2=DATE'.format(StockID)
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        sleep(5)  # crude wait for the canvas to finish drawing
        # find_element_by_css_selector is gone in recent Selenium 4 releases
        canvas = driver.find_element(By.CSS_SELECTOR, '#StockCanvas')
        # Drop the 22-character "data:image/png;base64," prefix
        canvas_base64 = driver.execute_script("return arguments[0].toDataURL('image/png').substring(22);", canvas)
        canvas_png = base64.b64decode(canvas_base64)
        with open(r"{}.png".format(StockID), 'wb') as f:
            f.write(canvas_png)
    finally:
        driver.quit()  # end the browser session even if something failed


crawler('1101')
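The string returned by `toDataURL('image/png')` starts with the prefix `data:image/png;base64,`, which is 22 characters long. Rather than counting characters, the payload can be split off at the first comma. A minimal sketch (the function name `data_url_to_bytes` is made up for illustration):

```python
import base64


def data_url_to_bytes(data_url):
    """Decode a base64 data URL such as the one canvas.toDataURL() returns."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    return base64.b64decode(payload)


if __name__ == "__main__":
    # Build a tiny data URL locally so the example needs no browser.
    example = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
    print(data_url_to_bytes(example))
```

With Selenium, the input would be the unmodified return value of `driver.execute_script("return arguments[0].toDataURL('image/png');", canvas)`.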
If you don't like sleeping for 5 seconds,
work out for yourself when the canvas has finished loading.
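One possible heuristic for replacing the fixed sleep, offered as a sketch rather than a known-good recipe for this site: poll the canvas and treat it as finished once two consecutive snapshots are identical. The function name and its parameters are made up for illustration; `driver` only needs an `execute_script` method, as a Selenium WebDriver has. A canvas that stays blank for longer than one polling interval before drawing starts would fool this check.

```python
import time


def wait_for_stable_canvas(driver, selector, timeout=15.0, poll=0.5):
    """Poll until canvas.toDataURL() returns the same string twice in a row.

    Returns the stable data URL, or raises TimeoutError.
    """
    script = (
        "var c = document.querySelector(arguments[0]);"
        "return c ? c.toDataURL('image/png') : null;"
    )
    deadline = time.monotonic() + timeout
    previous = None
    while time.monotonic() < deadline:
        current = driver.execute_script(script, selector)
        # Stable means: the element exists and nothing changed since last poll.
        if current is not None and current == previous:
            return current
        previous = current
        time.sleep(poll)
    raise TimeoutError("canvas {!r} did not settle within {}s".format(selector, timeout))
```

In the crawler above, `sleep(5)` could then become `wait_for_stable_canvas(driver, '#StockCanvas')`, and the returned data URL can be decoded directly.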
How do I get the price data inside it?!
If you only want the prices,
just about any library can scrape them.
import requests

url = 'https://goodinfo.tw/StockInfo/ShowK_Chart.asp?STOCK_ID=1101&CHT_CAT2=DATE'
# A browser-like user-agent is the only header sent here
headers = {
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36',
}
r = requests.get(url, headers=headers)
r.encoding = 'utf8'
print(r.text)  # raw HTML of the chart page
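The snippet above only prints raw HTML. Assuming the prices sit in ordinary `<table>` rows (an assumption; the page's actual markup is not shown in this thread), a standard-library sketch for pulling the cell text out could look like this. The class and function names are made up, and the sample HTML is a placeholder, not real data from the site.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    # Placeholder markup and values, for illustration only.
    sample = ("<table><tr><th>Date</th><th>Close</th></tr>"
              "<tr><td>day-1</td><td>10.0</td></tr></table>")
    print(extract_rows(sample))
```

Calling `extract_rows(r.text)` would return every table row on the page; picking out the price table from those rows depends on the real layout.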
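"https://goodinfo.tw/StockInfo/ShowK_Chart.asp" only has data from the past year, and it has no data for delisted stocks.
"https://goodinfo.tw/StockInfo/ShowK_ChartData.asp" (which the chart uses, through a function, to query the database) has the complete price history, including stocks that have already been delisted.
That's why I want to scrape the data from the chart ><
For example: 榮化 (1704)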
https://goodinfo.tw/StockInfo/ShowK_ChartData.asp?STOCK_ID=1101&STOCK_NM=%E5%8F%B0%E6%B3%A5&MARKET_CAT=%E4%B8%8A%E5%B8%82&CHT_NM=%E5%80%8B%E8%82%A1K%E7%B7%9A%E5%9C%96&CHT_CAT=DATE&SESSION_VAL=050%2E5607373
I don't see any data at that link on my end; that's probably the issue.
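One way to see what the ShowK_ChartData.asp link above actually asks for is to decode its query string. Doing so shows STOCK_ID=1101 with the name 台泥, not the 1704 example. The sketch below decodes the parameters and rebuilds a URL from them; the helper names are made up, and which values a delisted stock needs, or whether SESSION_VAL has to be fresh, is not known from this thread.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE = "https://goodinfo.tw/StockInfo/ShowK_ChartData.asp"
POSTED_URL = (
    BASE + "?STOCK_ID=1101&STOCK_NM=%E5%8F%B0%E6%B3%A5"
    "&MARKET_CAT=%E4%B8%8A%E5%B8%82"
    "&CHT_NM=%E5%80%8B%E8%82%A1K%E7%B7%9A%E5%9C%96"
    "&CHT_CAT=DATE&SESSION_VAL=050%2E5607373"
)


def query_params(url):
    """Return the query string as a plain dict of decoded values."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


def chart_data_url(params):
    """Rebuild a ShowK_ChartData.asp URL from a dict of parameters."""
    return BASE + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    params = query_params(POSTED_URL)
    for key, value in params.items():
        print(key, "=", value)
    # Changing a value and re-encoding gives a URL to experiment with.
    print(chart_data_url(dict(params, STOCK_ID="1704")))
```

Note that changing only STOCK_ID, as in the last line, leaves STOCK_NM and MARKET_CAT describing 1101, so the server may or may not accept the result.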