Hi everyone,
I'm a beginner who just started learning web scraping with Python,
and I have a question about this script:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.sina.com.cn/china/')
res.encoding = 'utf-8'  # decode the response body as UTF-8
soup = BeautifulSoup(res.text, 'html.parser')
# Print the headline and link of every feed card found in the HTML
for news in soup.select('.feed-card-item'):
    if len(news.select('h2')) > 0:
        h2 = news.select('h2')[0].text
        a = news.select('a')[0]['href']
        print(h2, a)
Why does running it print only the line below, with no scraped data at all?
==================== RESTART: D:\534098\python\sina網爬蟲.py ====================
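For this URL, the data you want to scrape is already available as JSON embedded behind the page.
The `.feed-card-item` elements appear to be created by JavaScript after the page loads, and `requests` does not run JavaScript, so your selector matches nothing and the loop body never executes.
Rather than parsing the page as it looks after JavaScript has rendered it,
you should read and parse that JSON data directly.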
The parsed data looks something like this:
{
"intime": "1543566305",
"channelid": "1",
"ctime": "1543560435",
"mtime": "1543566286",
"authoruid": "0",
"level": "1",
"vid": "0",
"ipad_vid": "0",
"video_time_length": "0",
"categoryid": "1",
"mediaid": "0",
"columnid": "915",
"subjectid": "76866",
"templateid": "0",
"productid": "0",
"ext_0": "0",
"ext_1": "0",
"ext_2": "0",
"ext_3": "0",
"ext_4": "0",
"docid": "comos:hpevhcm4850489",
"url": "https://news.sina.com.cn/c/2018-11-30/doc-ihpevhcm4850489.shtml",
"urls": "[\"https:\\/\\/news.sina.com.cn\\/c\\/2018-11-30\\/doc-ihpevhcm4850489.shtml\"]",
"wapurl": "http://news.sina.cn/gn/2018-11-30/detail-ihpevhcm4850489.d.html",
"wapurls": "[\"http:\\/\\/news.sina.cn\\/gn\\/2018-11-30\\/detail-ihpevhcm4850489.d.html\"]",
"wapsummary": "",
"title": "蔡英文哭完后 台当局还要在这条路上越走越远",
"stitle": "",
"summary": "",
"intro": "原标题:蔡英文哭完后,台当局还要在这条路上越走越远—— 蔡英文哭了!29日,这条新闻几乎刷爆了网络。在此前一天举行的民进党中常会上...",
"author": "",
"commentid": "gn:comos-hpevhcm4850489:0",
"video_id": "",
"keywords": "选举,蔡英文,两岸",
"media_name": "参考消息",
"img": [],
"images": [
{
"u": "http://n.sinaimg.cn/news/transform/116/w550h366/20181130/r5aB-hpfyces8815829.jpg",
"w": 550,
"h": 366,
"t": "▲“九合一”选举大败之后,蔡英文宣布辞去党主席职务。"
},
{
"u": "http://n.sinaimg.cn/translate/56/w1060h596/20181130/wPCJ-hpevhcm4850378.jpg",
"w": 1060,
"h": 596,
"t": "▲陈明祺"
},
{
"u": "http://n.sinaimg.cn/translate/661/w859h602/20181130/qaVk-hpevhcm4850410.jpg",
"w": 859,
"h": 602,
"t": "▲马晓光"
}
],
"lids": "1356,1655,1741,1908,2509,2510,2670,2968,2970,2974",
"oid": "166962642",
"mlids": "",
"ext": "0",
"comment_reply": 10,
"comment_show": 2,
"comment_total": 29,
"important": "{\"container_id\":\"66284\",\"pos\":\"\\u8981\\u95fb\",\"widget_name\":\"www\",\"action\":\"up\",\"wxb\":true,\"time\":\"2018-11-30 16:23:03\",\"operator\":\"zhangshen5@staff.sina.com.cn\",\"yw_rank\":\"1\"}"
}
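
Once you have one of these records as a Python dict, getting the headline and link is plain dictionary access. Here is a minimal sketch under a few assumptions: the endpoint that serves this JSON is not named in this thread, so the example starts from a trimmed copy of the record above instead of a live request; `summarize` and `decode_nested` are hypothetical helper names; and `ctime` is assumed to be a Unix timestamp in seconds, which I have not confirmed against the site. Note that `urls` is JSON text stored inside the JSON, so it needs a second `json.loads`.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the record shown above; only the fields used here are kept.
SAMPLE_RECORD = {
    "ctime": "1543560435",
    "url": "https://news.sina.com.cn/c/2018-11-30/doc-ihpevhcm4850489.shtml",
    "urls": "[\"https:\\/\\/news.sina.com.cn\\/c\\/2018-11-30\\/doc-ihpevhcm4850489.shtml\"]",
    "title": "蔡英文哭完后 台当局还要在这条路上越走越远",
}


def summarize(record):
    """Return (title, url, created_at) for one feed record (hypothetical helper)."""
    # Assumption: ctime is a Unix timestamp in seconds, stored as a string.
    created_at = datetime.fromtimestamp(int(record["ctime"]), tz=timezone.utc)
    return record["title"], record["url"], created_at


def decode_nested(record, key):
    """Decode a field whose value is itself JSON text, such as "urls"."""
    return json.loads(record[key])


title, url, created_at = summarize(SAMPLE_RECORD)
print(title, url, created_at.isoformat())
print(decode_nested(SAMPLE_RECORD, "urls"))
```

In a real scraper the records would come from `res.json()` (or `json.loads(res.text)`) on whatever request the page makes for its feed; the Network tab of your browser's developer tools is the usual place to find that request.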