When I try to validate whether a proxy IP is usable, the HTTP-scheme IPs let the request go through, but the response shows it was never actually proxied; the HTTPS-scheme IPs fail to connect at all. What is the right way to verify that an IP can really be used as a proxy?
import requests
from time import sleep
from fake_useragent import UserAgent

httpbin = 'https://httpbin.org/ip'
headers = {"User-Agent": UserAgent().random}
if meta:
    proxies = {meta['scheme']: meta['proxy']}
    print(proxies)
    try:
        resource = requests.get(httpbin, headers=headers,
                                proxies=proxies, timeout=3)
        sleep(1)
        # only read the response if the request succeeded,
        # otherwise `resource` is undefined here
        print(resource.text)
    except requests.exceptions.Timeout:
        # requests raises requests.exceptions.Timeout,
        # not the built-in TimeoutError
        print('except')
    except requests.ConnectionError as e:
        print("Failed to open url")
        print(e)
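The unproxied result in the http case below follows from how requests chooses a proxy: it matches the *target URL's* scheme against the keys of the proxies dict, so a dict holding only an 'http' key is ignored for an https:// target and the request goes out directly. This can be demonstrated with requests' own internal helper (a small sketch using `requests.utils.select_proxy`):

```python
from requests.utils import select_proxy

proxies = {'http': 'http://222.95.240.191:3000'}

# An https:// target finds no matching key, so no proxy is used
# and httpbin reports your real address.
print(select_proxy('https://httpbin.org/ip', proxies))  # None

# The same dict does apply to a plain http:// target.
print(select_proxy('http://httpbin.org/ip', proxies))
```

So when validating an HTTP-only proxy against an https:// test URL, the "success" is really the request bypassing the proxy entirely.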
http (connects, but the result is not actually proxied):
{'http': 'http://222.95.240.191:3000'}
{
"origin": "117.19.193.171"
}
https (cannot connect):
{'https': 'https://183.143.49.23:8118'}
Failed to open url
HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ConnectTimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x000001F6ED7F1AC8>, 'Connection to 183.143.49.23 timed out. (connect timeout=3)'))
{'https': 'https://139.196.13.63:3128'}
Failed to open url
HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001F6ED809860>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it.')))
On the other hand, when the same IPs are used with scrapy.Request, most of the HTTP ones validate successfully and can crawl the site without problems. I'd like to understand why!
(My guess is that most of the HTTPS IPs genuinely cannot act as proxies, while the HTTP IPs only fail when validated through requests.)
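Given the scheme-matching behaviour, one way to validate a proxy is to set it for *both* schemes so the request cannot silently bypass it, then compare the origin httpbin reports against the proxy's own address. A sketch (the function name `check_proxy` and the choice of https://httpbin.org/ip as the test URL are my own, not from the original code):

```python
import requests

def check_proxy(proxy_url, timeout=3):
    """Return True only if proxy_url actually relays an HTTPS request.

    Both keys point at the same proxy, and the origin httpbin reports
    must be the proxy's host, not your real IP.
    """
    proxies = {'http': proxy_url, 'https': proxy_url}
    try:
        r = requests.get('https://httpbin.org/ip',
                         proxies=proxies, timeout=timeout)
        origin = r.json()['origin']
    except requests.RequestException:
        # covers Timeout, ProxyError and ConnectionError alike
        return False
    # httpbin may report "client, proxy"; take the first address
    return origin.split(',')[0].strip() in proxy_url
```

Running the same check against http://httpbin.org/ip instead would tell you whether an HTTP-only proxy at least relays plain HTTP, which may be what scrapy ends up exercising when the crawled site is http://.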