Python proxy IP validation problem: scrapy.request vs requests

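When I check whether an IP is usable, IPs listed with the http protocol run without error, but the request is not actually proxied.
IPs listed with the https protocol fail to connect at all.
What method should I use to verify whether an IP can really serve as a proxy IP?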

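import requests
from time import sleep
from fake_useragent import UserAgent  # assumed source of UserAgent; not shown in the original snippet

httpbin = 'https://httpbin.org/ip'

def check_proxy(meta):
    # Wrapper added so the fragment runs on its own; the name is hypothetical.
    # meta is expected to look like {'scheme': 'http', 'proxy': 'http://222.95.240.191:3000'}
    headers = {"User-Agent": UserAgent().random}
    if meta:
        # The proxy is registered only under its own scheme
        proxies = {meta['scheme']: meta['proxy']}
        print(proxies)
        try:
            resource = requests.get(httpbin, headers=headers, proxies=proxies, timeout=3)
            sleep(1)
        except requests.ConnectionError as e:
            # Also covers connect timeouts, which subclass ConnectionError
            print("  Failed to open url")
            print(e)
        except requests.Timeout:
            # Read timeouts; the built-in TimeoutError never matched these
            print('except')
        else:
            # Print the body only when the request actually succeeded
            print(resource.text)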

http (the connection succeeds, but the result shows the request was not proxied):

{'http': 'http://222.95.240.191:3000'}
{
  "origin": "117.19.193.171"
}
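
A likely explanation for the unproxied result above: requests chooses the proxy entry by the scheme of the target URL, not by the scheme of the proxy. The test URL is `https://httpbin.org/ip`, so a dictionary that only has an `http` key gives it nothing to use and the request goes out directly. The sketch below imitates that lookup in simplified form (requests also accepts more specific keys such as `scheme://host`); `build_proxies` and `proxy_for` are hypothetical helper names, not part of requests.

```python
from urllib.parse import urlparse


def build_proxies(proxy_address):
    """Register one plain-HTTP proxy for both http and https targets."""
    proxy_url = "http://" + proxy_address
    return {"http": proxy_url, "https": proxy_url}


def proxy_for(url, proxies):
    """Simplified lookup: pick the proxy by the scheme of the target URL."""
    return proxies.get(urlparse(url).scheme)


# The dictionary from the question has no entry for an https target.
original = {"http": "http://222.95.240.191:3000"}
print(proxy_for("https://httpbin.org/ip", original))  # None

# With both keys present, the same proxy is selected for either scheme.
both = build_proxies("222.95.240.191:3000")
print(proxy_for("https://httpbin.org/ip", both))  # http://222.95.240.191:3000
```

Under that assumption, either testing against `http://httpbin.org/ip` or passing a dictionary with both keys should make the http proxies actually carry the request.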

https (cannot connect):

{'https': 'https://183.143.49.23:8118'}
  Failed to open url
HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ConnectTimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x000001F6ED7F1AC8>, 'Connection to 183.143.49.23 timed out. (connect timeout=3)'))
{'https': 'https://139.196.13.63:3128'}
  Failed to open url
HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001F6ED809860>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it.')))
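
Both failures above happen before any HTTP is exchanged: the first is a connect timeout to the proxy and the second is a refused connection, which suggests those two proxies were simply unreachable at the time. A cheap TCP pre-check can filter such addresses out before running the slower HTTP test. This is a minimal sketch using only the standard library; `proxy_port_open` is a hypothetical helper name.

```python
import socket


def proxy_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError subclasses on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `proxy_port_open("139.196.13.63", 3128)` would return `False` for a proxy that refuses connections as in the log above. An open port does not prove the address works as a proxy, so this only narrows the candidate list. Separately, a proxy URL beginning with `https://` asks the client to speak TLS to the proxy itself, which many public proxies may not support; writing the proxy as `http://ip:port` under the `https` key is the more common setup.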

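On the other hand, with scrapy.request most of the http IPs validate successfully and the site can be crawled without trouble. Why is that?
(My guess is that most https IPs cannot be used as proxies, yet the http IPs fail validation with requests.)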
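
One way to make the validation itself stricter is to compare the address the test endpoint sees with and without the proxy, instead of only checking that the request returned. The sketch below assumes the endpoint answers with JSON containing an `origin` field, as `https://httpbin.org/ip` does in the output above; `seen_addresses`, `is_proxied`, and `validate_proxy` are hypothetical names. As far as I understand, Scrapy applies `request.meta['proxy']` to a request regardless of the target URL's scheme, which may be why the same http IPs pass there.

```python
def seen_addresses(origin):
    """Split an 'origin' value, which may list several comma-separated IPs."""
    return [part.strip() for part in origin.split(",") if part.strip()]


def is_proxied(direct_ip, proxied_origin):
    """A proxy hides us only if our own IP is absent from what the server saw."""
    return direct_ip not in seen_addresses(proxied_origin)


def validate_proxy(proxy_address, test_url="https://httpbin.org/ip", timeout=5):
    """Return True if requests through proxy_address hide the caller's own IP."""
    import requests  # third-party; imported here so the helpers above need nothing extra

    proxy_url = "http://" + proxy_address
    proxies = {"http": proxy_url, "https": proxy_url}
    try:
        direct_ip = requests.get(test_url, timeout=timeout).json()["origin"]
        proxied = requests.get(test_url, proxies=proxies, timeout=timeout).json()["origin"]
    except (requests.RequestException, ValueError, KeyError):
        # Unreachable proxy, bad response body, or missing 'origin' field
        return False
    return is_proxied(direct_ip, proxied)
```

With this check, the http result shown earlier would count as a failure, because the origin returned was the client's own address rather than the proxy's.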

1 Answer

williamvhale
iT邦 Trainee, Level 0 ‧ 2020-02-24 18:11:04
[**This message has been removed by the site**]
