iT邦幫忙

11th iThome Ironman Contest

DAY 23
Software Development

Python self-study series, part 23

python day23(urllib)

httpbin

httpbin is an HTTP request and response service that you can use to test HTTP requests and responses.

urllib

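urllib is the HTTP client package in the Python 3 standard library. It covers sending requests and reading responses, handling cookies, changing request headers and the user agent, following redirects, and authentication.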

>>> from urllib import request, parse

parse.urlencode converts the contents of a dictionary into the query string for a request.

>>> url = 'http://httpbin.org/get'
>>> parms = {'k1':'v1' , 'k2':'v2'}
>>> req_string = parse.urlencode(parms)
>>> print(req_string)
k1=v1&k2=v2
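
As a small sketch of what urlencode does beyond plain keys and values, the example below encodes a value with a space and a list value, then decodes the result again with parse.parse_qs. The parameter names and values here are made up for illustration and are not from the session above.

```python
from urllib import parse

# Hypothetical parameters chosen for illustration.
parms = {'q': 'python urllib', 'lang': 'zh-TW', 'tag': ['a', 'b']}

# doseq=True expands a list value into repeated keys (tag=a&tag=b).
req_string = parse.urlencode(parms, doseq=True)
print(req_string)

# parse_qs reverses the encoding; every value comes back as a list.
decoded = parse.parse_qs(req_string)
print(decoded)
```

Note that the space is encoded as a plus sign, which is the form-encoding convention urlencode uses by default.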

Then use request.urlopen to send the request to httpbin. It returns an HTTPResponse object.

>>> res_obj = request.urlopen(url + '?' + req_string)
>>> type(res_obj)
<class 'http.client.HTTPResponse'>

Check the returned status code:

>>> res_obj.status
200

Check the URL that was actually called:

>>> res_obj.geturl()
'http://httpbin.org/get?k1=v1&k2=v2'
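
Joining the URL and the query string with '?' works when the base URL has no query of its own. One possible way to handle both cases is sketched below. build_get_url is a hypothetical helper name, not part of urllib, and the k0=v0 parameter is made up for the second call.

```python
from urllib import parse

def build_get_url(base_url, parms):
    """Return base_url with parms merged into its query string."""
    parts = parse.urlsplit(base_url)
    # Keep any query parameters already present in base_url.
    query = parse.parse_qsl(parts.query, keep_blank_values=True)
    query.extend(parms.items())
    return parse.urlunsplit(parts._replace(query=parse.urlencode(query)))

print(build_get_url('http://httpbin.org/get', {'k1': 'v1', 'k2': 'v2'}))
print(build_get_url('http://httpbin.org/get?k0=v0', {'k1': 'v1'}))
```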

Check the Content-Type header of the response:

>>> res_obj.getheader(name="Content-Type")
'application/json'

Use read to get the response body:

>>> res_string = res_obj.read()
>>> print(res_string)
b'{\n  "args": {\n    "k1": "v1", \n    "k2": "v2"\n  }, \n  "headers": {\n    "Accept-Encoding": "identity", \n    "Host": "httpbin.org", \n    "User-Agent": "Python-urllib/3.7"\n  }, \n  "origin": "106.104.121.159, 106.104.121.159", \n  "url": "https://httpbin.org/get?k1=v1&k2=v2"\n}\n'
>>> type(res_string)
<class 'bytes'>
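
Since read returns bytes and the Content-Type is application/json, a natural next step is to decode the body and parse it. The sketch below uses a shortened copy of the bytes shown above, so it runs without a network connection, and it assumes the body is UTF-8 encoded.

```python
import json

# Shortened copy of the bytes returned by read() above.
res_string = b'{\n  "args": {\n    "k1": "v1", \n    "k2": "v2"\n  }, \n  "url": "https://httpbin.org/get?k1=v1&k2=v2"\n}\n'

# Assumption: the body is UTF-8, the usual encoding for JSON.
res_text = res_string.decode('utf-8')
res_data = json.loads(res_text)

print(type(res_data))
print(res_data['args'])
print(res_data['args']['k1'])
```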

To send a POST request, call httpbin's /post endpoint, which accepts POST requests and echoes back what it received.

>>> url = 'http://httpbin.org/post'

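As before, encode the parameters with parse.urlencode, then convert the resulting string to bytes with its encode method. This time, also create a Request object and pass the data to it. When a Request carries data, urllib sends it as a POST.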

>>> parms = {'k1':'v1' , 'k2':'v2'}
>>> post_data = parse.urlencode(parms)
>>> print(post_data)
k1=v1&k2=v2
>>> type(post_data)
<class 'str'>
>>> post_data = post_data.encode('ascii')
>>> type(post_data)
<class 'bytes'>
>>> print(post_data)
b'k1=v1&k2=v2'
>>> req_obj = request.Request(url, post_data)
>>> type(req_obj)
<class 'urllib.request.Request'>
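
To see how the data argument changes the request, the Request object can be inspected before anything is sent. This is only a sketch, and nothing in it touches the network. The get_req and post_req names are introduced here for illustration.

```python
from urllib import parse, request

url = 'http://httpbin.org/post'
post_data = parse.urlencode({'k1': 'v1', 'k2': 'v2'}).encode('ascii')

get_req = request.Request(url)              # no data
post_req = request.Request(url, post_data)  # data given

# get_method() reports the HTTP method urllib would use.
print(get_req.get_method())
print(post_req.get_method())
print(post_req.full_url)
print(post_req.data)
```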

Then send the request with urlopen, get the response, and read the returned content:

>>> with request.urlopen(req_obj) as res_obj:
...  print(res_obj.read())
...
b'{\n  "args": {}, \n  "data": "", \n  "files": {}, \n  "form": {\n    "k1": "v1", \n    "k2": "v2"\n  }, \n  "headers": {\n    "Accept-Encoding": "identity", \n    "Content-Length": "11", \n    "Content-Type": "application/x-www-form-urlencoded", \n    "Host": "httpbin.org", \n    "User-Agent": "Python-urllib/3.7"\n  }, \n  "json": null, \n  "origin": "106.104.121.159, 106.104.121.159", \n  "url": "https://httpbin.org/post"\n}\n'

HTTPBasicAuthHandler

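When a page requires a username and password, HTTPBasicAuthHandler can authenticate the request. Again using httpbin for testing, call the basic-auth endpoint with the username daniel and the password 123456.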

>>> url = 'http://httpbin.org/basic-auth/daniel/123456'

First create a password manager, then add the URL and the credentials to it.

>>> request_pwd = request.HTTPPasswordMgrWithDefaultRealm()
>>> request_pwd.add_password(None, url, 'daniel', '123456')
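
To check what the manager has stored, find_user_password can be called directly. The sketch below repeats the setup so it runs on its own. The '/sub' path, the 'any realm' string, and the example.com URL are made up for illustration.

```python
from urllib import request

url = 'http://httpbin.org/basic-auth/daniel/123456'
request_pwd = request.HTTPPasswordMgrWithDefaultRealm()
request_pwd.add_password(None, url, 'daniel', '123456')

# Credentials are found for the registered URL and for paths below it.
# Because the realm was registered as None, any realm name matches.
print(request_pwd.find_user_password(None, url))
print(request_pwd.find_user_password('any realm', url + '/sub'))

# A different host has no match.
print(request_pwd.find_user_password(None, 'http://example.com/'))
```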

Next, use the manager to create an HTTPBasicAuthHandler object, then use request.build_opener to create an OpenerDirector, which can then send requests.

>>> handler = request.HTTPBasicAuthHandler(request_pwd)
>>> handler_open = request.build_opener(handler)
>>> type(handler_open)
<class 'urllib.request.OpenerDirector'>
>>> res_obj = handler_open.open(url)
>>> print(res_obj.read())
b'{\n  "authenticated": true, \n  "user": "daniel"\n}\n'
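
Behind the scenes, the handler answers the server's 401 challenge by retrying with an Authorization header. The sketch below builds the same kind of header value by hand to show what Basic authentication puts on the wire. It is an illustration of the scheme, not a copy of urllib's internal code.

```python
import base64

user, password = 'daniel', '123456'

# Basic auth sends "user:password" encoded with base64 (encoded, not encrypted).
token = base64.b64encode(f'{user}:{password}'.encode('utf-8')).decode('ascii')
auth_header = 'Basic ' + token
print(auth_header)

# Decoding gives the credentials straight back.
print(base64.b64decode(token).decode('utf-8'))
```

Because the token is trivially reversible, Basic authentication is only reasonable over HTTPS.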

After passing handler_open to request.install_opener, every call to request.urlopen uses the installed handler_open.

>>> request.install_opener(handler_open)
>>> url = 'http://httpbin.org/basic-auth/daniel/123456'
>>> req_obj = request.Request(url)
>>> with request.urlopen(req_obj) as res_obj:
...  print(res_obj.read())
...
b'{\n  "authenticated": true, \n  "user": "daniel"\n}\n'

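To confirm that the installed opener is what makes this work, I started a fresh interpreter session. Calling the URL directly there does return UNAUTHORIZED.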

>>> url = 'http://httpbin.org/basic-auth/daniel/123456'
>>> req_obj = request.Request(url)
>>> with request.urlopen(req_obj) as res_obj:
...  print(res_obj.read())
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 401: UNAUTHORIZED
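
Instead of letting the traceback end the session, the HTTPError can be caught and inspected. The sketch below is one way to do that. fetch_status and FakeUnauthorizedOpener are hypothetical names, and the fake opener stands in for httpbin so the example runs without a network connection.

```python
import io
from email.message import Message
from urllib import error, request

def fetch_status(url, opener=None):
    """Return (status, body); an HTTPError is turned into its code and body."""
    open_func = opener.open if opener is not None else request.urlopen
    try:
        with open_func(url) as res_obj:
            return res_obj.status, res_obj.read()
    except error.HTTPError as err:
        # HTTPError also acts as a response: it has code, headers, and read().
        return err.code, err.read()

class FakeUnauthorizedOpener:
    """Hypothetical offline stand-in that always answers 401."""
    def open(self, url):
        raise error.HTTPError(url, 401, 'UNAUTHORIZED', Message(), io.BytesIO(b''))

result = fetch_status('http://httpbin.org/basic-auth/daniel/123456',
                      FakeUnauthorizedOpener())
print(result)
```

With a real connection, passing the opener built with HTTPBasicAuthHandler (or no opener at all) in place of the fake one would exercise the same code path.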
