iT邦幫忙

Logging in to a website with a Python scraper to fetch data

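I have recently been using Python to simulate logging in to my school library's website. Sometimes I get the site's data and sometimes I don't, and I can't tell why. My code already fetches both the cookie and the authorization token, and the headers follow the Request Headers the browser sends.
Part of the code: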

res2 = rs.get('https://uco-mcu.primo.exlibrisgroup.com/primaws/rest/priv/myaccount/loans?bulk=10&lang=zh-tw&offset=1&type=active', headers=headers, verify=False,timeout=10)
data = res2.text
a = json.loads(data)
print(a)
loan = len(a['data']['loans']['loan'])
if(loan == 0):
    print("你沒有借書")  # "You have no books on loan"
else:
    print(loan)
    for i in range(loan):
        print("書名:"+a['data']['loans']['loan'][i]['title'])  # "Title:"
        print("到期日:"+a['data']['loans']['loan'][i]['duedate'])  # "Due date:"

Error:

{'beaconO22': '0', 'status': 'failed', 'reply-code': '0002', 'reply-text': 'The patron ID is invalid', 'data': None}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-00176e7ea317> in <module>
     58 a = json.loads(data)
     59 print(a)
---> 60 loan = len(a['data']['loans']['loan'])
     61 if(loan == 0):
     62     print("你沒有借書")

TypeError: 'NoneType' object is not subscriptable
dragonH (iT邦超人, level 5) ‧ 2020-07-18 16:10:39
I can't test this, so I can't help.

1 answer

dragonH
iT邦超人, level 5 ‧ 2020-07-18 17:19:18
Best answer

code

import requests
import json

loginUrl = 'https://uco-mcu.primo.exlibrisgroup.com/primaws/suprimaLogin'
queryUrl = 'https://uco-mcu.primo.exlibrisgroup.com/primaws/rest/priv/myaccount/loans?bulk=10&lang=zh-tw&offset=1&type=active'

def readUserDataJson():
    # Load the login form fields from userdata.json.
    with open('userdata.json', 'r', encoding='utf-8') as userdata:
        return json.load(userdata)

def getLoginData(userData):
    # Log in, then keep the JWT from the response body and the X-Persist cookie.
    response = requests.post(loginUrl, data=userData)
    return {
        "token": response.json()['jwtData'],
        "X-Persist": response.cookies['X-Persist']
    }

def queryLoanRecords(loginData):
    # The query needs both the bearer token and the X-Persist cookie.
    headers = {
        "Authorization": f"Bearer {loginData['token']}",
        "cookie": f"X-Persist={loginData['X-Persist']}"
    }
    return requests.get(queryUrl, headers=headers)

userData = readUserDataJson()
loginData = getLoginData(userData)
response = queryLoanRecords(loginData)
print(response.text)

result

{
    "beaconO22": "564",
    "status": "ok",
    "reply-code": "0000",
    "reply-text": "ok",
    "data": {
        "loans": {
            "showmore": [
                "N"
            ],
            "loan": [
                {
                    "year": "2018",
                    "ilsinstitutionname": "U12 Network",
                    "ilsinstitutioncode": "886UCO_NETWORK",
                    "duehour": "2359",
                    "mainlocationname": "臺北館",
                    "mainlocationcode": "SL",
                    "secondarylocationname": "1F中文書庫",
                    "secondarylocationcode": "SL1FC",
                    "itemcategoryname": "圖書",
                    "itemcategorycode": "B",
                    "itemstatusname": "外借",
                    "duedate": "20200925",
                    "title": "Firebase開發實務 /",
                    "mmsid": "99321750105911",
                    "callnumber": "312.952 8355 2018",
                    "itemid": "2384714920005914",
                    "loanstatus": "正常",
                    "loandate": "20200715",
                    "itembarcode": "C0754581",
                    "author": "葉海亞維",
                    "renew": "Y",
                    "alerts": []
                },
           ...

Besides the token, the request also has to carry a cookie named X-Persist; only then does it succeed.
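The TypeError in the question comes from indexing `data` when the reply has `status` set to `failed` and `data` set to `None`. Below is a minimal sketch of checking the status first, so a rejected request shows the server's own message instead of a traceback. It assumes every reply carries the `status`, `reply-code`, `reply-text` and `data` keys seen in the two replies in this thread; `extract_loans` is a hypothetical helper name, not part of the code above.

```python
import json

def extract_loans(payload):
    """Return the list of loans, or raise if the server reported a failure."""
    # The failed reply in the question has status "failed" and data None,
    # so check both before indexing into data.
    if payload.get("status") != "ok" or payload.get("data") is None:
        raise RuntimeError(
            f"request failed: {payload.get('reply-code')} {payload.get('reply-text')}"
        )
    return payload["data"]["loans"]["loan"]

# The failed reply copied from the question.
failed = json.loads(
    '{"beaconO22": "0", "status": "failed", "reply-code": "0002", '
    '"reply-text": "The patron ID is invalid", "data": null}'
)

try:
    extract_loans(failed)
except RuntimeError as err:
    print(err)  # request failed: 0002 The patron ID is invalid
```

In the script above this would be called as `extract_loans(response.json())`.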

Here is userdata.json:

{
    "authenticationProfile": "mcu_ldaps",
    "username": "your username",
    "password": "your password",
    "institution": "886UCO_MCU",
    "view": "886MCU_INST"
}
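Once the reply is successful, the title and due date the question wanted to print can be read from each entry in `loan`. This is a rough sketch, assuming `duedate` is always in the `YYYYMMDD` form shown in the result ("20200925"); `summarize_loans` is a hypothetical helper, and `sample` is a trimmed copy of that result rather than a live reply.

```python
from datetime import datetime

def summarize_loans(payload):
    """Return (title, due date) pairs from a successful loans reply."""
    loans = payload["data"]["loans"]["loan"]
    summary = []
    for loan in loans:
        # duedate appears as YYYYMMDD in the result, e.g. "20200925".
        due = datetime.strptime(loan["duedate"], "%Y%m%d").date()
        summary.append((loan["title"], due))
    return summary

# A trimmed copy of the reply shown under "result".
sample = {
    "status": "ok",
    "data": {"loans": {"loan": [
        {"title": "Firebase開發實務 /", "duedate": "20200925"},
    ]}},
}

for title, due in summarize_loans(sample):
    print(f"{title} due {due.isoformat()}")
```

Parsing the date instead of printing the raw string makes it possible to sort the loans or compare them against today's date later.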
