iT邦幫忙

0

眉眉角角Day29 抓取網路上的資料 被拒絕存取

  • 分享至 

  • xImage

Hi, Andy,
I just learned Excel VBA 的眉眉角角Day29:如何抓取網路上的資料?以issuu.com為例 Call GetISSUU("http://issuu.com/grupobcm/docs/woman30". This URL WebSite WinHttpReq.Open "GET", ISSUU_URL, False
WinHttpReq.send
被拒絕存取 ? 請問 是否可以 堤供 其他類似的 URL of WebSite 可以Learn step by step?
B. Rgds.,
Robert Chen

blanksoul12 iT邦研究生 5 級 ‧ 2022-12-06 17:44:30 檢舉
看多一個教程 https://club.excelhome.net/thread-1159783-1-1.html
圖片
  直播研討會
圖片
{{ item.channelVendor }} {{ item.webinarstarted }} |
{{ formatDate(item.duration) }}
直播中

1 個回答

0
blanksoul12
iT邦研究生 5 級 ‧ 2022-12-07 17:40:26

你想拿什麼圖,給個流程才可.

看更多先前的回應...收起先前的回應...

Hi, Andy,

我找到了其他類似的 URL of WebSite "https://www.pinterest.com/oxxostudio/outdoor/" 允許存取 請問可以教我 如何抓取 網頁上的圖片 Learn step by step 如同 Excel VBA 的眉眉角角Day29?

希望 學習

  1. 連到首頁,如何檢視, 解析網頁 第一頁內容的JPG, 單獨取出路徑後開啟,確認檔案位置正確

  2. 套用例 Call GetISSUU("https://www.pinterest.com/oxxostudio/outdoor/")
    WinHttpReq.Open "GET", ISSUU_URL, False
    WinHttpReq.send

Sub doGetISSUU()
Call GetISSUU("https://www.pinterest.com/oxxostudio/outdoor/")
MsgBox "Done!"
End Sub

  1. 如何 修改 子程式,用來讀取網頁內容,並取得相關變數,然後呼叫下載的副程式繼續處理下子事宜:

Function GetISSUU(ISSUU_URL As String)
Dim myURL As String
Dim WinHttpReq As Object
Dim strBody As String
Dim strDocumentId As String, iPageCount As Integer, strTitle As String
Dim strPath As String
Dim strFileName As String
strPath = "D:\temp\ISSUU"
Dim iDocumentId_Start As Integer, iDocumentId_End As Integer
Dim iPageCount_Start As Integer, iPageCount_End As Integer
Dim iTitle_Start As Integer, iTitle_End As Integer

'使用Microsoft.XMLHTTP物件,傳送網址(ISSUU_URL)給對方,然後取回(GET)回傳資料
Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
'WinHttpReq.Open "GET", myURL, False, "username", "password"
WinHttpReq.Open "GET", ISSUU_URL, False
WinHttpReq.send
    '將回傳資料放到strBody變數
strBody = WinHttpReq.responseText
    '確認回傳的狀態是否正常,200代表正常
If WinHttpReq.Status = 200 Then
    'Debug.Print strBody
            '用InStr取出所需變數
    iDocumentId_Start = InStr(strBody, """documentId"":") + 14
    iDocumentId_End = InStr(Mid(strBody, iDocumentId_Start, Len(strBody)), ",") + iDocumentId_Start
    iPageCount_Start = InStr(strBody, """pageCount"":") + 12
    iPageCount_End = InStr(Mid(strBody, iPageCount_Start, Len(strBody)), ",") + iPageCount_Start
    
    iTitle_Start = InStr(strBody, """title"":") + 9
    iTitle_End = InStr(Mid(strBody, iTitle_Start, Len(strBody)), ",") + iTitle_Start
    strDocumentId = Mid(strBody, iDocumentId_Start, iDocumentId_End - iDocumentId_Start - 2)
    iPageCount = Mid(strBody, iPageCount_Start, iPageCount_End - iPageCount_Start - 1)
    strTitle = Mid(strBody, iTitle_Start, iTitle_End - iTitle_Start - 2)
        'Debug.Print strTitle
End If

'檔名以目前時間取名
strFileName = Format(Now, "yyyymmdd-hhmmss")
     Debug.Print strFileName & ": " & strTitle
    '呼叫下載副程式
Call Download_ISSUU_File(strDocumentId, iPageCount, strPath, strFileName)
    '執行PDFCreator內建的Images2PDF子程式,將JPEG檔包成PDF檔
Call RunCmd("""" & Environ("ProgramFiles") & "\PDFCreator\Images2PDF\Images2PDFC.exe" & """" & " /i """ & strPath & "\" & strFileName & "\*.jpg"" /e """      & strPath & "\" & strFileName & ".pdf""", True, 1)

End Function

'' 下載副程式,用來批次下載每頁圖片檔案:
Function Download_ISSUU_File(strDocumentId As String, iPageCount As Integer, strSavePath As String, strFileName As String)
'strDocumentId ISSUU的書本ID
'iPageCount 頁數
'strSavePath 存放路徑
'strFileName 檔案名稱

Dim myURL As String
Dim WinHttpReq As Object
strFileName = Replace(strFileName, "/", "")

MkDir (strSavePath & "\" & strFileName)

'下載每頁圖片檔
For i = 1 To iPageCount
    myURL = "http://image.issuu.com/" & strDocumentId & "/jpg/page_" & Format(i, "0") & ".jpg"

    Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
    'WinHttpReq.Open "GET", myURL, False, "username", "password"
    WinHttpReq.Open "GET", myURL, False
    WinHttpReq.send

    	If WinHttpReq.Status = 200 Then
    		Set oStream = CreateObject("ADODB.Stream")
	        oStream.Open
	        oStream.Type = 1
	        oStream.Write WinHttpReq.responseBody
                   oStream.SaveToFile strSavePath & "\" & strFileName & "\" & strFileName & "_" & Format(i, "000") & ".jpg", 2     ' 1 = 不複寫, 2 = 複寫
	        oStream.Close
	End If
Next

End Function

'' 下載下來的每張圖片檔

B. Rgds.,
Robert Chen

blanksoul12 iT邦研究生 5 級 ‧ 2022-12-09 09:45:25 檢舉

https://www.pinterest.com/oxxostudio/outdoor/
這個比較覆雜,一不可用IE,二是動態運算加圖

Thanks a lot for your kind reply,

我看到網頁上, 有下列URL, 試過幾種方法抓圖: 有部分成功, 有不成功,程式如下:

請問此URL,抓圖 : 程式如何寫才能成功抓圖? (不好意思, 又麻煩您了)

picture = "https://i.pinimg.com/236x/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.jpg 1x, https://i.pinimg.com/474x/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.jpg 2x, https://i.pinimg.com/736x/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.jpg 3x, https://i.pinimg.com/originals/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.png 4x"
' Fail  'ActiveSheet.Pictures.Insert (picture)

Sub CommandBunClick()
Dim picture As String
picture = "https://i.pinimg.com/236x/39/eb/1d/39eb1dc594a91c4c1280dcf9d1b1237f.jpg"
ActiveSheet.Pictures.Insert (picture) 'successed
picture = "https://i.pinimg.com/236x/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.jpg"
ActiveSheet.Pictures.Insert (picture) 'successed

' picture = "https://i.pinimg.com/236x/39/eb/1d/39eb1dc594a91c4c1280dcf9d1b1237f.jpg 1x"
' Fail 'ActiveSheet.Pictures.Insert (picture)
picture = "https://i.pinimg.com/236x/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.jpg 1x, https://i.pinimg.com/474x/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.jpg 2x, https://i.pinimg.com/736x/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.jpg 3x, https://i.pinimg.com/originals/e7/26/d4/e726d4056bd1b77d66399a94c701d5a2.png 4x"
' Fail  'ActiveSheet.Pictures.Insert (picture)

picture = "https://i.pinimg.com/236x/39/eb/1d/39eb1dc594a91c4c1280dcf9d1b1237f.jpg 2x"
' Fail  'ActiveSheet.Pictures.Insert (picture)

picture = "https://i.pinimg.com/236x/39/eb/1d/39eb1dc594a91c4c1280dcf9d1b1237f.jpg 3x"
' Fail  'ctiveSheet.Pictures.Insert (picture)

End Sub

blanksoul12 iT邦研究生 5 級 ‧ 2022-12-12 15:05:31 檢舉

我上面有給出教程.

我要發表回答

立即登入回答