iT邦幫忙

0

PHP cURL page fetching works only intermittently


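Fetching used to work without any problem,
but a few days ago it became unreliable.
Sometimes it runs normally, and sometimes it runs for a long time without returning anything.
Visiting the same page in a browser always works.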

One of the affected URLs: http://report.penghu.gov.tw/OpenFront/report/report_detail.jsp?sysId=C105AQ022

What could be causing this?
Thanks.

catding iT邦新手 level 5 ‧ 2016-08-16 02:46:13
Output of curl_getinfo():

Array ( [url] => http://report.penghu.gov.tw/OpenFront/report/report_detail.jsp?sysId=C105AQ022 [content_type] => [http_code] => 0 [header_size] => 0 [request_size] => 237 [filetime] => -1 [ssl_verify_result] => 0 [redirect_count] => 0 [total_time] => 30.002726 [namelookup_time] => 0.004213 [connect_time] => 0.014619 [pretransfer_time] => 0.014656 [size_upload] => 0 [size_download] => 0 [speed_download] => 0 [speed_upload] => 0 [download_content_length] => -1 [upload_content_length] => 0 [starttransfer_time] => 0 [redirect_time] => 0 [redirect_url] => [primary_ip] => 163.29.98.42 [certinfo] => Array ( ) [primary_port] => 80 [local_ip] => 103.17.8.52 [local_port] => 46143 )
catding iT邦新手 level 5 ‧ 2016-08-16 02:46:48
Output of curl_getinfo():

Array
(
[url] => http://report.penghu.gov.tw/OpenFront/report/report_detail.jsp?sysId=C105AQ022
[content_type] =>
[http_code] => 0
[header_size] => 0
[request_size] => 237
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 30.002726
[namelookup_time] => 0.004213
[connect_time] => 0.014619
[pretransfer_time] => 0.014656
[size_upload] => 0
[size_download] => 0
[speed_download] => 0
[speed_upload] => 0
[download_content_length] => -1
[upload_content_length] => 0
[starttransfer_time] => 0
[redirect_time] => 0
[redirect_url] =>
[primary_ip] => 163.29.98.42
[certinfo] => Array
(
)

[primary_port] => 80
[local_ip] => 103.17.8.52
[local_port] => 46143
)
fillano iT邦超人 level 1 ‧ 2016-08-16 09:15:43
...you need to call curl_getinfo() after curl_exec(), right? XD
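catding iT邦新手 level 5 ‧ 2016-08-16 12:35:35
This was run in curl_exec().....
catding iT邦新手 level 5 ‧ 2016-08-16 12:50:42
Correction:
This was run after curl_exec().....

What I mainly want to know is why I get
[http_code] => 0
Could something be going wrong when the request is made?
Right now only about 50% of the requests succeed.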
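
The dump above can be read field by field. connect_time is non-zero, so the TCP connection was established. request_size is 237, so the request was sent. starttransfer_time and http_code are both 0, so no response byte came back before the transfer ended at about 30 seconds, which looks like a timeout (an assumption, since the timeout setting is not shown in the thread). A minimal sketch of that reading follows; diagnose_curl_info() is a hypothetical helper name, not part of cURL.

```php
<?php
// Hypothetical helper (not from the thread, not a cURL function):
// classify a curl_getinfo() array by how far the request got.
function diagnose_curl_info(array $info): string
{
    $httpCode = (int) ($info['http_code'] ?? 0);
    if ($httpCode !== 0) {
        // A status line was parsed, so the server did answer.
        return 'got an HTTP response (status ' . $httpCode . ')';
    }
    if (($info['connect_time'] ?? 0) <= 0) {
        // No connect time recorded: the request never reached the server.
        return 'never connected (DNS or TCP connect failed)';
    }
    if (($info['starttransfer_time'] ?? 0) <= 0) {
        // Connected and sent the request, but the first byte never arrived.
        return 'connected, but no response byte arrived before the transfer ended';
    }
    return 'some bytes arrived, but no HTTP status was parsed';
}

// The relevant fields from the failing request posted above.
$failed = [
    'http_code'          => 0,
    'connect_time'       => 0.014619,
    'starttransfer_time' => 0,
    'total_time'         => 30.002726,
];
echo diagnose_curl_info($failed), "\n";
```

If that reading is right, the server (or something in front of it) accepted the connection but did not reply to this particular request. The helper only describes what the numbers show; it does not explain why the server stayed silent.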

2 answers

3
fillano
iT邦超人 level 1 ‧ 2016-08-16 09:18:37

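I tested it and didn't see any problem... Also, curl_getinfo() has to be called after curl_exec(); until the request has actually been made, there is no information to retrieve.

A simple test: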

<?php
$url = 'http://report.penghu.gov.tw/OpenFront/report/report_detail.jsp?sysId=C105AQ022';

$h = curl_init($url);
curl_setopt($h, CURLOPT_RETURNTRANSFER, true);
$c = curl_exec($h);
$i = curl_getinfo($h);
curl_close($h);
echo nl2br(print_r($i, true));

The result looks normal:

Array
(
[url] => http://report.penghu.gov.tw/OpenFront/report/report_detail.jsp?sysId=C105AQ022
[content_type] => text/html;charset=UTF-8
[http_code] => 200
[header_size] => 248
[request_size] => 109
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.246052
[namelookup_time] => 0.004252
[connect_time] => 0.095965
[pretransfer_time] => 0.096012
[size_upload] => 0
[size_download] => 12032
[speed_download] => 48900
[speed_upload] => 0
[download_content_length] => -1
[upload_content_length] => -1
[starttransfer_time] => 0.184224
[redirect_time] => 0
[redirect_url] => 
[primary_ip] => 163.29.98.42
[certinfo] => Array
(
)

[primary_port] => 80
[local_ip] => 172.20.10.2
[local_port] => 50278
)
0
koei5113
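iT邦新手 level 5 ‧ 2019-02-21 11:13:18

curl_getinfo() only returns meaningful response data when the target URL was actually reached successfully,
so the right approach is to also check whether curl_exec() failed and record the error: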

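$c = curl_exec($h);
$i = curl_getinfo($h);
if ($c === false) {
    // curl_error() needs the handle as its argument.
    // You could store the message in a separate variable; I'm just being lazy here.
    $i['curl_error'] = curl_error($h);
}

var_dump($i);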
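
Building on the snippet above, here is a hedged sketch of one way to make an intermittent fetch easier to live with: set explicit timeouts, record curl_errno() and curl_error() for each attempt, and retry a few times. fetch_once() and fetch_with_retry() are hypothetical names, and the timeout, delay and attempt counts are placeholder assumptions, not values from the thread.

```php
<?php
// Sketch only: both function names are hypothetical, and the numbers
// below are placeholders rather than recommendations.

function fetch_once(string $url, int $timeoutSeconds = 10): array
{
    $h = curl_init($url);
    curl_setopt($h, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($h, CURLOPT_CONNECTTIMEOUT, 5);        // cap on the connect phase
    curl_setopt($h, CURLOPT_TIMEOUT, $timeoutSeconds); // cap on the whole transfer
    $body = curl_exec($h);
    // Collect everything before the handle goes out of scope.
    return [
        'body'  => $body,            // false when the transfer failed
        'errno' => curl_errno($h),   // 0 means no cURL-level error
        'error' => curl_error($h),   // human-readable message, '' if none
        'info'  => curl_getinfo($h),
    ];
}

function fetch_with_retry(
    string $url,
    int $maxAttempts = 3,
    int $delaySeconds = 1,
    ?callable $fetch = null
): array {
    // The fetcher can be swapped out, for example with a stub in tests.
    $fetch = $fetch ?? 'fetch_once';
    $result = [];
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $fetch($url);
        $result['attempts'] = $attempt;
        if ($result['body'] !== false && $result['errno'] === 0) {
            break; // success, stop retrying
        }
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds); // brief pause before the next try
        }
    }
    return $result; // the last attempt, successful or not
}
```

Calling fetch_with_retry($url) and dumping the returned 'errno', 'error' and 'info' entries shows why the final attempt failed. Retrying only hides an intermittent failure; the recorded error is what helps track down the actual cause.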
