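Hi everyone,
For work I'm currently looking into how to use Python to fetch the XML published by the Directorate General of Highways (交通部公路總局).
URL: https://thbapp.thb.gov.tw/opendata/vd/one/VDLiveList.xml
I'm using the ElementTree and urllib modules to read and parse it. Since this XML uses namespaces, I pass a namespace map to the lookups.
My code is as follows: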
import xml.etree.ElementTree as ET
from urllib.request import urlopen
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
ns = {'xmlns': 'http://traffic.transportdata.tw/standard/traffic/schema/'}
response = urlopen('https://thbapp.thb.gov.tw/opendata/vd/one/VDLiveList.xml')
s = response.read().decode('utf-8')
root = ET.fromstring(s)
for vd in root.findall('.//xmlns:VDLives', ns):
    vdid = vd.find('*/xmlns:VDID[1]', ns)
    print(vdid)
Output: <Element '{http://traffic.transportdata.tw/standard/traffic/schema/}VDID' at 0x101b4e070>
How can I get a specific VDID together with the data nested beneath it?
For example: VD-11-0020-002-01, along with its LaneType and speed.
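The output above is the repr of the Element object rather than its content; the string inside an element is available through its `.text` attribute. Here is a minimal sketch of the difference. It uses a small inline sample instead of the live feed, and the sample's structure is an assumption trimmed down from what the code above implies, not a copy of the real document.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample; the real feed carries many more fields.
SAMPLE = """<VDLiveList xmlns="http://traffic.transportdata.tw/standard/traffic/schema/">
  <VDLives>
    <VDLive>
      <VDID>VD-11-0020-002-01</VDID>
    </VDLive>
  </VDLives>
</VDLiveList>"""

ns = {'xmlns': 'http://traffic.transportdata.tw/standard/traffic/schema/'}
root = ET.fromstring(SAMPLE)

for vd in root.findall('.//xmlns:VDLives', ns):
    vdid = vd.find('*/xmlns:VDID[1]', ns)
    print(vdid)       # the Element object itself (tag plus memory address)
    print(vdid.text)  # the string stored inside the element
```

Note that `find` returns `None` when nothing matches, so on real data it is worth checking the result before reading `.text`.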
#!/usr/bin/env python3
import xml.etree.ElementTree as ET
from urllib.request import urlopen  # Python 3 location of urlopen
import ssl

# Skip certificate verification, as in the question's code
ssl._create_default_https_context = ssl._create_unverified_context

response = urlopen('https://thbapp.thb.gov.tw/opendata/vd/one/VDLiveList.xml')
s = response.read().decode('utf-8')
tree = ET.fromstring(s)

# Walk every VDLive record, then each Lane and each Vehicle beneath it
for vdlive in tree.findall('*/{http://traffic.transportdata.tw/standard/traffic/schema/}VDLive'):
    vdid = vdlive.find('{http://traffic.transportdata.tw/standard/traffic/schema/}VDID').text
    data_collect_time = vdlive.find('{http://traffic.transportdata.tw/standard/traffic/schema/}DataCollectTime').text
    for lane in vdlive.findall('*//{http://traffic.transportdata.tw/standard/traffic/schema/}Lane'):
        lane_id = lane.find('{http://traffic.transportdata.tw/standard/traffic/schema/}LaneID').text
        for vehicle in lane.findall('*//{http://traffic.transportdata.tw/standard/traffic/schema/}Vehicle'):
            veh_vol = int(vehicle.find('{http://traffic.transportdata.tw/standard/traffic/schema/}Volume').text)
            if veh_vol <= 0:  # Invalid or uninteresting value
                continue
            veh_type = vehicle.find('{http://traffic.transportdata.tw/standard/traffic/schema/}VehicleType').text
            print((vdid, lane_id, data_collect_time, veh_type, veh_vol))
result:
I'd suggest doing it this way! The official site provides developer documentation:
https://github.com/trafficmotc/Sample-code
Please take another careful look.
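For the specific case asked about (one VDID plus the LaneType and speed of its lanes), here is a minimal sketch of one way to filter. It assumes that `LaneType` and `Speed` are direct children of each `Lane` element; those two names come from the question and are not confirmed against the real feed. The inline sample, the second VDID, the `t` prefix and the `lanes_for` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix -> URI map so the long namespace is written only once.
NS = {'t': 'http://traffic.transportdata.tw/standard/traffic/schema/'}

# Hypothetical sample data; the element layout is an assumption.
SAMPLE = """<VDLiveList xmlns="http://traffic.transportdata.tw/standard/traffic/schema/">
  <VDLives>
    <VDLive>
      <VDID>VD-11-0020-002-01</VDID>
      <LinkFlows>
        <LinkFlow>
          <Lanes>
            <Lane>
              <LaneID>0</LaneID>
              <LaneType>1</LaneType>
              <Speed>62</Speed>
            </Lane>
            <Lane>
              <LaneID>1</LaneID>
              <LaneType>1</LaneType>
              <Speed>58</Speed>
            </Lane>
          </Lanes>
        </LinkFlow>
      </LinkFlows>
    </VDLive>
    <VDLive>
      <VDID>VD-11-0020-005-01</VDID>
    </VDLive>
  </VDLives>
</VDLiveList>"""


def lanes_for(root, wanted_vdid):
    """Return (lane_id, lane_type, speed) string tuples for one detector."""
    rows = []
    for vdlive in root.iterfind('.//t:VDLive', NS):
        # findtext returns the child's text, or None if the child is missing
        if vdlive.findtext('t:VDID', namespaces=NS) != wanted_vdid:
            continue
        for lane in vdlive.iterfind('.//t:Lane', NS):
            rows.append((
                lane.findtext('t:LaneID', namespaces=NS),
                lane.findtext('t:LaneType', namespaces=NS),
                lane.findtext('t:Speed', namespaces=NS),
            ))
    return rows


root = ET.fromstring(SAMPLE)
for row in lanes_for(root, 'VD-11-0020-002-01'):
    print(row)
```

To run this against the live feed, `root` would be built from the downloaded text instead of `SAMPLE`. All values come back as strings, so convert with `int()` or `float()` where numbers are needed.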