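This implementation uses data from https://meet.eslite.com/. The site provides no robots.txt to follow, so if this example is inappropriate, please let me know.
The example fetches the Eslite movie timetable and organizes the data into the following format: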
[
  {
    name: "film title",
    timetable: [
      {
        date: "date",
        time: ["10:00", "12:00"]
      },
      {
        date: "date",
        time: ["10:00", "12:00"]
      }
    ]
  }
]
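As a small illustration of how this structure might be consumed, the sketch below looks up the screening times for one film on one date. The helper name `findShowtimes` and the sample values are made up for illustration; only the `name`, `timetable`, `date`, and `time` field names come from the format above.

```javascript
// Sample data in the target format (values are made up for illustration).
const sample = [
  {
    name: "Film A",
    timetable: [
      { date: "03/02", time: ["10:00", "12:00"] },
      { date: "03/03", time: ["14:00"] },
    ],
  },
];

// Hypothetical helper: returns the screening times for a film on a date,
// or an empty array when the film or the date is not found.
function findShowtimes(list, name, date) {
  const film = list.find((f) => f.name === name);
  if (!film) return [];
  const day = film.timetable.find((t) => t.date === date);
  return day ? day.time : [];
}

console.log(findShowtimes(sample, "Film A", "03/02")); // [ '10:00', '12:00' ]
```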
The target page https://meet.eslite.com/tw/tc/gallery/movieschedule/201803020001 is server-side rendered, so the data can be fetched with Axios and parsed with cheerio.
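The relevant elements can be located with these selectors:
Film block: .film_list > .box
Film title: .box .intro > .left > p
Screening date: .time-swiper > .swiper-slide > p
Screening times: .time-swiper > .swiper-slide > ul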
As shown in the figure below:
const fs = require("fs");
const axios = require("axios");
const cheerio = require("cheerio");

const url = "https://meet.eslite.com/tw/tc/gallery/movieschedule/201803020001";

(async () => {
  // Fetch the server-rendered HTML and load it into cheerio
  const res = await axios.get(url);
  const $ = cheerio.load(res.data);
  const list = [];

  // Each .box under .film_list is one film
  $(".film_list .box").each(function (i, elem) {
    const name = $(this).find(".intro .left > p").text();
    const timetable = [];

    // Each .swiper-slide holds one date and its screening times
    $(this).find(".time-swiper .swiper-slide").each(function (j, slide) {
      const date = $(slide).find("p").text();
      const time = [];
      $(slide).find("ul li").each(function (k, item) {
        time.push($(item).text());
      });
      timetable.push({ date, time });
    });

    list.push({ name, timetable });
  });

  // Save to film-timetable.json
  fs.writeFileSync("film-timetable.json", JSON.stringify(list));
})().catch((err) => {
  // Report network or file-system failures instead of failing silently
  console.error(err);
  process.exitCode = 1;
});
It is not yet clear whether https://meet.eslite.com/tw/tc/gallery/movieschedule/201803020001 is a permanent link. If it is not, Puppeteer can be used instead to simulate browser navigation and obtain the data.
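A rough sketch of that fallback is shown below. It rests on an unverified assumption: that some start page on the site exposes the schedule as a plain link whose URL contains `/gallery/movieschedule/`, the pattern seen in the URL above. The helper names `pickScheduleLink` and `fetchScheduleHtml` are hypothetical.

```javascript
// Hypothetical helper: pick the first link that looks like a movie schedule page.
// The "/gallery/movieschedule/" pattern is taken from the URL used above.
function pickScheduleLink(hrefs) {
  return hrefs.find((href) => href.includes("/gallery/movieschedule/")) || null;
}

// Open the start page in a headless browser, follow the schedule link if one
// is found, and return the rendered HTML.
async function fetchScheduleHtml(startUrl) {
  // Loaded lazily so this file can be required without launching a browser.
  const puppeteer = require("puppeteer");
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(startUrl, { waitUntil: "networkidle2" });

    // Assumption: the start page exposes the schedule as a plain <a href>.
    const hrefs = await page.$$eval("a[href]", (links) => links.map((a) => a.href));
    const target = pickScheduleLink(hrefs);
    if (!target) throw new Error("No movie schedule link found on " + startUrl);

    await page.goto(target, { waitUntil: "networkidle2" });
    return await page.content();
  } finally {
    // Always release the browser, even when navigation fails
    await browser.close();
  }
}
```

The returned HTML could then be passed to `cheerio.load` in place of `res.data` in the script above. Which start page to pass in (for example the site's home page) would need to be checked against the live site.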