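Before getting started, let me respond to the five questions marlin12 raised.
These are questions I had not thought of while writing this series.
They also interest me a lot, so in future posts I will gather material and share it with everyone.
Thanks to marlin12 for the suggestions.
Now let's pick up where we left off yesterday.
The data we scraped looks like this: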
{"category":"Modern Web","title":"瓶子裡裝甚麼藥,使用Flask輕輕鬆鬆打造一個RESTful API-DAY05","author":"kirai","views":"1","herf":"https://ithelp.ithome.com.tw/articles/10200204"},
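Next we need to think about how to analyze and sort it.
A few points came to mind.
With the steps worked out, we can start implementing.
We keep working in the same file as last time,
and save the result once all the grouping is done.
First, group the data by category: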
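let categorys = [] // list of category names
let categorys_data = {} // posts grouped by category
// forEach is synchronous and returns undefined, so no await is needed
data.forEach(post => {
    let category = post.category
    // If this category has not been seen yet, register it
    // and create an empty array for it in categorys_data
    if (!categorys.includes(category)) {
        categorys.push(category)
        categorys_data[category] = []
    }
    // Add the post to its category
    categorys_data[category].push(post)
});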
Next, we sort the posts in each category by view count:
// Loop over every category
for (const category in categorys_data) {
    // Skip inherited properties
    if (categorys_data.hasOwnProperty(category)) {
        // posts holds every post in the current category
        const posts = categorys_data[category];
        // A basic bubble sort, descending by view count, so no further explanation here
        for (let i = 0; i < posts.length - 1; i++) {
            for (let j = 0; j < posts.length - i - 1; j++) {
                // views is stored as a string, so parse it as a base-10 integer
                if (parseInt(posts[j].views, 10) < parseInt(posts[j + 1].views, 10)) {
                    const tmp = posts[j]
                    posts[j] = posts[j + 1]
                    posts[j + 1] = tmp
                }
            }
        }
    }
}
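As an alternative to the hand-written bubble sort, the same descending order can be obtained with the built-in `Array.prototype.sort`. This is only a minimal sketch: the function name `sortByViewsDesc` and the sample records are hypothetical, and only the record shape (`views` stored as a string) comes from the scraped data.

```javascript
// Hypothetical sample in the same shape as the scraped records.
const sampleGrouped = {
  "Modern Web": [
    { title: "post A", author: "author1", views: "12" },
    { title: "post B", author: "author2", views: "105" },
    { title: "post C", author: "author3", views: "7" },
  ],
};

// Sort each category's posts in place, most-viewed first.
function sortByViewsDesc(grouped) {
  for (const category of Object.keys(grouped)) {
    grouped[category].sort(
      // views is a string, so compare the parsed base-10 integers
      (a, b) => parseInt(b.views, 10) - parseInt(a.views, 10)
    );
  }
  return grouped;
}

sortByViewsDesc(sampleGrouped);
// Order is now: post B, post A, post C
console.log(sampleGrouped["Modern Web"].map(post => post.title));
```

Comparing the parsed numbers matters here: sorting the raw strings would put "7" ahead of "105".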
Finally, we can present the data:
for (const category in categorys_data) {
    if (categorys_data.hasOwnProperty(category)) {
        const posts = categorys_data[category];
        // After sorting, the first post is the most-viewed one in this category
        if (posts.length > 0) {
            const top = posts[0]
            console.log("Most view post at " + top.category + ' => ' + top.author)
            console.log('Post name: ' + top.title)
            console.log('Post url: ' + top.herf + '\n')
        }
    }
}
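The output looks like this.
The self-challenge group is not included, because the URLs we crawled do not cover it.
10/06 21:30 Most-viewed post in each category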
Most view post at AI & Data => henry.w
Post name: Day 1 Puppeteer 環境建置
Post url: https://ithelp.ithome.com.tw/articles/10199544
Most view post at 影片教學 => 盧卡斯
Post name: How to Install Endless OS Using VMware ?
Post url: https://ithelp.ithome.com.tw/articles/10199498
Most view post at Software Development => eugenechen
Post name: Day 1 - 前言/開發環境準備
Post url: https://ithelp.ithome.com.tw/articles/10199491
Most view post at Modern Web => BY
Post name: Day2 : 安裝 Django 2.1
Post url: https://ithelp.ithome.com.tw/articles/10199575
Most view post at Blockchain => Jason Chen
Post name: 01. 什麼是 C4 CBP (Certified Bitcoin Professional) 認證
Post url: https://ithelp.ithome.com.tw/articles/10199455
Most view post at Everything on Azure => Gary
Post name: 公司目前正面臨著問題現況....
Post url: https://ithelp.ithome.com.tw/articles/10199487
Most view post at Agile => Ivan
Post name: 30天利用教練式引導建立指導新人的SOP - 第一天
Post url: https://ithelp.ithome.com.tw/articles/10199442
Most view post at Security => SunAllen
Post name: 心洞年代-1(已補充-序-)
Post url: https://ithelp.ithome.com.tw/articles/10199421
Most view post at Cloud Native => 安總裁
Post name: [Day 01] Cloud Native Startups:一個簡單的垃圾分類器與計算平台
Post url: https://ithelp.ithome.com.tw/articles/10199542
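Of course, you can go further and pull out more data,
depending on what the reader wants to see.
For example, the average view count:
10/06 22:00 Average views per author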
Jason Chen avg view =>667
seconddim avg view =>233
eugenechen avg view =>147
Teng Wang avg view =>106
Jian-Ching avg view =>147
考特源 avg view =>153
daniel0614 avg view =>95
dansnow avg view =>97
NiJia avg view =>123
poyu2099 avg view =>87
洋蔥 avg view =>78
badgameshowtw avg view =>79
小魚 avg view =>63
henry.w avg view =>153
Joseph-bug avg view =>109
杜岳華 avg view =>92
張小馬 avg view =>115
renton_hsu avg view =>128
盧卡斯 avg view =>165
谷哥 avg view =>147
ektrontek avg view =>117
Austin Ting avg view =>67
BY avg view =>232
James Wu avg view =>153
暐翰 avg view =>154
TYSON avg view =>118
dazedbear avg view =>109
kirai avg view =>97
CIAN avg view =>134
Paul.Ciou avg view =>141
majo2013 avg view =>97
Ting Ting avg view =>113
Sandy avg view =>153
hbdoy avg view =>102
賽門 avg view =>69
神Q超人 avg view =>45
Gary avg view =>164
Ivan avg view =>177
JavaCoffee avg view =>95
SunAllen avg view =>259
飛飛飛飛 avg view =>185
小學渣 avg view =>155
安總裁 avg view =>112
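The post does not show the code behind these averages, so here is one way it might be done. It is a sketch under assumptions: the function name `averageViewsByAuthor` and the sample records are hypothetical, and rounding to the nearest integer is a guess, since the output above only shows whole numbers.

```javascript
// Hypothetical sample records; only author and views are needed here.
const samplePosts = [
  { author: "author1", views: "100" },
  { author: "author1", views: "50" },
  { author: "author2", views: "30" },
];

// Return an object mapping each author to their average view count.
function averageViewsByAuthor(posts) {
  const totals = new Map(); // author -> { sum, count }
  for (const post of posts) {
    const entry = totals.get(post.author) || { sum: 0, count: 0 };
    entry.sum += parseInt(post.views, 10);
    entry.count += 1;
    totals.set(post.author, entry);
  }
  const averages = {};
  for (const [author, { sum, count }] of totals) {
    // Assumption: round to the nearest integer
    averages[author] = Math.round(sum / count);
  }
  return averages;
}

const sampleAverages = averageViewsByAuthor(samplePosts);
for (const author in sampleAverages) {
  console.log(author + " avg view =>" + sampleAverages[author]);
}
```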
10/06 22:00 Number of authors in each category
Blockchain authors number => 2
Software Development authors number => 11
AI & Data authors number => 5
影片教學 authors number => 4
Modern Web authors number => 14
Everything on Azure authors number => 1
Agile authors number => 2
Security authors number => 3
Cloud Native authors number => 1
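A count like the one above could be produced by collecting the distinct author names per category. The sketch below assumes that an author with several posts in a category should be counted once. `countAuthorsByCategory` and the sample records are hypothetical names used only for illustration.

```javascript
// Hypothetical sample records; author1 appears twice in the same category.
const sampleRecords = [
  { category: "Modern Web", author: "author1" },
  { category: "Modern Web", author: "author2" },
  { category: "Modern Web", author: "author1" },
  { category: "Security", author: "author3" },
];

// Return an object mapping each category to its number of distinct authors.
function countAuthorsByCategory(posts) {
  const authors = new Map(); // category -> Set of author names
  for (const post of posts) {
    if (!authors.has(post.category)) {
      authors.set(post.category, new Set());
    }
    authors.get(post.category).add(post.author);
  }
  const counts = {};
  for (const [category, names] of authors) {
    counts[category] = names.size;
  }
  return counts;
}

const sampleCounts = countAuthorsByCategory(sampleRecords);
for (const category in sampleCounts) {
  console.log(category + " authors number => " + sampleCounts[category]);
}
```

A `Set` ignores duplicates, so it removes the need for an `includes` check before every insert.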
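Of course, there are other ways to present the data.
Give it some thought.
If you have studied statistics,
you can produce even more useful figures.
A reminder: crawl all you like, but be mindful of whether the number of requests you send could cause problems for the site.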
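One simple way to keep the request rate down is to visit pages one at a time with a pause in between. This is a rough sketch, not the crawler from this series: `sleep`, `visitPolitely`, the example URLs, and the 50 ms delay are all hypothetical, and `visit` is a stand-in callback so the example runs without network access.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Call `visit(url)` for each URL in order, pausing `delayMs` between calls.
// `visit` is a hypothetical callback; in a real crawler it might wrap page.goto(url).
async function visitPolitely(urls, visit, delayMs) {
  const results = [];
  for (let i = 0; i < urls.length; i++) {
    results.push(await visit(urls[i]));
    // No need to wait after the last URL
    if (i < urls.length - 1) {
      await sleep(delayMs);
    }
  }
  return results;
}

// Usage with a stand-in visit function that just returns the URL length.
const politeDemo = visitPolitely(
  ["https://example.com/a", "https://example.com/b"],
  async url => url.length,
  50
);
politeDemo.then(lengths => console.log(lengths));
```

A suitable delay depends on the site, so treat the value here as a placeholder.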
The next post returns to the main topic:
I will introduce some of the Puppeteer functions I use most often.