iT邦幫忙

11th iThome Ironman Contest

DAY 19

The most common way to handle a time feature: splitting it into year-month-day-hour-minute-second
https://ithelp.ithome.com.tw/upload/images/20190920/20119709w5LQncP9AS.jpg
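As a rough illustration of the split described above, here is a minimal sketch using only Python's standard library. The helper name `split_datetime` and the sample timestamp are made up for this example and are not taken from the article's repository.

```python
from datetime import datetime


def split_datetime(ts: datetime) -> dict:
    """Break one timestamp into separate calendar and clock fields."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
    }


# Hypothetical sample value, parsed from a "YYYY-MM-DD HH:MM:SS" string.
sample = datetime.strptime("2019-09-20 14:35:08", "%Y-%m-%d %H:%M:%S")
features = split_datetime(sample)
print(features)
```

Each field can then be used as its own column, which is the starting point for the cycle features discussed next.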

Time Cycle Features

Sometimes the columns produced by simply splitting a time feature do not have a direct relationship with the target value. In that case, we can check whether the target follows some time cycle, that is, whether it repeats itself over time. Such cycles may appear as:

  1. Annual cycle - seasons, temperature, rainfall, etc.
  2. Monthly cycle - receiving salary, paying bills, etc.
  3. Weekly cycle - weekends and holidays, shopping habits, etc.
  4. Daily cycle - daily routines, etc.
    https://ithelp.ithome.com.tw/upload/images/20190920/20119709T6AORHbhIF.png

The values for the cycles above can be built by combining the split time columns with a sine or cosine function:

  1. Annual cycle (ex: positive-cold / negative-hot): cos((month/12 + day/360) × 2π)
  2. Weekly cycle (ex: positive-full of energy / negative-tired): sin((weekday/7 + hour/168) × 2π)
  3. Daily cycle (ex: positive-full of energy / negative-tired): sin((hour/24 + minute/1440 + second/86400) × 2π)
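The three formulas can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names are invented for this example, `weekday` is taken from `datetime.weekday()` (Monday = 0, Sunday = 6), which the article does not specify, and the 360-day divisor is kept from the article's formula as an approximation of the year length.

```python
import math
from datetime import datetime

TWO_PI = 2 * math.pi


def annual_cycle(ts: datetime) -> float:
    # cos((month/12 + day/360) * 2π); 360 is the article's approximation.
    return math.cos((ts.month / 12 + ts.day / 360) * TWO_PI)


def week_cycle(ts: datetime) -> float:
    # Assumption: weekday is Monday=0 ... Sunday=6, as datetime.weekday() returns.
    return math.sin((ts.weekday() / 7 + ts.hour / 168) * TWO_PI)


def day_cycle(ts: datetime) -> float:
    # sin((hour/24 + minute/1440 + second/86400) * 2π)
    return math.sin((ts.hour / 24 + ts.minute / 1440 + ts.second / 86400) * TWO_PI)


# Hypothetical timestamps chosen to show the sign of each feature.
midsummer = datetime(2019, 6, 1, 6, 0, 0)
year_end = datetime(2019, 12, 1, 18, 0, 0)

print(annual_cycle(midsummer), annual_cycle(year_end))  # negative vs. positive
print(day_cycle(midsummer), day_cycle(year_end))        # peak at 06:00, trough at 18:00
```

Every feature stays within [-1, 1], and values just before and just after a cycle boundary (for example 23:59 and 00:00) end up close together, which a raw hour column cannot express.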

Time Frame Features

Event counts within a time frame can also be used to estimate how likely an event is to occur, for example in website sales forecasting or cumulative click counts.
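One way to picture a time frame feature is to bucket event timestamps into fixed windows and count them. The sketch below assumes one-hour windows and a made-up list of click timestamps; the window size, the function name `count_events_per_hour`, and the data are illustrative choices, not the article's.

```python
from collections import Counter
from datetime import datetime


def count_events_per_hour(timestamps):
    """Count events in each one-hour window, keyed by the window's start time."""
    buckets = Counter()
    for ts in timestamps:
        window_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[window_start] += 1
    return buckets


# Hypothetical click timestamps.
clicks = [
    datetime(2019, 9, 20, 9, 5),
    datetime(2019, 9, 20, 9, 40),
    datetime(2019, 9, 20, 10, 15),
    datetime(2019, 9, 20, 9, 59, 59),
]

hourly = count_events_per_hour(clicks)
total = sum(hourly.values())

# Share of all events that fell in each window: a rough empirical estimate
# of how likely an event is to land in that time frame.
share = {start: n / total for start, n in hourly.items()}

for start in sorted(hourly):
    print(start, hourly[start], share[start])
```

The per-window counts or shares can then be joined back to each row as a feature.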

The code for this article is available on GitHub.

Please let me know if there are any mistakes in this article. Thanks for reading.

References:

[1] Course content from the 2nd Machine Learning 100-Day Marathon

[2] Time Series Anomaly Detection Algorithms

