
11th iThome Ironman Contest


Learning from top Kagglers how to win data analysis competitions: series, part 18

[Day 18] Metrics optimization - Classification


A loss function measures how far the prediction f(x) deviates from the true value y.

Logloss (logarithmic loss / cross-entropy loss)

.Tree-based

XGBoost, LightGBM

.Linear models

sklearn.<>Regression (for example LogisticRegression)
sklearn.SGDClassifier (with log loss)
Vowpal Wabbit (logistic loss)

.Neural nets

PyTorch, Keras, TF, etc.
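
To make the logloss definition concrete, here is a minimal pure-Python sketch of binary log loss. The function name `log_loss`, the clipping constant `eps`, and the toy labels and probabilities are illustrative assumptions, not taken from the course material.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Toy data (hypothetical): same labels, two sets of predicted probabilities.
y_true = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.9, 0.1]
confident_wrong = [0.1, 0.9, 0.1, 0.9]

loss_right = log_loss(y_true, confident_right)
loss_wrong = log_loss(y_true, confident_wrong)
print(loss_right, loss_wrong)
```

Confident but wrong probabilities are punished far more than confident and correct ones are rewarded, which is why logloss pushes a model toward well-calibrated probabilities.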

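Accuracy --> cannot be optimized directly; fit the model with any other metric (for example logloss), then tune the decision threshold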

https://ithelp.ithome.com.tw/upload/images/20190919/20108719ZCzRPDR4G8.png

https://ithelp.ithome.com.tw/upload/images/20190919/20108719kWx4t0y4Rm.png
Screenshot from Coursera
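
Threshold tuning for accuracy can be sketched as a simple grid search over candidate cut-offs. The helper names `accuracy` and `best_threshold`, the candidate grid, and the toy data are assumptions made for illustration; in practice the threshold would be chosen on held-out validation predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)

def best_threshold(y_true, y_prob, candidates):
    """Return the first candidate threshold with the highest accuracy."""
    best_t, best_acc = None, -1.0
    for t in candidates:
        preds = [1 if p >= t else 0 for p in y_prob]
        acc = accuracy(y_true, preds)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Toy data (hypothetical): the model ranks the classes perfectly,
# but all of its probabilities sit below 0.5.
y_true = [0, 0, 0, 1, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.40, 0.45]

default_acc = accuracy(y_true, [1 if p >= 0.5 else 0 for p in y_prob])
candidates = [i / 100 for i in range(1, 100)]
threshold, tuned_acc = best_threshold(y_true, y_prob, candidates)
print(default_acc, threshold, tuned_acc)
```

With the default 0.5 cut-off every sample is predicted as class 0, while a lower threshold separates the two classes, so the gain comes entirely from the threshold and not from retraining.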


AUC (ROC)

How it works
https://ithelp.ithome.com.tw/upload/images/20190919/201087193rXUbbo7vX.png
Screenshot from Coursera

.Tree-based

XGBoost, LightGBM

.Neural nets

PyTorch, Keras, TF - not out of the box
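
One way to read AUC is as the fraction of (positive, negative) pairs that the model orders correctly, which is also the view behind pairwise optimization. The sketch below computes it by brute force; `pairwise_auc` and the toy scores are illustrative assumptions, and the quadratic loop is meant for understanding rather than for large datasets.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Toy data (hypothetical).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

auc = pairwise_auc(y_true, y_score)
# Rescaling the scores keeps their order, so AUC should not change.
auc_rescaled = pairwise_auc(y_true, [10 * s for s in y_score])
print(auc, auc_rescaled)
```

Because only the ordering of the scores matters, AUC does not depend on a decision threshold or on the absolute scale of the predictions.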

Quadratic weighted Kappa

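Unlike common statistics such as accuracy or MSE/RMSE, Quadratic Weighted Kappa measures the degree of agreement between the true labels and the predictions. Disagreements are weighted by the squared distance between the two labels, so predictions that land far from the true label are penalized much more heavily than near misses.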
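
The following is a minimal sketch of the usual Quadratic Weighted Kappa formula, 1 minus the weighted observed disagreement divided by the weighted disagreement expected by chance, with weights (i - j)^2 / (N - 1)^2. The function name, the assumption that labels are integers 0..N-1, and the toy data are illustrative; the sketch does not guard against the degenerate case where the denominator is zero.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """QWK for integer labels in 0..n_classes-1."""
    n = len(y_true)
    # Observed confusion matrix: rows are true labels, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = true_hist[i] * pred_hist[j] / n
            num += weight * observed[i][j]
            den += weight * expected
    return 1.0 - num / den

# Toy data (hypothetical): both predictions get 3 of 4 labels right,
# but the single mistake is one class away in `near` and three in `far`.
y_true = [0, 1, 2, 3]
near = [0, 1, 2, 2]
far = [0, 1, 2, 0]

kappa_near = quadratic_weighted_kappa(y_true, near, 4)
kappa_far = quadratic_weighted_kappa(y_true, far, 4)
print(kappa_near, kappa_far)
```

Both predictions have the same accuracy, yet the far miss scores much lower, which is the behavior that sets this metric apart from accuracy.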

  1. MSE + Thresholds
  2. Smooth loss
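
The first option, MSE + thresholds, can be sketched as follows: treat the ordinal label as a regression target, then search for cut points that turn the continuous output into classes with the best kappa. Everything here is an illustrative assumption, including the helper names, the greedy one-threshold-at-a-time search, the step size, and the toy regressor outputs. The thresholds would normally be tuned on validation predictions.

```python
from bisect import bisect_right

def apply_thresholds(values, thresholds):
    """Map each continuous value to the number of thresholds it reaches."""
    return [bisect_right(thresholds, v) for v in values]

def qwk(y_true, y_pred, n_classes):
    """Compact Quadratic Weighted Kappa for labels in 0..n_classes-1."""
    n = len(y_true)
    obs = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        obs[t][p] += 1
    th = [sum(row) for row in obs]
    ph = [sum(obs[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = sum((i - j) ** 2 * obs[i][j]
              for i in range(n_classes) for j in range(n_classes))
    den = sum((i - j) ** 2 * th[i] * ph[j] / n
              for i in range(n_classes) for j in range(n_classes))
    return 1.0 - num / den

def tune_thresholds(y_true, values, thresholds, n_classes, step=0.05, span=10):
    """Greedy search: shift one threshold at a time, keep strict improvements."""
    best = list(thresholds)
    best_score = qwk(y_true, apply_thresholds(values, best), n_classes)
    for k in range(len(best)):
        start = best[k]
        for m in range(-span, span + 1):
            trial = list(best)
            trial[k] = start + m * step
            if trial != sorted(trial):  # thresholds must stay ordered
                continue
            score = qwk(y_true, apply_thresholds(values, trial), n_classes)
            if score > best_score:
                best, best_score = trial, score
    return best, best_score

# Toy data (hypothetical): regressor outputs are squeezed toward the mean.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
values = [0.6, 0.9, 1.2, 1.4, 1.6, 1.9, 2.1, 2.4]

rounding = [0.5, 1.5, 2.5]  # equivalent to plain rounding
default_kappa = qwk(y_true, apply_thresholds(values, rounding), 4)
tuned, tuned_kappa = tune_thresholds(y_true, values, rounding, 4)
print(default_kappa, tuned, tuned_kappa)
```

A model trained with MSE tends to avoid the extreme classes, so moving the outer thresholds inward often recovers kappa that plain rounding leaves on the table.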

Soft kappa XGBoost notebook: https://eyusuwbavdctmvzkdnmwro.coursera-apps.org/notebooks/readonly/reading_materials/Metrics_video8_soft_kappa_xgboost.ipynb


Additional resources

Classification

. Evaluation Metrics for Classification Problems: Quick Examples + References http://queirozf.com/entries/evaluation-metrics-for-classification-quick-examples-references

. Decision Trees: “Gini” vs. “Entropy” criteria https://www.garysieling.com/blog/sklearn-gini-vs-entropy-criteria

. Understanding ROC curves http://www.navan.name/roc/


Ranking

. Learning to Rank using Gradient Descent -- original paper about pairwise method for AUC optimization http://icml.cc/2015/wp-content/uploads/2015/06/icml_ranking.pdf

. Overview of further developments of RankNet https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/MSR-TR-2010-82.pdf

. RankLib (implementations of the two papers above) https://sourceforge.net/p/lemur/wiki/RankLib/

. Learning to Rank Overview https://wellecks.wordpress.com/2015/01/15/learning-to-rank-overview


Clustering

. Evaluation metrics for clustering http://nlp.uned.es/docs/amigo2007a.pdf

