[day-19] U-net Training Details (1) - Stochastic Gradient Descent (SGD) - iT 邦幫忙

12th iThome Ironman Contest

DAY 19

AI & Data

Part 19 of the series "30 Days of Learning Only U-net"

disapear1997

Team: 迷途羔羊

2020-10-04 21:52:07

Preface

I am really not familiar with stochastic gradient descent, so I will spend more words explaining it here.

Gradient Descent

The input images and their corresponding segmentation maps are used to train the network with the stochastic gradient descent implementation of Caffe [6].

The term "stochastic gradient descent" can be explained in two parts:
(1) Gradient descent

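A gradient is closely related to a derivative. A derivative is the rate of change of a function of one variable; the gradient generalizes it to many variables by collecting all the partial derivatives into one vector, which points in the direction of steepest increase. Both are defined over real-valued inputs. The picture in which the pixel is the smallest possible step belongs to image gradients, which approximate the derivative with differences between neighboring pixels; the gradient used in training is taken with respect to the network weights, which are real numbers. Gradient descent updates the weights by a small step in the direction opposite to the gradient, so that the loss decreases.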
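To make the update rule concrete, here is a minimal sketch of plain gradient descent on a one-parameter toy loss. The loss `(w - 3)^2`, the learning rate, and the function names are assumptions chosen for illustration; none of them come from the U-net paper or from Caffe.

```python
def loss(w: float) -> float:
    """Toy loss with its minimum at w = 3 (hypothetical example)."""
    return (w - 3.0) ** 2


def grad(w: float) -> float:
    """Analytic derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0: float, lr: float = 0.1, steps: int = 100) -> float:
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)  # move opposite to the gradient
    return w


w_final = gradient_descent(w0=0.0)
print(w_final, loss(w_final))
```

Each step shrinks the distance to the minimum by a constant factor here, so `w_final` ends up very close to 3. A real network has millions of weights instead of one, but the update has the same form.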

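(2) What is random?
Mistake
I thought that each update should land where gradient descent puts it, but that the sample position is then updated randomly to avoid falling into a local optimum.
Correction
After asking my teacher today, I found some articles that compare the variants, including: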
Batch, Mini Batch & Stochastic Gradient Descent
A Gentle Introduction to Mini-Batch Gradient Descent and How to Configure Batch Size

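They explain that what SGD randomizes is the sample: each update uses one sample drawn at random from the dataset.
Mini-batch GD instead draws a random mini-batch from the dataset for each update.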
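The difference between the variants can be sketched with a single training loop in which only the batch size changes. This is an illustrative sketch, not the Caffe implementation: the synthetic dataset `y = 2x`, the helper names, and the hyperparameters are all assumptions.

```python
import random


def make_data(n: int = 200, seed: int = 0):
    """Synthetic 1-D data: y = 2x plus a little noise (hypothetical dataset)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.01) for x in xs]
    return list(zip(xs, ys))


def grad_on(batch, w: float) -> float:
    """Gradient of the mean squared error of y ~ w*x over the given samples."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data, batch_size: int, lr: float = 0.5, epochs: int = 50, seed: int = 0) -> float:
    """batch_size=1 is SGD, batch_size=len(data) is batch GD, in between is mini-batch GD."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)  # the "stochastic" part: random sample order
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            w -= lr * grad_on(batch, w)  # one update per batch
    return w


data = make_data()
w_sgd = train(data, batch_size=1)           # one random sample per update
w_mini = train(data, batch_size=16)         # one random mini-batch per update
w_batch = train(data, batch_size=len(data)) # the whole dataset per update
print(w_sgd, w_mini, w_batch)
```

All three runs should end near the true slope of 2. What differs is how many samples feed each update, and therefore how noisy each step is and how many updates happen per pass over the data.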