Pointing out the problems of prior work (this is quite hard, both to notice them and to state them clearly).
Obviously, the strategy in Ciresan et al. [1] has two drawbacks.
The authors identify two drawbacks in earlier work.
First, it is quite slow because the network must be run separately for each patch, and there is a lot of redundancy due to overlapping patches.
Secondly, there is a trade-off between localization accuracy and the use of context.
A trade-off between window (patch) size and localization accuracy.
Larger patches require more max-pooling layers that reduce the localization accuracy, while small patches allow the network to see only little context.
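To make these two drawbacks concrete, here is a minimal sliding-window sketch in PyTorch. The toy network, patch size, and image size are my own assumptions, not from Ciresan et al.: one forward pass is run per pixel, neighbouring patches overlap almost entirely, and every extra max-pooling layer that buys more context also coarsens localization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy patch classifier (NOT the Ciresan et al. network): predicts the class
# of the centre pixel of a fixed-size patch.
class PatchClassifier(nn.Module):
    def __init__(self, patch_size=32, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Each MaxPool2d halves the resolution: a larger patch needs more
        # pooling, and the network becomes less precise about *where* the
        # centre pixel is -- the localization/context trade-off above.
        self.classifier = nn.Linear(16 * (patch_size // 4) ** 2, num_classes)

    def forward(self, patch):
        return self.classifier(self.features(patch).flatten(1))

# Sliding-window inference: one forward pass per pixel.
def segment_by_patches(image, model, patch_size=32):
    pad = patch_size // 2
    padded = F.pad(image, (pad, pad, pad, pad), mode="reflect")
    H, W = image.shape[-2:]
    labels = torch.zeros(H, W, dtype=torch.long)
    with torch.no_grad():
        for y in range(H):
            for x in range(W):
                patch = padded[..., y:y + patch_size, x:x + patch_size]
                # Neighbouring patches overlap almost entirely, so nearly all
                # of this computation is repeated -- the redundancy the paper
                # complains about.
                labels[y, x] = model(patch).argmax(dim=1).item()
    return labels

image = torch.randn(1, 1, 32, 32)          # toy single-channel image
print(segment_by_patches(image, PatchClassifier()).shape)  # torch.Size([32, 32])
```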
More recent approaches [11,4] proposed a classifier output that takes into account the features from multiple layers.
More recent classifiers take the outputs of multiple layers into account; before reading the references, my guess was something like DenseNet or ResNet.
Good localization and the use of context are possible at the same time.
So it may be possible to take the surrounding pixels (context) into account while still improving localization accuracy.
[4] Hypercolumns for Object Segmentation and Fine-grained Localization
This paper points out that the last layer of a typical CNN carries only semantic information; by upsampling the feature maps after each convolution stage (somewhat like a deconvolution), features at different scales can be extracted and combined.
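A minimal sketch of that hypercolumn idea, assuming a toy three-stage backbone (the layer widths and the bilinear upsampling are my choices, not taken from [4]): keep the feature map after each stage, upsample everything back to the input resolution, and concatenate so every pixel gets a descriptor mixing fine (early) and semantic (late) information.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy backbone, not the network used in [4].
class TinyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())

    def forward(self, x):
        f1 = self.stage1(x)   # full resolution, fine detail
        f2 = self.stage2(f1)  # 1/2 resolution
        f3 = self.stage3(f2)  # 1/4 resolution, more semantic
        return [f1, f2, f3]

def hypercolumns(feature_maps, out_size):
    # Bilinear upsampling plays the role of the "upsample / deconvolution-like"
    # step mentioned in the note.
    upsampled = [F.interpolate(f, size=out_size, mode="bilinear", align_corners=False)
                 for f in feature_maps]
    return torch.cat(upsampled, dim=1)   # (N, 16+32+64, H, W) per-pixel descriptors

x = torch.randn(1, 3, 64, 64)
hc = hypercolumns(TinyBackbone()(x), out_size=x.shape[-2:])
print(hc.shape)   # torch.Size([1, 112, 64, 64])
# A 1x1 conv over `hc` would then classify each pixel using all scales at once.
```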
[11] Image segmentation with cascaded hierarchical models and logistic disjunctive normal networks
The figure here is hard to parse at first. In short, the model extracts features at several scales and upsamples them; every stage produces a classification result, that result is then fed back in as a feature for the next stage's classification, with max pooling in between.
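My rough reading of that figure as code, with toy layers that are not the actual architecture of [11]: each stage outputs per-pixel class probabilities, and the next stage takes those probabilities, plus a max-pooled and re-upsampled (coarser-context) version of them, as extra input channels alongside the image. The weights below are untrained; the point is only the data flow.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# One stage of the cascade: image + previous predictions in, probabilities out.
class Stage(nn.Module):
    def __init__(self, in_ch, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, num_classes, 1),
        )

    def forward(self, x):
        return self.net(x).softmax(dim=1)   # per-pixel class probabilities

def cascaded_segmentation(image, num_classes=2, num_stages=3):
    # Start from a uniform prediction.
    prev_prob = torch.full_like(image[:, :1], 1.0 / num_classes).repeat(1, num_classes, 1, 1)
    for _ in range(num_stages):
        # Coarse context: max-pool the previous prediction, then upsample back.
        coarse = F.max_pool2d(prev_prob, 2)
        coarse = F.interpolate(coarse, size=image.shape[-2:], mode="nearest")
        stage = Stage(image.shape[1] + 2 * num_classes, num_classes)  # untrained, for illustration
        # Input = raw image + previous prediction + its coarse version.
        prev_prob = stage(torch.cat([image, prev_prob, coarse], dim=1))
    return prev_prob

image = torch.randn(1, 3, 64, 64)
print(cascaded_segmentation(image).shape)   # torch.Size([1, 2, 64, 64])
```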
DenseNet and ResNet were probably strongly inspired by these two papers ([4], [11]). [4] leans more toward the concept of pulling out intermediate features and reusing them; [11] leans more toward a practical application, but does not explain the technique very concisely. As usual, it is the simple but practical ideas that spread.
[4] Hypercolumns for Object Segmentation and Fine-grained Localization
[11] Image segmentation with cascaded hierarchical models and logistic disjunctive normal networks