“Stacked Hourglass Networks for Human Pose Estimation” paper review

Paper link (submitted in 2016). I’m only interested in the stacked hourglass architecture, not in the pose estimation performance, so the points listed below relate only to the stacked hourglass architecture. The “encoding-decoding” or “conv-deconv” structure had already been introduced. This paper goes one step further and stacks multiple “hourglass” structures. While stacking, Read more…
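To make the "stacking" idea concrete, here is a minimal sketch that only traces the feature-map resolution through a stack of hourglasses. It is not the paper's implementation: the function names, the factor-2 downsampling, and the example values (input size 64, depth 4, two stacks) are assumptions chosen for illustration.

```python
def hourglass_resolutions(size, depth):
    """Spatial sizes visited by one hourglass: halve `depth` times, then mirror back up.

    Assumes each downsampling step halves the resolution (hypothetical choice).
    """
    down = [size // (2 ** i) for i in range(depth + 1)]  # encoder path, input size first
    up = down[-2::-1]                                    # decoder path, back to input size
    return down + up


def stacked_hourglass_resolutions(size, depth, num_stacks):
    """Each stacked hourglass starts and ends at the same resolution, so they chain directly."""
    return [hourglass_resolutions(size, depth) for _ in range(num_stacks)]


if __name__ == "__main__":
    for i, path in enumerate(stacked_hourglass_resolutions(64, 4, 2), start=1):
        print(f"hourglass {i}: {path}")
```

Because every hourglass returns to its input resolution, the output of one can be fed straight into the next, which is what makes stacking possible.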

FCN, UNet, FPN comparison

All three seem to have the “downsampling and then upsampling” idea at their core. But what are the differences? Which one is the right one to cite when referring to the “downsampling + upsampling” idea? Fully Convolutional Network (FCN), submitted: 2014.11.14 (https://arxiv.org/pdf/1411.4038.pdf): remove any dense (fully connected) layers and use only convolution layers. downsampling Read more…
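One consequence of using only convolution layers is that the spatial size of the output is a simple function of the input size, so the same network can run on inputs of different sizes. The sketch below uses the standard output-size formulas for a strided convolution and a transposed convolution (dilation 1, no output padding); the kernel size 4, stride 2, padding 1, and the input sizes are assumptions picked so that upsampling exactly undoes downsampling, not values taken from any of the three papers.

```python
def conv_output_size(n, kernel, stride, padding):
    """Output length of a convolution along one spatial axis (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


def transposed_conv_output_size(n, kernel, stride, padding):
    """Output length of a transposed convolution (dilation 1, no output padding)."""
    return (n - 1) * stride - 2 * padding + kernel


if __name__ == "__main__":
    # Hypothetical layer settings: kernel 4, stride 2, padding 1.
    for size in (224, 320):
        down = conv_output_size(size, kernel=4, stride=2, padding=1)
        up = transposed_conv_output_size(down, kernel=4, stride=2, padding=1)
        print(f"input {size} -> downsampled {down} -> upsampled {up}")
```

A dense layer, by contrast, fixes the number of input features, which is why removing it lets the network accept arbitrary input sizes.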