paper summary: “Aggregated Residual Transformations for Deep Neural Networks” (ResNeXt paper)

Key point: compared to ResNet, the residual blocks are upgraded to have multiple parallel “paths”, or as the paper puts it, “cardinality”, which can be treated as another model-architecture design hyperparameter. ResNeXt architectures with sufficient cardinality show improved performance. TL;DR: use improved residual blocks compared to ResNet. Different Read more…
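For concreteness, here is a minimal PyTorch sketch of a ResNeXt-style block (not code from the paper): the paper notes that the aggregated parallel paths can be implemented equivalently with a grouped 3×3 convolution, where `groups` equals the cardinality. The channel sizes below follow the paper's 32×4d template but are otherwise placeholders.

```python
import torch
import torch.nn as nn

class ResNeXtBlock(nn.Module):
    """Aggregated residual transformations via grouped convolution.
    `cardinality` is the number of parallel paths."""
    def __init__(self, channels=256, bottleneck=128, cardinality=32):
        super().__init__()
        self.transform = nn.Sequential(
            nn.Conv2d(channels, bottleneck, kernel_size=1, bias=False),
            nn.BatchNorm2d(bottleneck),
            nn.ReLU(inplace=True),
            # grouped conv = cardinality parallel paths in one op
            nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=1,
                      groups=cardinality, bias=False),
            nn.BatchNorm2d(bottleneck),
            nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # residual connection around the aggregated transformation
        return self.relu(x + self.transform(x))
```

With `cardinality=1` this is an ordinary bottleneck block; raising the cardinality splits the 3×3 convolution into parallel paths at roughly the same parameter cost.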

paper review: “Fast DenseNet: Towards Efficient and Accurate Text Recognition with Fast Dense Networks”

https://arxiv.org/ftp/arxiv/papers/1912/1912.07016.pdf TL;DR / key points: this paper proposes using DenseNet+CTC for text recognition, and in the process proposes some modifications to the original DenseNet: 1) a new block, called the Fast Dense Block (FDB); 2) FDenseNet-U: Fast DenseNet plus an upsampling block; 3) a convolution layer with stride 2 instead of max pooling; 4) Read more…
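As a small illustration of modification 3 only (the excerpt does not show the exact FDB structure, so this is just a sketch with placeholder shapes): a stride-2 convolution downsamples like max pooling, but with learnable weights.

```python
import torch
import torch.nn as nn

# Two ways to halve spatial resolution; channel counts are placeholders.
x = torch.randn(1, 64, 32, 128)  # e.g. a text-line feature map

downsample_pool = nn.MaxPool2d(kernel_size=2, stride=2)
downsample_conv = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1)

print(downsample_pool(x).shape)  # torch.Size([1, 64, 16, 64])
print(downsample_conv(x).shape)  # torch.Size([1, 64, 16, 64])
```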

paper review: “Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”

https://arxiv.org/pdf/1708.07120.pdf Key idea in one sentence: get the min/max learning rate from an LR range test, then run a single cycle of the cyclical learning rate over the training run to train networks efficiently and reach super-convergence. Notes: this paper uses a simplification of second-order Hessian-free optimization to estimate the optimal Read more…
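A minimal sketch of this 1cycle schedule using PyTorch's built-in `OneCycleLR`; the `max_lr`, step count, and toy model here are placeholders, and in practice the learning-rate bounds would come from the LR range test.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.04, momentum=0.9)

# max_lr would come from the LR range test; 0.4 is a placeholder.
total_steps = 1000
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.4, total_steps=total_steps)

for step in range(total_steps):
    x, y = torch.randn(32, 10), torch.randn(32, 2)
    loss = nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # LR ramps up to max_lr, then anneals back down: one cycle
```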

EfficientDet paper review

paper link: https://arxiv.org/pdf/1911.09070.pdf BiFPN: multiple BiFPN layers for scaling; uses depth-wise convolution layers; bidirectional cross-scale connections + weighted feature fusion. Weighted feature fusion: a different weight for each resolution's features, and the weights are learnable. To summarize, there are three different approaches for doing weighted feature fusion. Unbounded fusion: because it is unbounded, can Read more…
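For reference, a minimal PyTorch sketch of the simplest variant, unbounded fusion (O = Σᵢ wᵢ·Iᵢ with unconstrained learnable scalars wᵢ). It assumes the input feature maps have already been resized to a common resolution, and the class name is mine, not the paper's.

```python
import torch
import torch.nn as nn

class UnboundedFusion(nn.Module):
    """Weighted feature fusion with one unconstrained learnable scalar per input."""
    def __init__(self, num_inputs: int):
        super().__init__()
        # nothing bounds these weights, hence "unbounded" fusion
        self.w = nn.Parameter(torch.ones(num_inputs))

    def forward(self, features):
        # features: list of tensors, all resized to the same (C, H, W)
        return sum(w * f for w, f in zip(self.w, features))

fuse = UnboundedFusion(num_inputs=2)
a, b = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
out = fuse([a, b])  # same shape as each input
```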