Paper Review: "An Empirical Exploration of Recurrent Network Architectures"

Sep 18, 2019

lstm paper-review

paper link

Key Points

Initialize the forget-gate bias to 1 when training LSTM layers; with this change the LSTM reaches results comparable to the GRU.
In language modeling, the LSTM performs better than the GRU.
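
To make the first point concrete, here is a minimal sketch in plain Python of what "forget bias 1" could look like. The helper names (`init_lstm_bias`, `retained_fraction`) and the gate ordering (input, forget, cell, output) are assumptions for illustration and are not taken from the paper; real frameworks lay out the bias vector in their own order. The second helper is a rough intuition aid only: it ignores the weights and inputs and treats the gate pre-activation as the bias alone.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm_bias(hidden_size: int, forget_bias: float = 1.0) -> List[float]:
    """Build a flat bias vector of length 4 * hidden_size.

    Assumed (hypothetical) gate layout: [input | forget | cell | output].
    Every entry starts at 0; only the forget-gate slice is set to forget_bias.
    """
    bias = [0.0] * (4 * hidden_size)
    for k in range(hidden_size, 2 * hidden_size):
        bias[k] = forget_bias
    return bias


def retained_fraction(forget_bias: float, steps: int) -> float:
    """Fraction of the cell state kept after `steps` time steps, under the
    simplifying assumption that the forget gate equals sigmoid(bias)."""
    return sigmoid(forget_bias) ** steps


if __name__ == "__main__":
    print(init_lstm_bias(hidden_size=2))
    # Compare how much state survives 10 steps with bias 0 versus bias 1.
    print(retained_fraction(0.0, 10))
    print(retained_fraction(1.0, 10))
```

With a zero bias the gate starts at sigmoid(0) = 0.5, so the cell state is halved at every step early in training; a bias of 1 starts the gate nearer to 0.73, which keeps more of the state over long sequences.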