For the fourth reading group on the Stanford University Convolutional Neural Networks class, we went through the following slides:
- Gradient, sanity checks and parameter updates (second half)
There are several ways to decay the learning rate over training (e.g. step, exponential, or 1/t decay); decaying is recommended so that late updates become small enough to settle into a good minimum instead of bouncing around it. Momentum helps the optimizer push through saddle points and flat regions, but because it builds up velocity it can overshoot minima. There is no closed-form way to find the optimal hyperparameters, yet they matter a great deal, so in practice they are tuned empirically, typically by random search over log-scaled ranges.
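As a minimal sketch of the two update ideas mentioned above, here is SGD with momentum combined with exponential learning-rate decay on a toy 1-D quadratic loss. The specific values (`base_lr`, `mu`, the `0.99` decay factor) are illustrative choices, not numbers from the slides:

```python
def sgd_momentum_step(w, dw, v, lr, mu=0.9):
    """One SGD-with-momentum update: accumulate a velocity, then step along it."""
    v = mu * v - lr * dw   # velocity integrates the negative gradient
    w = w + v              # parameter moves along the velocity
    return w, v

# Toy 1-D loss f(w) = 0.5 * w**2, so the gradient at w is just w.
w, v = 5.0, 0.0
base_lr = 0.1
for t in range(200):
    lr = base_lr * (0.99 ** t)  # exponential learning-rate decay
    w, v = sgd_momentum_step(w, dw=w, v=v, lr=lr)

print(w)  # the iterate has settled near the minimum at w = 0
```

Note how the velocity makes the iterate oscillate past the minimum before settling, which is exactly the overshooting behavior described above; the decaying learning rate is what lets it finally come to rest. In practice `base_lr`, `mu`, and the decay factor would themselves be hyperparameters found by random search.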