Momentum: Accumulates past gradients to damp oscillations.
AdaGrad: Per-parameter learning rate based on summed squared gradients.
FGSM: One-step adversarial attack using the sign of the input gradient.
Differential Evolution: Mutation by adding a scaled difference of two population members.
Grokking: Sudden generalization long after memorizing the training set.
Rosenbrock: Benchmark with a narrow curved valley, easy to enter, hard to traverse.
Neuroevolution: Training network weights with an evolutionary algorithm, no backprop.
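
Two of the entries above, Differential Evolution and Rosenbrock, can be illustrated together. The following is a minimal sketch, not a reference implementation: it assumes the common "rand/1/bin" variant with in-place replacement, and the function names, the bounds, and the parameter values (`pop_size`, `F`, `CR`, `generations`, `seed`) are illustrative choices that do not come from the source.

```python
import random


def rosenbrock(x, y):
    """Rosenbrock benchmark: narrow curved valley, global minimum 0 at (1, 1)."""
    return (1 - x) ** 2 + 100 * (y - x * x) ** 2


def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Sketch of DE (rand/1/bin). All defaults are assumed, illustrative values."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [f(*p) for p in pop]
    initial_best = min(fitness)

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members, none of them i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            # Mutation: base vector plus a scaled difference of two members.
            mutant = [pop[a][d] + F * (pop[b][d] - pop[c][d]) for d in range(dim)]
            # Binomial crossover; j_rand forces at least one mutant coordinate.
            j_rand = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < CR or d == j_rand) else pop[i][d]
                     for d in range(dim)]
            # Clip the trial vector back into the search box.
            trial = [min(max(t, lo), hi) for t, (lo, hi) in zip(trial, bounds)]
            # Greedy selection: keep the trial only if it is no worse.
            trial_fitness = f(*trial)
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness

    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best], initial_best


best_point, best_value, start_value = differential_evolution(
    rosenbrock, bounds=[(-2.0, 2.0), (-1.0, 3.0)]
)
print("best point:", best_point)
print("best value:", best_value, "(started from", start_value, ")")
```

Because selection is greedy, the best value in the population can never get worse from one generation to the next. How close the search gets to (1, 1) depends on the assumed settings, which is part of what makes the valley "hard to traverse".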

Stochastic Methods in Machine Learning
