Paper_weight_decay
Our new paper Why Do We Need Weight Decay in Modern Deep Learning? is available online. Also check out our new preprint on layer-wise linear mode connectivity.
Our new paper Why Do We Need Weight Decay in Modern Deep Learning? is available online. Also check out our new preprint on layer-wise linear mode connectivity.