Yes, that's what I had in mind. Interesting that they can get decent convergence even with 100x, considering that to get a decently accurate gradient in all D dimensions you'd need D forward passes, we're talking about D in the billions, and each forward pass itself costs billions of FLOPs. And even then it's only an estimate since, as you say, the delta is not infinitesimal.

Interesting that it converges when sampling just one point in one direction per iteration.
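
For concreteness, a minimal sketch (toy code of mine, not the paper's) contrasting the two estimators: coordinate-wise finite differences, which costs ~2D forward passes for one gradient, versus the single-random-direction (SPSA-style) estimate, which costs two forward passes per step no matter how big D is. The function names and the diagonal quadratic are made up for illustration:

    import numpy as np

    def full_fd_gradient(f, theta, eps=1e-4):
        # Coordinate-wise finite differences: one probe per dimension,
        # so 2*D forward passes for a D-dimensional theta.
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            e = np.zeros_like(theta)
            e[i] = eps
            grad[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
        return grad

    def spsa_gradient(f, theta, eps=1e-4, rng=None):
        # One random direction per iteration: 2 forward passes, any D.
        # Unbiased as eps -> 0, but a single sample is very noisy.
        rng = rng or np.random.default_rng()
        z = rng.standard_normal(theta.shape)
        g = (f(theta + eps * z) - f(theta - eps * z)) / (2 * eps)
        return g * z

    # Toy usage: diagonal quadratic loss in D = 100 dimensions.
    D = 100
    a = np.linspace(0.1, 1.0, D)              # Hessian eigenvalues
    loss = lambda t: 0.5 * np.sum(a * t * t)

    theta = np.ones(D)
    for _ in range(5000):
        theta -= 0.01 * spsa_gradient(loss, theta)
    print(loss(theta))    # orders of magnitude below the initial ~27.5

That gap is exactly the puzzle: one probe gives a very noisy estimate, yet averaging across SGD-style iterations still makes progress.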



Right, that's certainly surprising and intriguing. They discuss this in Section 4 and offer a theoretical argument for why the rate of convergence might be independent of the (large) number of parameters. I haven't fully grokked it yet, but maybe one could think of it as a consequence of the shape of the landscape (the cost function) in the overparametrized regime.
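
A toy check of that intuition (my construction, not the paper's Section 4 analysis): if the Hessian has low effective rank k << D, the gradient vanishes along the D - k flat directions, so probe noise there costs nothing and the convergence speed tracks k rather than the ambient D:

    import numpy as np

    def run(D, k=10, steps=200, eps=1e-4, lr=0.05, seed=0):
        # Rank-k diagonal quadratic embedded in D dimensions: a stand-in
        # for a landscape that is flat in most directions.
        rng = np.random.default_rng(seed)
        a = np.zeros(D)
        a[:k] = 1.0                       # only k curved directions
        loss = lambda t: 0.5 * np.sum(a * t * t)
        theta = np.ones(D)
        for _ in range(steps):
            z = rng.standard_normal(D)
            g = (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps)
            theta -= lr * g * z           # SPSA-style step
        return loss(theta)

    for D in (100, 1000, 10000):
        print(D, run(D))   # small in every case; no slowdown as D grows

Whether real loss landscapes have that kind of low effective rank locally is of course the substantive assumption in that story.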



