New top story on Hacker News: No More Adam: Learning Rate Scaling at Initialization Is All You Need
14 by jinqueeny | 0 comments on Hacker News.