Tiny Transformer trained for addition learns bizarre addition algorithm
https://www.alignmentforum.org/posts/N6WM6hs7RQMKDhYjB/a-mechanistic-interpretability-analysis-of-grokking
#ReadItLater