How Gzip and K-Nearest Neighbors Can Outperform Deep Learning Models
Thank you, Christoph, for this very good explanation.
In particular, I love how you make a concerted effort to also give the intuition behind the points you presented. All too often, it's just "well, the math says X" - but many people have a hard time understanding what X really means, what its implications are, and so on.
I love this connection between compression and prediction; it gives another lens for understanding how these models work.
Shameless plug, but I covered the Gzip + kNN paper and the "Language Modeling Is Compression" paper in two of my articles.
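For anyone curious, the core of the Gzip + kNN idea fits in a few lines: use a compressor to estimate similarity via Normalized Compression Distance (NCD), then classify with a nearest-neighbor vote. The sketch below is a minimal illustration, not the paper's exact implementation; the toy training texts and labels are made up for the example.

```python
import gzip

def clen(s: str) -> int:
    # Length of the gzip-compressed bytes: a rough proxy for information content
    return len(gzip.compress(s.encode("utf-8")))

def ncd(a: str, b: str) -> float:
    # Normalized Compression Distance: if b shares structure with a,
    # compressing them together adds little, so the distance is small
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def knn_predict(query: str, train: list[tuple[str, str]], k: int = 1) -> str:
    # Rank training texts by NCD to the query, then majority-vote the top k labels
    ranked = sorted(train, key=lambda pair: ncd(query, pair[0]))
    top_labels = [label for _, label in ranked[:k]]
    return max(set(top_labels), key=top_labels.count)

# Toy data purely for illustration
train = [
    ("the stock market rallied on strong earnings", "finance"),
    ("central bank raises interest rates again", "finance"),
    ("the team won the championship game last night", "sports"),
    ("star striker scores a hat-trick in the final", "sports"),
]
print(knn_predict("interest rates and earnings move the market", train))
```

No training, no parameters: the compressor does all the similarity work, which is exactly what makes the result so striking.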