Oskari Silvoniemi

the shape of intelligence

I think when you look inside today’s AI models, especially LLMs, it’s intuitively hard to say whether there’s something deeper going on than just very sophisticated matmuls created by smart people. A few years ago, when models were much less capable, people were saying things like “these are just glorified autocompletes” or “stochastic parrots.” That framing made a lot of sense back then, but after I came across the Platonic Representation Hypothesis paper, it doesn’t really feel sufficient anymore.


As I understand it, the Platonic Representation Hypothesis refers to an empirical observation that different models, even with different architectures and trained on different datasets, often end up learning very similar internal representations, usually ones that can be related to each other by fairly simple transformations. There seems to be this convergence happening at the representation level. That’s really interesting to me, because it suggests these models aren’t just memorizing surface patterns, but are settling into some kind of shared internal structure.
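To make that “related by fairly simple transformations” idea a bit more concrete, here’s a minimal sketch (my own, not from the paper) of how you could probe it: take embeddings of the same inputs from two different models, then check whether a single linear map carries one representation space onto the other, and how similar the two spaces look under linear CKA. Everything here is a placeholder, the matrices are synthetic stand-ins for real model embeddings, and the dimensions are made up.

```python
# Minimal sketch, not from the paper: do two models' embeddings of the
# same inputs line up under a simple linear transformation?
import numpy as np

def linear_cka(X, Y):
    """Linear CKA similarity between two representation matrices.
    Close to 1.0 means the spaces match up to rotation/scaling."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

def linear_fit_r2(X, Y):
    """Fit Y ~ X @ W by least squares and report R^2: how much of
    model B's representation one linear map recovers from model A's."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    residual = Y - X @ W
    return 1.0 - (residual ** 2).sum() / ((Y - Y.mean(axis=0)) ** 2).sum()

rng = np.random.default_rng(0)
n_inputs, d_a, d_b = 1000, 512, 768

# Stand-ins for "model A" and "model B" embeddings of the same 1000 inputs.
# B is built as a linearly transformed, noisy view of A, so both scores
# should come out far above what unrelated representations would give.
A = rng.standard_normal((n_inputs, d_a))
mixing = rng.standard_normal((d_a, d_b))
B = A @ mixing + 0.1 * rng.standard_normal((n_inputs, d_b))

print("linear CKA:     ", round(linear_cka(A, B), 3))
print("linear-map R^2: ", round(linear_fit_r2(A, B), 3))
```

With real models you would replace A and B with hidden states or embeddings extracted from two separately trained networks over the same text; the hypothesis, as I read it, is that the scores stay surprisingly high even then.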


This starts to feel loosely reminiscent of things people talk about in neuroscience, where different brains seem to develop similar representational structures for similar tasks. That doesn’t mean LLMs and brains are the same thing, or that we’ve “solved intelligence,” but it does give the impression that there might be some underlying structure, maybe a kind of undiscovered manifold that isn’t directly comprehensible to humans, that large learning systems tend to discover when they’re pushed hard enough. If that’s true, then it’s not crazy to imagine that future AI systems and biological brains could end up looking even more similar at this representational level, even if they’re built out of totally different stuff.