• AnAmericanPotato@programming.dev
    link
    fedilink
    English
    arrow-up
    10
    ·
    edit-2
    4 days ago

    Yep. AGI is still science fiction. Anyone telling you otherwise is probably just trying to fool investors. Ignore anyone who is less than three degrees of separation away from a marketing department.

    The low-hanging fruit is quickly getting picked, so we’re bound to see a slowdown in advancement. And that’s a good thing. We don’t really need better language models at this point; we need better applications that use them.

    The limiting factor is not so much hardware as it is our knowledge and competence in software architecture. As a historical example, 10 short years ago, computers were nowhere near top-level at Go. Then DeepMind developed AlphaGo, which was a huge leap forward and could beat a top pro. It ran on a supercomputer cluster. Thanks to the research breakthroughs around AlphaGo, within a few years had similar AI that could run on any smartphone and could beat any human player. It’s not because consumer hardware got that much faster; it’s because we learned how to make better software. Modern Go engines are a fraction of the size of AlphaGo, and generate similar or better quality results with a tiny fraction of the operations. And it seems like we’re pretty close to the limit now. A supercomputer can’t play all that much better than my laptop.

    Similarly, a few years ago something like ChatGPT 3 needed a supercomputer. Now you can run a model with similar performance on a high-end phone, or a low-end laptop. Again, it’s not because hardware has improved; the difference is the software. My current laptop (2021 model) is older than ChatGPT 3 (publicly launched in 2022) and it can easily run superior models.

    But the returns inevitably diminish. There’s a limit somewhere. It’s hard to say exactly where, but entropy’s gonna getcha sooner or later. You simply cannot fit more than 16GB of information in a 16GB model; you can only inch closer to that theoretical limit, and specialize into smaller scopes. At some point the world will realize that trying to encode everything into a model is a dumb idea. We already have better tools for that.