A new paper suggests diminishing returns from larger and larger generative AI models. Dr Mike Pound discusses.

The Paper (No “Zero-Shot” Without Exponential Data): https://arxiv.org/abs/2404.04125

  • CheesyFox@lemmy.sdf.org
    7 months ago

    By “reinventing computers” I mean choosing another information-carrying entity for our processing units. For instance, photonics is a promising field: photons are much smaller, so we could potentially make even smaller logic elements, and they also produce much less heat.

    As for ML architecture, of course it won’t be the tech bros, of course it will be scientists, but don’t forget that until someone sponsors them, the research could take literal decades before anything revolutionary is discovered. Scientists are not some kind of gurus who live in the mountains and are fed by the energy of the sun. To make a living they have jobs besides scientific research. That’s why grants and other research funding methods exist. And as you could’ve guessed, these depend greatly on guys with money and their interest in said research.

    • Lvxferre@mander.xyz
      7 months ago

      Not even another info-transferring entity would solve it. Be it quantum computers or photonic computers, at the end of the day we’d simply be brute-forcing the problem harder with the increased processing power. But because of the diminishing returns, we need something other than brute force.

      Just to give you an idea: a human needs around 2400 kcal/day to survive, or 100 kcal/h ≈ 116 W. Only 20% of that is taken by the brain, so ~23 W. (I bet that most of that is used for motor control, not reasoning.) We clearly suck as computing machines, and yet our output is considerably better than the junk yielded by LLMs and diffusion models, even if you use a really nice computer and let the model take its time producing its [babble | six fingers “art”]. Those models are clearly doing lots of unnecessary operations, while failing hard at what they’re expected to do.
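
The power figures in the comment above can be checked with a quick unit conversion. This is only a sketch: the 2400 kcal/day intake and the 20% brain share are taken from the comment as given, the factor 1 kcal = 4184 J is the standard thermochemical value, and the function and variable names are made up for illustration.

```python
# Convert a daily food-energy intake into average power in watts.
KCAL_TO_JOULES = 4184.0          # 1 kcal (thermochemical) = 4184 J
SECONDS_PER_DAY = 24 * 60 * 60   # 86400 s


def kcal_per_day_to_watts(kcal_per_day: float) -> float:
    """Average power (W = J/s) for a given energy intake in kcal/day."""
    return kcal_per_day * KCAL_TO_JOULES / SECONDS_PER_DAY


# Figures assumed from the comment, not independently sourced.
body_watts = kcal_per_day_to_watts(2400)
brain_watts = 0.20 * body_watts

print(f"whole body: {body_watts:.0f} W")
print(f"brain:      {brain_watts:.0f} W")
```

Under those assumptions the whole body comes out at roughly 116 W and the brain at roughly 23 W, matching the numbers quoted in the comment.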

      Regarding research, my point is that whatever fixes generative models is likely to come from outside the field of artificial intelligence. It’ll likely be something small and barely related that happens to have some ML application.