MIT scientists have just figured out how to make the most popular AI image generators 30 times faster

floofloof@lemmy.ca · 6 months ago

BetaDoggo_@lemmy.world · 6 months ago

Partnered with Adobe research so we’re never going to get the actual model.

LazaroFilm@lemmy.world · 6 months ago

And if we do, it will be bundled in an AI suite with some license checking software that crawls your whole computer to a crawl for no reason.

CrayonRosary@lemmy.world · 6 months ago

Crawls to a crawl is a very common phrase, I don’t know why people are saying it’s not.

HeChomk@lemmy.world · 6 months ago

Because it isn’t? Slows to a crawl is correct. “Crawls to a crawl” means nothing.

CrayonRosary@lemmy.world · 6 months ago

I was being sarcastic. No one else had said anything about it. 😄

Muffi@programming.dev · 6 months ago

Adobe is the fucking worst. That immediately killed my hype.

snooggums@midwest.social · 6 months ago

More mediocre images for everyone!

BrianTheeBiscuiteer@lemmy.world · 6 months ago

While I think the realism of some models is fantastic and the flexibility of others is great, it is starting to feel like we’re reaching a plateau on quality. Most of the white papers I’ve seen posted lately are about speed or some alternate way of doing what ControlNet or inpainting can already do.

Björn Tantau@swg-empire.de · 6 months ago

Well, when it’s fast enough you can do it in real time. How about making old games look like they looked to you as a child?

UlrikHD@programming.dev · edit-2 6 months ago

There’s way more to a game’s look than textures though. Arguably ray tracing will have a greater impact than textures. Not to mention, for retro games, you could just generate the textures beforehand, no need to do it in real time.

Björn Tantau@swg-empire.de · 6 months ago

I meant putting the whole image through AI. Not just the textures. Tell it how you want it to look and suddenly a grizzled old Mario is jumping on a realistic turtle with blood splattering everywhere.

snooggums@midwest.social · 6 months ago

When the output of something is the average of the inputs it will naturally be mediocre. It will always look like the output of a committee by the nature of how it is formed.

Certain artists stand out because they are different from everyone else, and that is why they are celebrated. M.C. Escher has a certain style that when run through AI looks like a skilled high school student doing their best impression of M.C. Escher.

Now, as a tool to inspire, AI is pretty good at creating mashups of multiple things really fast. Those could be used by an actual artist to create something engaging. Most AI reminds me of Photoshop battles.

AggressivelyPassive@feddit.de · 6 months ago

That’s maybe because we’ve reached the limits of what the current architecture of models can achieve on the current architecture of GPUs.

To create significantly better models without having a fundamentally new approach, you have to increase the model size. And if all accelerators accessible to you only offer, say, 24 GB, you can’t grow infinitely. At least not within a reasonable timeframe.

Kbin_space_program@kbin.social · edit-2 6 months ago

Will increasing the model actually help? Right now we’re dealing with LLMs that literally have the entire internet as a model. It is difficult to increase that.

Making a better way to process said model would be a much more substantive achievement. So that when particular details are needed it’s not just random chance that it gets it right.

AggressivelyPassive@feddit.de · 6 months ago

That is literally a complete misinterpretation of how models work.

You don’t “have the Internet as a model”; you train a model using large amounts of data. That does not mean that the model contains any of the actual data. State-of-the-art models are somewhere in the billions of parameters. If you have, say, 50b parameters, each being a 64-bit (8-byte) double (which is way, way more precision than you need), you get something like 400 GB of data. That’s a lot, but the Internet is slightly larger than that.
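
For anyone who wants to check that arithmetic, here is a minimal sketch of it: parameter count times bytes per parameter. The helper name `model_size_gb` is made up for illustration, and the smaller precisions are only listed for comparison; this counts raw weight storage and nothing else (no activations, no optimizer state).

```python
def model_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Raw weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


# The 50b-parameter example from the comment; the figure is illustrative,
# not the size of any specific model.
PARAMS = 50e9

# Bytes per parameter at a few common numeric precisions.
PRECISIONS = [("float64", 8), ("float32", 4), ("float16", 2), ("int8", 1)]

for name, nbytes in PRECISIONS:
    print(f"{name}: {model_size_gb(PARAMS, nbytes):.0f} GB")
```

At 8 bytes per parameter this gives the 400 GB figure. Even at 2 bytes per parameter the weights alone come to 100 GB, which is well past a single 24 GB card.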

Kbin_space_program@kbin.social · edit-2 6 months ago

It’s an exaggeration, but it’s not far off given that Google literally has all of the web parsed at least once a day.

Reddit just sold off AI harvesting rights on all of its content to Google.

The problem is no longer model size. The problem is interpretation.

You can ask almost everyone on earth a simple deterministic math problem and you’ll get the right answer almost all of the time because they understand the principles behind it.

Until you can show deterministic understanding in AI, you have a glorified chat bot.

AggressivelyPassive@feddit.de · 6 months ago

It is far off. It’s like saying you have the entire knowledge of all physics because you skimmed a textbook once.

Interpretation is also a problem that can be solved; current models do understand quite a lot of nuance, subtext and implicit context.

But you’re moving the goalposts here. We started at “don’t get better, at a plateau” and now you’re aiming for perfection.

Kbin_space_program@kbin.social · 6 months ago

You’re building beautiful straw men. They’re lies, but great job.

I said originally that we need to improve the interpretation of the model by AI, not just have even bigger models that will invariably have the same flaw as they do now.

Deterministic reliability is the end goal of that.

Aopen@discuss.tchncs.de · 6 months ago

Please link original article and paper when posting

Article: https://news.mit.edu/2024/ai-generates-high-quality-images-30-times-faster-single-step-0321

Paper: https://arxiv.org/abs/2311.18828

Thann@lemmy.ml · 6 months ago

Pfft, I can do that, just run them on a computer that’s 30 times faster!

jas0n@lemmy.world · 6 months ago

Found the software engineer.

cheeselover@lemmy.world · 6 months ago

So can I run this without cloud?

Black616Angel@feddit.de · edit-2 6 months ago

You can do that already.

Although I mainly use InvokeAI

Archr@lemmy.world · edit-2 6 months ago

I see your InvokeAI and raise you Stability Matrix

Edit: I wanted to edit my comment to leave some context for people.

Stability Matrix is an app that handles installing many different Stable Diffusion applications (no more messing with InvokeAI’s janky install script).

It also integrates with CivitAI and HuggingFace to directly download models and LoRAs and share them between your applications, saving you lots of disk space.

Black616Angel@feddit.de · edit-2 6 months ago

… Thanks. This looks super useful.

Edit: After posting I realized that this sounds super sarcastic, which it wasn’t. This does look useful and I was already looking for something like that.

Cyyy@lemmy.world · 6 months ago

but that is just normal Stable Diffusion, not the method used or mentioned here. So it isn’t even what this news is about :/

drspod@lemmy.ml · 6 months ago

The paper: https://arxiv.org/pdf/2311.18828.pdf

stoy@lemmy.zip · 6 months ago

So more AI porn?

Steve@startrek.website · 6 months ago

Faster AI porn

assassinatedbyCIA@lemmy.world · 6 months ago

Gotta jack fast

istanbullu@lemmy.ml · 6 months ago

This is nice, but the post ignores all the other research in this topic. SDXL Lightning can generate images in 2 steps.

AnUnusualRelic@lemmy.world · 6 months ago

1st step, describe the picture.
2nd step, generate the picture.

cornshark@lemmy.world · edit-2 6 months ago

1st step: draw some circles

2nd step: draw the rest of the fucking owl

werefreeatlast@lemmy.world · 6 months ago

Now they can make a movie!
