Researchers from the Massachusetts Institute of Technology (MIT) have developed a novel technique that makes artificial intelligence (AI) image generators run up to 30 times faster.
In their preprint paper, the researchers said the technique condenses the multi-stage process employed by diffusion models into a single step. The method, dubbed “distribution matching distillation” (DMD), allows new AI models to mirror the capabilities of existing image generators without the hassle of a “100-stage process.”
Diffusion models like Midjourney and Stable Diffusion typically rely on a lengthy pipeline from input to output. Most models combine an image information creator, a decoder, and dozens of steps to “remove noise,” a protracted process that grows longer as the demand for image quality rises.
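To see why this is slow, consider a minimal, purely illustrative sketch of a multi-step sampling loop. The `model` callable and the crude update rule are our assumptions for illustration, not any library’s actual API: the key point is that every denoising step costs one full network pass.

```python
import torch

def sample_diffusion(model, steps=100, shape=(1, 4, 64, 64)):
    """Illustrative multi-step diffusion sampling loop.

    `model` is a hypothetical network that predicts the noise present in
    `x` at timestep `t`; the update rule below is a deliberately crude
    stand-in for a real sampler/scheduler.
    """
    x = torch.randn(shape)  # start from pure noise
    for t in reversed(range(steps)):
        predicted_noise = model(x, t)    # one full network pass per step
        x = x - predicted_noise / steps  # crude denoising update (illustrative only)
    return x  # image latent produced after `steps` network evaluations
```

A 100-step sampler therefore runs the network 100 times per image; a distilled one-step generator like DMD’s aims to run it once.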
DMD adopts a “teacher-student” approach: a leaner student model is trained to produce, in one step, what a complicated teacher image generator produces over many. A closer look at DMD’s operation reveals that it integrates ideas from generative adversarial networks (GANs) with diffusion models, opening numerous possibilities.
The researchers point to various benefits associated with DMD, including savings in computational power and time. They also noted that DMD cut image generation time from 2.59 seconds to roughly 90 milliseconds, an almost 29-fold reduction, without affecting the quality of outputs.
“Our work is a novel method that accelerates current diffusion models such as Stable Diffusion and DALLE-3 by 30 times,” said lead researcher Tianwei Yin. “This advancement not only significantly reduces computational time but also retains, if not surpasses, the quality of the generated visual content.”
DMD achieves the feat by relying on two key components: a regression loss and a distribution matching loss. The regression loss stabilizes training by anchoring the student to the teacher’s outputs, while the distribution matching loss ensures that the frequency with which the student generates a given image matches its real-world occurrence frequency.
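The following sketch shows how these two losses could be combined in one training step, based on the paper’s description. The names `real_score` and `fake_score` (stand-ins for the two diffusion critics used to estimate the distribution matching gradient), the plain MSE in place of the paper’s perceptual distance, and the sign conventions are all our simplifying assumptions, not the authors’ code:

```python
import torch
import torch.nn.functional as F

def dmd_training_step(student, teacher_outputs, real_score, fake_score, noise):
    """Simplified sketch of DMD's two-loss training objective."""
    x_student = student(noise)  # one-step generation from input noise

    # 1) Regression loss: anchor the student to precomputed teacher
    #    outputs for the same noise (the paper uses a perceptual
    #    distance; plain MSE here for brevity).
    loss_reg = F.mse_loss(x_student, teacher_outputs)

    # 2) Distribution matching loss: nudge the student's output
    #    distribution toward the teacher's. The paper estimates this
    #    gradient as a difference of two score estimates; shown
    #    schematically here with simplified signs.
    with torch.no_grad():
        grad = real_score(x_student) - fake_score(x_student)
    # Surrogate whose gradient w.r.t. x_student follows `grad`.
    loss_dm = (x_student * grad).mean()

    return loss_reg + loss_dm
```

The GAN flavor the article mentions shows up here: the “fake” critic is trained against the student’s own samples, adversarially shaping the distribution matching signal.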
“Decreasing the number of iterations has been the Holy Grail in diffusion models since their inception,” said researcher Frédo Durand. “We are very excited to finally enable single-step image generation, which will dramatically reduce compute costs and accelerate the process.”
LLMs are not left out
While researchers push diffusion models forward, large language models (LLMs) and other emerging technologies are enjoying their fair share of innovation. In mid-March, a group of Chinese researchers unveiled a new compression technique for LLMs to circumvent hardware limitations in their deployment.
Using a technique that prunes redundant layers from a model, the researchers noted that users could save a fortune in inference costs without the need for new model training. Dubbed ShortGPT, the method, according to the research paper, “significantly outperforms previous state-of-the-art (SOTA) methods in model pruning.”
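A rough sketch of this kind of layer pruning follows. It scores each transformer block by how much it actually changes its input (ShortGPT calls this measure “Block Influence”) and drops the lowest scorers; the function names, the calibration setup, and the `keep_ratio` parameter are our illustrative assumptions, not ShortGPT’s released code:

```python
import torch
import torch.nn.functional as F

def block_influence(hidden_in, hidden_out):
    """Score a layer by 1 minus the cosine similarity between its input
    and output hidden states: layers that barely transform their input
    score near zero and are candidates for removal."""
    cos = F.cosine_similarity(hidden_in.flatten(1), hidden_out.flatten(1), dim=1)
    return (1.0 - cos).mean().item()

def prune_layers(layers, hidden_states, keep_ratio=0.75):
    """Drop the lowest-influence layers.

    `layers` is a list of transformer blocks and `hidden_states` the
    per-layer activations recorded on some calibration data; both are
    assumptions about the surrounding setup.
    """
    scores = [block_influence(hidden_states[i], hidden_states[i + 1])
              for i in range(len(layers))]
    n_keep = int(len(layers) * keep_ratio)
    keep = sorted(range(len(layers)), key=lambda i: scores[i], reverse=True)[:n_keep]
    return [layers[i] for i in sorted(keep)]  # preserve original layer order
```

Because whole layers are simply removed, the smaller model needs no retraining before deployment, which is where the inference savings come from.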
In order for artificial intelligence (AI) to work right within the law and thrive in the face of growing challenges, it needs to integrate an enterprise blockchain system that ensures data input quality and ownership—allowing it to keep data safe while also guaranteeing the immutability of data. Check out CoinGeek’s coverage on this emerging tech to learn more about why enterprise blockchain will be the backbone of AI.
Watch: Blockchain can bring accountability to AI