MWC 2024: MediaTek shows off on-device AI image generator that works as fast as you type

The next generation of AI-backed text-to-image generators has arrived, with MediaTek leading the charge. At the ongoing Mobile World Congress (MWC 2024) in Barcelona, Spain, the Taiwan-based semiconductor company showcased an on-device AI image generator capable of generating pictures as fast as you type. The technology was demonstrated on a prototype smartphone featuring MediaTek’s flagship 4nm-based Dimensity 9300 SoC. The company hopes its future chipsets will enable similar AI capabilities, making the technology available to the masses.

How does MediaTek’s on-device AI image generator work?

If you’re familiar with popular text-to-image generators, MediaTek’s tool works in much the same way. Tools such as Microsoft Copilot (formerly Bing AI), Midjourney, DALL-E and Adobe Firefly let users create elaborate images based on simple text inputs. The images are only as good as your instructions, and the generation process generally takes up to a minute or two. Also, there’s no way of knowing how the AI image generator has interpreted your input until the final output arrives.

MediaTek, on the other hand, demonstrated a real-time image generation process, which means you can observe the tool’s output as you type. For example, typing “Spider-Man” immediately prompts the tool to generate corresponding images, giving you a preview of where the picture is heading. The final output appears as soon as you finish typing.

In other words, you can instantly see whether the tool understands the prompt, whether it’s a person, a place or an object.

How is MediaTek achieving this?

As mentioned, MediaTek showcased the technology on a Dimensity 9300 SoC-powered prototype device. This particular chipset utilises a new AI Processing Unit (APU) architecture that incorporates an upgraded generative AI engine for faster and more reliable processing.

MediaTek is also leveraging the open-source Stable Diffusion XL Turbo text-to-image model to let users create a wide variety of images. The company claims that this is the first implementation of Stable Diffusion XL Turbo on a mobile device.

What does the future look like?

During our test, the MediaTek Dimensity 9300 SoC-powered smartphone generated images from simple text inputs within seconds, and the results were impressive.

We also tested the tool with text inputs specific to India, and the results were exceptional. Of course, some results were not 100 per cent accurate, but mind you, this was on a prototype device. That means the future looks bright and promising, at least for dynamic image generators.

It will still be some time before smartphone OEMs fully leverage the Dimensity 9300 SoC’s improved APU to unlock native AI capabilities. The Vivo X100 is the only smartphone in India to feature this Dimensity chipset. However, it does not yet offer a native text-to-image generator.