I tried Google's text-to-image AI, and I was shocked by the results

Google Imagen
(Image credit: Shutterstock / metamorworks / Google / Imagen)

Text-to-image artificial intelligence programs aren’t anything new. Indeed, existing neural networks like DALL-E have impressed us with their ability to generate simple, photorealistic images from brief yet descriptive sentences.

But this week I was introduced to Imagen. Developed by Google Research’s Brain Team, Imagen is an AI similar to DALL-E and LDM. However, Brain Team’s aim with Imagen is to generate images with a greater level of accuracy and fidelity, using that same method of short, descriptive sentences to create them.

An example of such a sentence would be – as per demonstrations on the Imagen website – “A photo of a fuzzy panda wearing a cowboy hat and black leather jacket riding a bike on top of a mountain.” That’s quite a mouthful, but the sentence is structured in such a way that the AI can identify each item as its own criterion.

The AI then analyzes each segment of the sentence as a digestible chunk of information and attempts to produce an image as closely related to that sentence as possible. And barring some uncanniness or oddities here and there, Imagen can do this with surprisingly quick and accurate results.

Oil painting of a cat on a skateboard

Imagen can draw better than me. (Image credit: Google / Imagen)

A little too wholesome?

If you’ve checked out Imagen or other neural networks for yourself, then you’ve probably noticed the overwhelming focus on a select few subjects. DALL-E, for example, likes to create images based on everyday household items, like clocks or toilets. Imagen, at least for now, seems to put cute animals at the forefront of its image generation capabilities. But there’s actually a very good reason for this.

Google’s Brain Team doesn’t shy away from the fact that Imagen is keeping things relatively harmless. In a rather lengthy disclaimer, the team acknowledges that neural networks can be used to generate harmful content, such as racial stereotypes, or to push toxic ideologies. Imagen even makes use of a dataset that’s known to contain such inappropriate content.

“While a subset of our training data was filtered to remove noise and undesirable content, such as pornographic imagery and toxic language,” Brain Team notes, “we also utilized LAION-400M dataset which is known to contain a wide range of inappropriate content including pornographic imagery, racist slurs, and harmful social stereotypes. 

“Imagen relies on text encoders trained on uncurated web-scale data, and thus inherits the social biases and limitations of large language models.”

Cat playing a guitar

This was one of the less uncanny photos I was able to generate with Imagen. (Image credit: Google / Imagen)

This is also the reason why Google’s Brain Team has no plans to release Imagen for public use, at least until it can develop further ‘safeguards’ to prevent the AI from being used for nefarious purposes. As a result, the preview on the website is limited to just a few handpicked variables.

Ultimately, it’s the right call. There have been examples in the past of AI programs being unleashed onto the online public… with extremely undesirable results. You may remember Microsoft’s Tay, an AI Twitter account brought to the social media platform back in 2016.

Tay was a pretty ballsy experiment on Microsoft’s part. Its intention was to see how an AI would react to and interact with real people in a social media environment. However, within hours, Tay went from a wholesome chatbot to a dispenser of antisemitic talking points. This was despite the bot being “modeled, cleaned and filtered,” according to Microsoft (thanks, The Verge).

Given the precedent set by AI like Tay, then, it’s easy to see why Imagen has been reined in. Clearly, even extensive filtering might not be enough.

Still far from perfect

While I was immensely impressed by Imagen, and had a lot of fun mixing and matching sentences to create all kinds of bizarre pictures, it’s definitely not something I’d consider to be overwhelmingly convincing. At least not for the time being.

More often than not, Imagen returned some frighteningly hilarious results. Animals, in particular, often appeared with all kinds of wacky proportions. Seeing a raccoon with a massive head, or girthy, human-like arms gripping a bike’s handlebars, was a pretty common sight. While very funny, these peculiarities, combined with the photorealism, often made for disturbingly uncanny images.

The option to generate an oil painting was actually a good deal more convincing, and most of what Imagen was able to produce here wouldn’t look out of place in a school project. And I mean that in the nicest possible way. As it turns out, a Persian cat strumming a guitar translates far more convincingly to a painting than it does to a realistic photo.

As noted, it’s highly likely we won’t get a public release of Imagen anytime soon. Or ever, for that matter. The risks posed by AI programs and neural networks being able to generate unsavory content are still far too great. For now, though, I’m content with Imagen being a fun little curio for those looking to spend a bit of time generating funny cowboy hat-wearing animals skateboarding down a mountain.

Rhys Wood
Hardware Editor

Rhys is TRG's Hardware Editor, and has been part of the TechRadar team for more than two years. Particularly passionate about high-quality third-party controllers and headsets, as well as the latest and greatest in fight sticks and VR, Rhys strives to provide easy-to-read, informative coverage on gaming hardware of all kinds. As for the games themselves, Rhys is especially keen on fighting and racing games, as well as soulslikes and RPGs.
