I turned my dog into a plushie using AI and it was super easy
Google Whisk blends images without any need for complex prompts
AI image generators can do some impressive things, but they are often limited by your own ability to explain your vision in words for a prompt. Even when the AI can translate your words into the image in your head in some ways, getting the right mix of characters, location, and style all in one image can be difficult.
DALL-E and other tools can create images based on pictures you upload, but even then, it can be tough to get the right mix. That's what makes the new Google Whisk experiment so interesting.
Using Google Gemini and the Imagen 3 image creation model, Whisk can create entirely new images by blending existing ones. Whisk skips the hassle of descriptive poetry by taking images assigned as either subject, scene, or style and combining them appropriately. Should you prefer not to hunt down the right image for one or more of those facets, you can describe it and see what Google makes of it before creating the final form.
For example, I was able to take a picture of my dog and ask to see it as a plushie, an enamel pin, and a sticker, and then get the results below.
How to Whisk
Whisk is available on Google Labs, though only in the U.S. for now. Once you’re in, the interface is refreshingly simple. You’ve got three slots, and for each one you can upload an image, write a prompt that Google will expand on, or ask for a random image from Google's library. First, you pick the subject or subjects for the image; it isn't limited to one, and a subject can be a person, animal, or object. Then, you choose the scene: the backdrop or location you want. Finally, you select the style, which can be literally any form of art or, as with the plushie, even a crafted object.
Each image has a text description written by Gemini that you can edit if you think it got something wrong. Or, if it's a generated image, you can play around with the description to get something else. You can then add more details for the final image, such as having my dog balance on a ball while wearing a funny hat.
With those in place, Whisk generates two images that don’t just combine your inputs but interpret them. This isn’t Photoshop layering; it’s full-on AI remix culture.
Whisk is at its best when you lean into the unexpected and fun. Whisk thrives on experimentation, which means half the fun is watching how it interprets your wildly mismatched inputs. Sometimes, it gets it right; sometimes, you’re left with something gloriously weird. Either way, it’s a win.
For example, the first image below started with a picture of a pocket watch, a library, and a gothic painting. The second used a photo of a punk rocker, an old photo of a New York City alley, and a written description of classic comic book art. The third took a photo of a bear in the wild, a photo of an old diner, and an illustration from a children's book. The results speak for themselves.
Whisked Away
While Whisk is intuitive, a few tricks can help you get the most out of it. Using high-quality images greatly helps, especially if you want to get the subject close to the original character or object. The AI does its best work when it knows what it’s looking at.
Also, think outside the box. You never know what these combinations will lead to. And if the result isn't what you want, it's easy to swap in new photos of whoever or whatever you want the AI to play with. Lastly, you can always tweak the underlying captions and inputs for more fine-tuned results.
Not needing meticulously written prompts will likely make Whisk far more attractive to the average person. That said, it will probably face more pushback from creators whose work was used to train the AI models behind it.
Still, if you struggle to put your creative vision into words, an AI image creator that focuses on visuals instead of vocabulary might be your new favorite toy, even if it's just to see what you would look like as a plushie of yourself.
Eric Hal Schwartz is a freelance writer for TechRadar with more than 15 years of experience covering the intersection of the world and technology. For the last five years, he served as head writer for Voicebot.ai and was on the leading edge of reporting on generative AI and large language models. He's since become an expert on the products of generative AI models, such as OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini, and every other synthetic media tool. His experience runs the gamut of media, including print, digital, broadcast, and live events. Now, he's continuing to tell the stories people want and need to hear about the rapidly evolving AI space and its impact on their lives. Eric is based in New York City.