Google Gemini will soon let you edit those AI-generated images to fix the 3-eyed dogs and impossible buildings
Fine-tuning will help Gemini images climb out of the uncanny valley
Artificial intelligence can produce impressive images, but it isn't uncommon for those images to have weird problems, such as people with too many teeth or cityscapes with Escher-style street layouts. Google is working on upgrading Gemini's AI image creation feature to fix those sorts of problems, as first spotted in unfinished code by Android Authority. A fine-tuning capability appears to be on its way that will let users make detailed edits to their AI-generated images.
Right now, Google Gemini's text-to-image tools can't edit an image once it has been created. Instead, users have to submit a new prompt and hope it fixes any problems and produces something that matches what they want to see. That can be especially tedious if there's only a small but still distracting error. According to the uncovered code, Gemini's fine-tuning feature will address the need for limited changes with two editing methods.
The first option will let users submit a prompt about an AI-generated image and ask for a change to one aspect. For instance, if you liked an image of a robot and a bird but wanted it set in a city, you could keep both subjects and change the background by asking Gemini to move them. The second method described in the code is more interactive. Users could circle the part of the image they want to change using a finger or a stylus. Once the area is selected, they can describe the desired changes, and Gemini will understand that the instructions pertain only to the circled section.
AI Editing Success
These editing tools could particularly benefit those in fields such as graphic design, marketing, and social media, where visual accuracy and quick turnaround times are crucial. With them, Google Gemini could better serve artists, designers, and casual users who want to create polished visual content more efficiently. While the exact release date of these features remains uncertain, their appearance in the code suggests they won't be long in coming. They also pair well with related features like the upcoming Ask Photos image search feature.
Google won't be the first to add editing tools to an AI image maker. These methods are largely the same as those available with OpenAI's DALL-E family of AI image-making models. In ChatGPT, users can ask for adjustments to an already produced image, or they can highlight parts of it and submit a new text prompt adjusting that part of the picture. Many other AI image creators, such as Ideogram.ai and Adobe Firefly, offer similar features. Still, Google's plan to incorporate these fine-tuning tools is a technical jump for Gemini. It marks Google's ongoing push to match and surpass its rivals at OpenAI, Meta, and elsewhere when it comes to generative AI tools.
Eric Hal Schwartz is a freelance writer for TechRadar with more than 15 years of experience covering the intersection of the world and technology. For the last five years, he served as head writer for Voicebot.ai and was on the leading edge of reporting on generative AI and large language models. He's since become an expert on the products of generative AI models, such as OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini, and every other synthetic media tool. His experience runs the gamut of media, including print, digital, broadcast, and live events. Now, he's continuing to tell the stories people want and need to hear about the rapidly evolving AI space and its impact on their lives. Eric is based in New York City.