Nvidia's text-to-video tech could take your GIF game to the next level

Two laptops showing gifs generated by Nvidia's text-to-video models
(Image credit: Nvidia)

Now that ChatGPT and Midjourney are pretty much mainstream, the next big AI race is text-to-video generators – and Nvidia has just shown off some impressive demos of the tech that could soon take your GIFs to a new level.

A new research paper and micro-site from Nvidia's Toronto AI Lab, called "High-Resolution Video Synthesis with Latent Diffusion Models", gives us a taste of the incredible video creation tools that are about to join the ever-growing list of the best AI art generators.

Latent Diffusion Models (or LDMs) are a type of AI that can generate videos without needing massive computing power. Nvidia says its tech does this by building on the work of text-to-image generators, in this case Stable Diffusion, and adding a "temporal dimension to the latent space diffusion model".

A gif of a stormtrooper vacuuming on a beach

(Image credit: Nvidia)

In other words, its generative AI can make still images move in a realistic way and upscale them using super-resolution techniques. This means it can produce short, 4.7-second-long videos with a resolution of 1280x2048, or longer ones at the lower resolution of 512x1024 for driving videos.

Our immediate thought on seeing the early demos (like the ones above and below) is how much this could boost our GIF game. Okay, there are bigger ramifications, like the democratization of video creation and the prospect of automated film adaptations, but at this stage text-to-GIF seems to be the most exciting use case.

A teddy bear playing the electric guitar

(Image credit: Nvidia)

Simple prompts like 'a storm trooper vacuuming on the beach' and 'a teddy bear is playing the electric guitar, high definition, 4K' produce some pretty usable results, even if there are naturally artifacts and morphing with some of the creations.

Right now, that makes text-to-video tech like Nvidia's new demos most suitable for thumbnails and GIFs. But, given the rapid improvements seen in Nvidia's AI generation for longer scenes, we probably won't have to wait long for longer text-to-video clips in stock libraries and beyond.


Analysis: The next frontier for generative AI 

The sun peeking through the window of a New York City loft

(Image credit: Runway)

Nvidia isn't the first company to show off an AI text-to-video generator. We recently saw Google Phenaki make its debut, revealing its potential for 20-second clips based on longer prompts. Its demos also show a clip, albeit a ropier one, that's over two minutes long.

The startup Runway, which helped create the text-to-image generator Stable Diffusion, also revealed its Gen-2 AI video model last month. Alongside responding to prompts like 'the late afternoon sun peeking through the window of a New York City loft' (the result of which is above), it lets you provide a still image to base the generated video on and lets you request styles to be applied to its videos, too.

The latter was also a theme of the recent demos for Adobe Firefly, which showed how much easier AI is going to make video editing. In programs like Adobe Premiere Rush, you'll soon be able to type in the time of day or season you want to see in your video and Adobe's AI will do the rest.

The recent demos from Nvidia, Google, and Runway show that full text-to-video generation is in a slightly more nebulous state, often creating weird, dreamy or warped results. But, for now, that'll do nicely for our GIF game – and rapid improvements that'll make the tech suitable for longer videos (as demos from the more recent OpenAI Sora show) are now just around the corner.

Mark Wilson
Senior news editor

Mark is TechRadar's Senior news editor. Having worked in tech journalism for a ludicrous 17 years, Mark is now attempting to break the world record for the number of camera bags hoarded by one person. He was previously Cameras Editor at both TechRadar and Trusted Reviews, Acting editor on Stuff.tv, as well as Features editor and Reviews editor on Stuff magazine. As a freelancer, he's contributed to titles including The Sunday Times, FourFourTwo and Arena. And in a former life, he also won The Daily Telegraph's Young Sportswriter of the Year. But that was before he discovered the strange joys of getting up at 4am for a photo shoot in London's Square Mile. 
