Meta's new AI model tags and tracks every object in your videos

Meta AI SAM 2
(Image credit: Meta)

Meta has a new AI model that can label and follow any object in a video as it moves around. The Segment Anything Model 2 (SAM 2) extends the capabilities of its predecessor, SAM, which was limited to images, opening up new opportunities for video editing and analysis. 

SAM 2’s real-time segmentation is a potentially huge technical leap. It showcases how AI can process moving images and distinguish among the elements on screen even as they move around or out of the frame and back in again. 

Segmentation is the term for how software determines which pixels in an image belong to which objects. An AI assistant that can do so makes it a lot easier to process or edit complicated images. That was the breakthrough of Meta’s original SAM. SAM has helped segment sonar images of coral reefs, parsed satellite images to aid disaster relief efforts, and even analyzed cellular images to detect skin cancer. 
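
To make the idea of pixels "belonging" to objects concrete, here is a minimal sketch in plain Python. It is not Meta's code or SAM's API; the tiny grid, the object IDs, and the helper names `pixels_of` and `isolate` are all invented for illustration. It assumes a segmentation model has already assigned an object ID to every pixel.

```python
# Hypothetical 4x6 "image": each cell holds the object ID a segmentation
# model assigned to that pixel. The IDs and labels are illustrative only.
LABELS = {0: "background", 1: "dog", 2: "ball"}

mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
]

def pixels_of(mask, object_id):
    """Return the (row, col) coordinates of every pixel labeled object_id."""
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value == object_id
    ]

def isolate(mask, object_id):
    """Return a binary mask: 1 where the pixel belongs to object_id, else 0."""
    return [[1 if value == object_id else 0 for value in row] for row in mask]

dog_pixels = pixels_of(mask, 1)   # coordinates covered by the "dog"
ball_only = isolate(mask, 2)      # mask that keeps only the "ball"
```

An editing tool could use a binary mask like `ball_only` to cut out, recolor, or blur a single object while leaving the rest of the image untouched.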

SAM 2 extends segmentation to video, which is no small feat and would not have been feasible until very recently. As part of SAM 2’s debut, Meta shared a dataset of 50,000 videos created to train the model. That’s on top of the 100,000 other videos Meta said it used. Along with all that training data, real-time video segmentation takes a significant amount of computing power, so while SAM 2 is open and free at the moment, it likely won’t stay that way forever.

Meta SAM 2

(Image credit: Meta)

Segment Success

Using SAM 2, video editors could isolate and manipulate objects within a scene more easily than current editing software allows, and far faster than adjusting each frame by hand. Meta envisions SAM 2 revolutionizing interactive video, too. Users could select and manipulate objects within live videos or virtual spaces thanks to the AI model.

Meta thinks SAM 2 could also play a crucial role in the development and training of computer vision systems, particularly in autonomous vehicles. Accurate and efficient object tracking is essential for these systems to interpret and navigate their environments safely. SAM 2’s capabilities could expedite the annotation process of visual data, providing high-quality training data for these AI systems.

A lot of the AI video hype is around generating videos from text prompts. Models like OpenAI’s Sora, Runway, and Google Veo get a lot of attention for a reason. Still, the kind of editing ability provided by SAM 2 might play an even bigger role in embedding AI in video creation. 

And while Meta might have an edge now, other AI video developers are keen on producing their own versions. For instance, Google’s recent research has led to video summarization and object recognition features that it is testing on YouTube. Adobe’s Firefly AI tools also focus on photo and video editing and include content-aware fill and auto-reframe features.

Eric Hal Schwartz
Contributor

Eric Hal Schwartz is a freelance writer for TechRadar with more than 15 years of experience covering the intersection of the world and technology. For the last five years, he served as head writer for Voicebot.ai and was on the leading edge of reporting on generative AI and large language models. He's since become an expert on the products of generative AI models, such as OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini, and every other synthetic media tool. His experience runs the gamut of media, including print, digital, broadcast, and live events. Now, he's continuing to tell the stories people want and need to hear about the rapidly evolving AI space and its impact on their lives. Eric is based in New York City.
