Researchers find a way to make photos and muted videos ‘speak’ – here’s what it could mean for your privacy

Capturing audio from a still image may feel like something out of a sci-fi novel, but one scientist has actually devised a way to do it, with the helping hand of AI.

Using a machine learning tool they created called Side Eye, a team led by Kevin Fu, professor of electrical and computer engineering and computer science at Northeastern University, can read an extraordinary amount of information out of images.

By applying Side Eye to a still image, they can determine the gender of a speaker in the room where the photo was taken and the words that person spoke, according to TechXplore. They can also apply the tool to muted videos.

An AI-powered privacy nightmare?

"Imagine someone is doing a TikTok video and they mute it and dub music," Fu told the publication. "Have you ever been curious about what they're really saying? Was it 'Watermelon watermelon' or 'Here's my password?' Was somebody speaking behind them? You can actually pick up what is being spoken off camera."

The machine learning-powered Side Eye exploits the image stabilization technology that’s used in almost all smartphone cameras.

Smartphone cameras use springs to suspend the lens in liquid, so that photos don’t come out blurry or out of focus because of somebody’s shaky grip. Sensors and an electromagnet work together to push the lens in the opposite direction to any shake, stabilizing the image.

When somebody speaks near the camera lens while a photo is being taken, the sound creates tiny vibrations in the springs, which bend the light in a subtle way. Extracting sound frequencies from these vibrations would ordinarily be near-impossible, but it becomes far easier thanks to the rolling shutter method of photography most cameras use.

"The way cameras work today to reduce cost basically is they don't scan all pixels of an image simultaneously – they do it one row at a time," Fu added. "[That happens] hundreds of thousands of times in a single photo. What this basically means is you're able to amplify by over a thousand times how much frequency information you can get, basically the granularity of the audio."

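To make the rolling shutter point concrete, here is a minimal arithmetic sketch. It is not taken from the Side Eye research: the frame rate and row count are assumed values chosen for illustration, and the function names are hypothetical. It simply treats each sensor row as its own sample in time, which is the idea Fu describes, and ignores real-world details such as readout gaps between frames.

```python
# Illustrative arithmetic only. The frame rate and row count below are
# assumptions, not figures reported for Side Eye.

def effective_sample_rate(frames_per_second: float, rows_per_frame: int) -> float:
    """Samples per second if every sensor row counts as its own time sample."""
    return frames_per_second * rows_per_frame


def nyquist_limit(sample_rate: float) -> float:
    """Highest frequency a given sample rate can represent without aliasing."""
    return sample_rate / 2


fps = 30      # assumed video frame rate
rows = 1080   # assumed number of sensor rows read out per frame

per_frame_rate = effective_sample_rate(fps, 1)    # one sample per whole frame
per_row_rate = effective_sample_rate(fps, rows)   # one sample per row

print(f"Per-frame sampling: {per_frame_rate:.0f} Hz (Nyquist {nyquist_limit(per_frame_rate):.0f} Hz)")
print(f"Per-row sampling:   {per_row_rate:.0f} Hz (Nyquist {nyquist_limit(per_row_rate):.0f} Hz)")
print(f"Gain in time resolution: {per_row_rate / per_frame_rate:.0f}x")
```

Under these assumed numbers, sampling once per frame could only capture vibrations below 15 Hz, far lower than human speech, while sampling once per row raises the ceiling by a factor equal to the number of rows. That is roughly the scale of the "over a thousand times" gain mentioned in the quote.
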
Side Eye is currently in a very basic form and needs far more training data to refine it. Should a more advanced version of the system fall into the wrong hands, however, it could pose a cybersecurity nightmare for many.

But there are positive implications for the technology too, especially if a far more advanced form of Side Eye were used as a source of digital evidence by those investigating crime.

Keumars Afifi-Sabet
Channel Editor (Technology), Live Science

Keumars Afifi-Sabet is the Technology Editor for Live Science. He has written for a variety of publications including ITPro, The Week Digital and ComputerActive. He has worked as a technology journalist for more than five years, having previously held the role of features editor with ITPro. In his previous role, he oversaw the commissioning and publishing of long form in areas including AI, cyber security, cloud computing and digital transformation.
