How dark data could be your company's downfall

Great businesses are built on data. It's the invisible force that powers innovation, shapes decision-making, and gives companies a competitive edge. From understanding customer needs to optimizing operations, data is the key that unlocks insights into every facet of an organization.

In the past few decades, the workplace has undergone a digital transformation, with knowledge work now existing primarily in bits and bytes rather than on paper. Product designs, strategy documents, and financial analyses all live within digital files spread across numerous repositories and enterprise systems. This shift has enabled companies to access vast volumes of information to accelerate their operations and strengthen their market position.

However, with this data-driven revolution comes a hidden challenge that many organizations are only beginning to grasp. As they look deeper into their corporate data, organizations are uncovering a phenomenon that's as pervasive as it is misunderstood: dark data.

Gartner defines dark data as any information assets that organizations collect, process, and store during regular business activities but generally don't use for other purposes.

Nishant Doshi

Chief Product and Development Officer, Cyberhaven.

What makes dark data so insidious?

Dark data often contains a company's most sensitive intellectual property and confidential information, making it a ticking time bomb for potential security breaches and compliance violations. Unlike actively managed data, dark data lurks in the background, unprotected and often forgotten, yet still accessible to those who know where to look.

The scale of this problem is alarming: according to Gartner, up to 80% of enterprise data is “dark,” representing a vast reservoir of untapped potential and hidden risks.

Let's consider the information from annual performance reviews as an example. While official data is stored in HR software, other sensitive information is stored in various forms and across various systems: informal spreadsheets, email threads, meeting notes, draft reviews, self-assessments, and peer feedback. This scattered, often forgotten data paints a clear picture of the complex and potentially dangerous nature of dark data within organizations.

A single breach exposing this information could lead to legal liabilities and regulatory fines for mishandling personal data, damaged employee trust, potential lawsuits, competitive disadvantage if strategic plans or salary information is leaked, and reputational damage that could impact recruitment and retention.

The unintended consequences of AI

AI is changing how organizations handle dark data, bringing both opportunities and significant risks. Large language models are now capable of sifting through vast troves of unstructured data, turning previously inaccessible information into valuable insights.

These systems can analyze everything from email communications and meeting transcripts to social media posts and customer service logs. They can uncover patterns, trends, and correlations that human analysts might miss, potentially leading to improved decision-making, enhanced operational efficiency, and innovative product development.

However, this newfound ability to access data also exposes organizations to increased security and privacy risks. As AI unearths sensitive information from forgotten corners of the digital ecosystem, it creates new vectors for data breaches and compliance violations. To make matters worse, the data indexed by AI solutions often sits behind permissive internal access controls, so the AI solutions effectively make it widely available. As these systems become more adept at piecing together disparate bits of information, they may reveal insights that were never intended to be discovered or shared, leading to privacy infringements and potential misuse of personal information.

How to combat this growing problem

The key lies in understanding the context of your data: where it came from, who interacted with it, and how it's been used.

For instance, a seemingly innocuous spreadsheet becomes far more critical if we know it was created by the CFO, shared with the board of directors, and frequently accessed before quarterly earnings calls. This context immediately elevates the document's importance and potential sensitivity.

The way to gain this contextual understanding is through data lineage. Data lineage tracks the complete life cycle of data, including its origin, movements, and transformations. It provides a comprehensive view of how data flows through an organization, who interacts with it, and how it's used.
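To make the idea concrete, here is a minimal sketch of what a lineage record might look like in code. The class names, fields, and action labels (LineageEvent, LineageRecord, "created", "shared", "copied") are hypothetical and chosen purely for illustration; real lineage tools capture far richer detail.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class LineageEvent:
    """One step in a file's life cycle (hypothetical schema)."""
    timestamp: datetime
    actor: str      # who interacted with the data
    action: str     # e.g. "created", "shared", "copied"
    location: str   # where the data lived after this step


@dataclass
class LineageRecord:
    """Ordered history of a single piece of data."""
    data_id: str
    events: list[LineageEvent] = field(default_factory=list)

    def record(self, actor: str, action: str, location: str,
               timestamp: datetime) -> None:
        self.events.append(LineageEvent(timestamp, actor, action, location))

    def origin(self) -> LineageEvent | None:
        """Earliest known event, i.e. where the data came from."""
        if not self.events:
            return None
        return min(self.events, key=lambda e: e.timestamp)

    def actors(self) -> set[str]:
        """Everyone who has interacted with the data."""
        return {e.actor for e in self.events}

    def locations(self) -> list[str]:
        """Distinct locations in the order the data first reached them."""
        seen: list[str] = []
        for event in sorted(self.events, key=lambda e: e.timestamp):
            if event.location not in seen:
                seen.append(event.location)
        return seen


# Illustrative history for an invented spreadsheet.
sheet = LineageRecord("q3-forecast.xlsx")
sheet.record("cfo", "created", "finance-share", datetime(2024, 7, 1))
sheet.record("cfo", "shared", "board-mailbox", datetime(2024, 7, 3))
sheet.record("analyst", "copied", "personal-drive", datetime(2024, 7, 9))

print(sheet.origin().actor)   # who created it
print(sheet.locations())      # where copies have ended up
```

Even this toy record answers the three questions above: where the data came from, who interacted with it, and where it has travelled since.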

By implementing robust data lineage practices, organizations can understand where their most sensitive data is stored and how it is being accessed and shared. By combining AI-based content inspection with context on how data is being accessed and shared (that is, data lineage), organizations can quickly identify dark data and prevent it from being exfiltrated.
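As a rough sketch of how content inspection and lineage context could be combined, the example below scores a file on both. Everything in it is an assumption made for illustration: the regular expressions are a simple stand-in for AI-based inspection, and the role list, weights, and one-year staleness threshold are arbitrary values, not recommendations.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from datetime import date

# Stand-in for AI-based content inspection: two illustrative patterns.
SENSITIVE_PATTERNS = {
    "salary": re.compile(r"\bsalary\b", re.IGNORECASE),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical roles whose files are assumed to be more sensitive.
HIGH_RISK_ROLES = {"cfo", "ceo", "hr"}


@dataclass
class FileContext:
    """Lineage-style context for one file (hypothetical fields)."""
    creator_role: str
    shared_with: set[str]
    last_accessed: date


def inspect_content(text: str) -> set[str]:
    """Return the labels of all sensitive patterns found in the text."""
    return {label for label, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(text)}


def is_dark(ctx: FileContext, today: date, stale_days: int = 365) -> bool:
    """Treat a file as dark if nobody has accessed it for stale_days."""
    return (today - ctx.last_accessed).days > stale_days


def risk_score(text: str, ctx: FileContext, today: date) -> int:
    """Combine content findings with context; weights are arbitrary."""
    score = 2 * len(inspect_content(text))
    if ctx.creator_role in HIGH_RISK_ROLES:
        score += 2
    if len(ctx.shared_with) > 5:
        score += 1
    if is_dark(ctx, today):
        score += 1
    return score


today = date(2024, 3, 1)
draft_text = "Draft review: salary adjustment for J. Doe, SSN 123-45-6789"
draft_ctx = FileContext("hr", {"manager-a", "manager-b"}, date(2022, 3, 1))

print(risk_score(draft_text, draft_ctx, today))
```

A forgotten draft review written by HR scores high on both content and context, while a recently opened file with no sensitive content scores zero. The point of the sketch is only that neither signal alone is enough: content shows what the file contains, and lineage shows why it matters.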

This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro
