Data spills and shadow data: How exposed are you to the risks?


Fujitsu made the news earlier this year after accidentally exposing AWS keys, plaintext passwords, and client data. The incident caused serious concern when it emerged that the exposed data had been sitting on the open internet for almost 12 months.

This type of unintentional lapse in data control is more common than you might think. Larger organizations in particular tend to have more shadow data than they would like to admit, and often find it harder to keep tabs on all the places their data is stored - be that a stray code repository or some cloud storage being used to shuttle data between non-corporate systems.

Enterprises falling foul of these data spills can face serious consequences, with the immediate fallout often including reputational damage and loss of customer trust, as well as difficult questions from investors, stakeholders, and journalists.

Beyond that fallout, the security impact of these exposures is just as worrying: they can hand cybercriminals access to highly confidential or proprietary data, which can then be used to commit fraud or disrupt operations.

With the added possibility of lawsuits, regulatory fines, and financial losses, no company can afford to ignore data and cybersecurity. Yet many incident response teams (including ours) are all too familiar with cases like this - and they remind us that seemingly straightforward preventative measures are more complex to apply across larger organizations.

So, what are some common pitfalls, and how can your business avoid them?

Andy Swift

Cyber Security Assurance Technical Director at Six Degrees.

Always know where your data is

This might seem straightforward (and obvious), but all too often we encounter situations where restrictions on where data can and cannot go - and guidelines on what data may be added to publicly accessible sources - are overlooked in project discussions and processes, or in some cases not documented at all. This lack of clarity leaves individuals to decide for themselves what is acceptable, creating avoidable security risk.

Our incident response teams have dealt with several cases where developers have used unauthorized or undocumented public-facing storage and repositories. Even when a repository is known and approved for use within the organization, it’s crucial to clearly define what is allowed to be stored there. Guidelines on what needs to be removed from a codebase before pushing it to a public repo should be well-defined and regularly enforced. Not doing so can create significant unmanaged risk. No matter how strong your cybersecurity defenses are, you can't protect data if you don't know it exists.

Addressing storage-related governance and providing clear project documentation that defines what can and cannot be stored publicly will always help, as will compulsory training for all employees. Layered on top of that, threat intelligence and DLP scanning services can be used to good effect to search for evidence of public exposure, both in public locations and in your own known storage - these are well worth monitoring and reporting on long term.

Code repositories

I've lost track of how many incidents we've dealt with recently where a breach was caused by the public exposure of sensitive data. Several of these breaches occurred due to sensitive information being stored in publicly accessible GitHub repositories or Amazon S3 buckets.

Frequently, these stored resources contained hard-coded secrets such as AWS administrative keys and user credentials. In some cases, they even included static API keys for public services holding sensitive financial information; it's practically like signposting a target for attack and handing the attacker the keys to the door.

Using repositories has long been standard practice for developers, making it essential for enterprises to continuously review their security protocols. It's not just about expecting developers to avoid mistakes - it's about ensuring there are clear rules and automated checks in place to help prevent mistakes as well.

For instance, GitHub offers tools for secret scanning and advanced security features that can help detect and prevent sensitive data from being exposed in your codebase. By implementing these kinds of safeguards, you reduce the risk of sensitive information, such as API keys, being inadvertently published. Without these checks, even the most well-intentioned developers may unintentionally compromise security.
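To make the idea concrete, here is a minimal sketch of the kind of pre-commit-style check such tools perform. The patterns and file handling are illustrative assumptions only - production scanners like GitHub secret scanning ship hundreds of vetted rules - but the principle is the same: match candidate secrets before code leaves the developer's machine.

```python
import re
import sys
from pathlib import Path

# Illustrative patterns only - real scanners maintain far larger,
# carefully tuned rule sets.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic credential assignment": re.compile(
        r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}

def scan_text(name: str, text: str) -> list[str]:
    """Return a list of findings for one file's contents."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append(f"{name}:{lineno}: possible {label}")
    return findings

def scan_files(paths: list[Path]) -> list[str]:
    findings = []
    for path in paths:
        findings.extend(scan_text(str(path), path.read_text(errors="ignore")))
    return findings

if __name__ == "__main__":
    results = scan_files([Path(p) for p in sys.argv[1:]])
    for finding in results:
        print(finding)
    # A non-zero exit code is what lets a git hook block the commit.
    sys.exit(1 if results else 0)
```

Wired into a pre-commit hook or CI pipeline, a check like this turns "please don't commit secrets" from a guideline into an enforced gate.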

Keep it private

I’m not suggesting that GitHub or other code repositories are bad - far from it (they have saved my sanity far too many times for me to brand them as such!). But staying secure means organizations should endeavor to conceal their code until it’s safe to do otherwise.

With that in mind, privacy should always be the default stance until appropriate checks have been run on everything being stored. And if you intend to publish publicly, run repository checks before you push - and again afterwards, in case anything slipped through.

Make reporting easy

Specialist researchers and bug bounty hunters can provide valuable insights (even if you have not directly employed them), so long as you give them a clearly signposted route to tell you about anything they discover. In Fujitsu’s case, Jelle Ursem, a Netherlands-based security researcher, had tried to contact the company but found he could not easily reach the right person.

Preventing a similar situation is relatively simple: Set up a contact form on your website or use a dedicated monitored inbox to handle discoveries. You could even introduce a bug bounty program to reward ethical hackers and the wider research community for discovering and reporting vulnerabilities to you.
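One widely adopted way to signpost that route is a security.txt file (RFC 9116), served at /.well-known/security.txt on your domain, which tells researchers exactly where to send findings. A minimal example - the contact details and URLs here are placeholders - might look like:

```text
Contact: mailto:security@example.com
Expires: 2026-12-31T23:59:59Z
Policy: https://example.com/security-policy
Preferred-Languages: en
```

The Contact and Expires fields are required by the RFC; the rest are optional but help researchers understand your disclosure terms before they reach out.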

Either way, the key is to have a structured method for working with others to enable the reporting of issues, because let's face it - hearing from a researcher is always preferable to finding out after a breach. By that point it's usually far too late, and the costly damage has already been done.

The reality is that the larger and more complex your organization, the more data it will have. Unfortunately, this makes it more likely that information will be accidentally exposed. But as with many things, planning, education, and transparent policies, as well as the use of appropriate tools for checking against human error, will help minimize the risk.


This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro

Andy Swift is Head of Offensive Security at Six Degrees, a leading secure cloud-led managed service provider that works as a collaborative technology partner to organizations making a digital transition.