As you develop applications, are you doing everything you can to protect your data, code, and intellectual property? If you’re not running a secret scanner as part of your CI/CD pipeline, then the answer is no. You may be employing an excellent security stack, but without the crucial piece of secret scanning, something will eventually leak.
It may be an API key that falls into the wrong hands, a set of credentials, encryption keys, or even a URL that is protected only by obfuscation. Secrets will leak, and the smallest secret can escalate to a full-blown data breach. But where exactly do these secrets in code like to hide?
All secrets are important, even the ones that seem arbitrary. Even the smallest secret leaking can allow attackers to slowly escalate their access to the system, finding deeper and deeper secrets as they go.
In 2021, a team of ethical hackers researched the security of the Indian government. When they found a .env file in one of the government’s git repositories, they escalated their access using commonly found tools until they had full access.
Luckily, this was done by an ethical group, but Stack Overflow was not so lucky. In April 2019, a hacker managed to find a single secret exposed in Stack Overflow’s network, a secret that later allowed them to gain access to the entire codebase. Stack Overflow managed to retain its credibility as a platform through good communication and accepting responsibility, but the damage was done.
Humans are not good at scanning code and, through no fault of their own, will eventually let one secret or another slip through. Only a secret scanning solution can prevent your data from being leaked.
Configuration files are the category of files most prone to store insecure secrets due to the ease of doing so and the illusion of security. Ideally, all secrets should be stored as environment variables on the machine that requires them.
Developers often take a shortcut and place secrets in configuration files, feeling that this is more secure because it is a step away from hardcoding the secrets in the code. But configuration files, especially ones uploaded to cloud repositories, are far from secure.
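As a rough illustration of the environment-variable approach, here is a minimal Python sketch that reads a secret from the process environment and fails loudly when it is missing. The helper name require_env, the MissingSecretError class, and any variable names passed to it are hypothetical and chosen only for this example.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or fail loudly.

    Failing at startup is preferable to silently falling back to a
    default value that someone hardcoded "just for testing".
    """
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"environment variable {name} is not set")
    return value
```

An application would call something like require_env("PAYMENT_API_KEY") once at startup (the variable name is an assumption), so the secret never has to appear in the code or in a committed configuration file.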
Unit 42 scanned more than 24,000 files for secrets and found that 17% of secrets reside in configuration files.
Django is a high-level Python-based web development framework aimed at making web applications quickly, easily, and without sacrificing quality.
The ease with which developers get a platform up and running may cause them to overlook safe practices. This explains why Django configuration files were the most prominent in the data, more than doubling the second in line.
An environment configuration file, usually a .env file, is used to store environment variables until they are loaded into the machine’s environment.
Since environment variables are a great place to store secrets, developers often make the mistake of placing secrets into a .env file, which then might be exposed.
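To make the risk concrete, the following is a simplified sketch of how a .env file is typically turned into environment variables. It is not the implementation of any particular library; the function names are hypothetical, and real loaders handle more syntax (multi-line values, escapes, interpolation). The point is that the file is plain text, so anyone who can read it can read every secret in it.

```python
def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Tolerate the shell-style "export KEY=value" form.
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key] = value
    return values


def load_into_environ(values: dict, environ) -> None:
    """Copy parsed values into `environ` without overriding existing keys."""
    for key, value in values.items():
        environ.setdefault(key, value)
```

In practice the second argument would be os.environ. Values that are already set in the real environment take precedence in this sketch, which is a design choice rather than a universal rule.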
PHP is a very popular language for web development, especially with WordPress being run on it. The language has many different configuration files that control different settings of a PHP server.
It is hard to say exactly where the secrets are being stored and leaked from. I would venture that those secrets stored in PHP configuration files can be credited to its popularity. PHP configuration is secure when it is solely server-side, so if you do not push it to the cloud, it should remain secret.
Shell configuration files are used to initialize a shell command line, and they are often used to initialize variables.
It is no surprise that those variables are often used by developers to store secrets to be easily accessed from any shell commands. This is a secure method of storing those secrets as long as you don’t upload them to the cloud.
The Ruby On Rails database configuration file is specifically designed to hold all the information required to access the database. This access information is a secret and you must keep it private. Accidentally uploading this file to the cloud will expose all your data.
NPM configuration is not a secure location for secrets, but NPM does support a secure encrypted configuration file. You should use secret-config when putting any secrets in your configuration files.
Profile configuration files are similar to general shell configuration files, but they do not run on every initialization, only when you log in. For security purposes, you should treat them in the same manner and not upload them to the cloud.
Shell command aliases are command shorthands that can be run via the shell command line. While it is easy to see how this shortcut can be desirable for quick logins, it is a poor location for secrets. Never use a shell command alias in production, and even in testing, I wouldn’t say it is advisable.
The global git configuration file located in users/[username]/.gitconfig records the credential.helper setting. When credential.helper store is used, git saves credentials unencrypted on disk, by default in a .git-credentials file in the same home folder.
This is an overall safe location for credentials as long as they remain privately on your local machine. Since this file is located in a very personal folder it should not ever find its way to the cloud, but here we are with 113 of them.
SSH is an encrypted network protocol, and if someone is using encryption, it would be safe to assume they are at least somewhat security-aware. However, the fact that SSH configuration files find their way to the cloud says otherwise. SSH configuration files must be kept private.
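The file categories above can be turned into a very rough first-pass check that flags risky file names before they are pushed. The sketch below is illustrative only: the specific file names in RISKY_NAMES are common defaults assumed for this example (the data discussed above does not list exact names), and a name-based check says nothing about whether a file actually contains a secret.

```python
from pathlib import PurePosixPath

# Assumed, illustrative file names for the categories discussed above.
RISKY_NAMES = {
    ".env": "environment configuration",
    "settings.py": "Django configuration",
    "wp-config.php": "PHP configuration",
    ".bashrc": "shell configuration",
    "database.yml": "Rails database configuration",
    ".npmrc": "NPM configuration",
    ".bash_profile": "shell profile configuration",
    ".bash_aliases": "shell command aliases",
    ".gitconfig": "git configuration",
}


def flag_risky_paths(paths):
    """Return (path, category) pairs for paths whose file name looks risky."""
    findings = []
    for path in paths:
        p = PurePosixPath(path)
        category = RISKY_NAMES.get(p.name)
        # SSH client configuration is identified by its folder, not its name.
        if category is None and p.name == "config" and p.parent.name == ".ssh":
            category = "SSH configuration"
        if category is not None:
            findings.append((path, category))
    return findings
```

Feeding it the list of files staged for a commit gives a quick warning list, but it is no substitute for scanning file contents.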
Just as misconfiguration and hardcoded secrets stem from human error, so do secrets that slip through code review. The only reliable way to prevent secrets from leaking is to implement an automated approach.
You may be tempted to build your own in-house secret scanner or use an open-source solution. However, those solutions will quickly prove to be more of a bother than a boon. Even without considering false negatives (missing secrets completely), the volume of false positives would be disruptive to any effective workflow.
Secrets are often high-entropy strings, but not all high-entropy strings are secrets. So without a robust detection system, you’d need a human to sift through the alerts. And in the face of too many false positives, humans become lazy and start missing real secrets, bringing us back to human error.
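To show what an entropy check looks like, and why it is noisy on its own, here is a minimal Shannon entropy sketch in Python. The threshold and minimum length are arbitrary assumptions chosen for illustration, not values used by any particular scanner.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Return the Shannon entropy of `s` in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_high_entropy(token: str, threshold: float = 4.0,
                       min_length: int = 20) -> bool:
    """Flag long tokens whose per-character entropy meets the threshold.

    Both defaults are assumptions for this sketch and would need tuning.
    """
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

A check like this flags random-looking API keys, but it also flags commit hashes, UUIDs, and minified assets, which is exactly the false-positive problem described above.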
SpectralOps is easy to set up and plug into your existing CI/CD pipeline. It supports and seamlessly integrates with all the top CI/CD platforms such as Jenkins and Azure. Once connected, SpectralOps will scan your code and configuration file commits for any leaked secrets. This is done using a high volume of specialized detectors combined with machine learning models.
While SpectralOps is highly autonomous, it gives your organization control over detectors, allowing you to add your own and enhance detection. Alerts are fired in near real-time and are customizable to make sure your team makes the best use of them.