Some words about secret leaks in Git repositories
🇺🇸 – Sunday, March 6th 2022
We all know how easy it is to leak secrets or sensitive data in our Git repositories.
In most cases we just acted too fast, or were not aware that we had added sensitive files or objects to the version control system (VCS): a bad SSH configuration with private or public keys in the VCS tree, API keys hard-coded in variables in the source code, keystore files with credentials in the Gradle files (including alias, key and password of course), IP addresses, sensitive URLs, and so on.
And when we work on public or shared repositories, all that sensitive data spreads outside!
When people notice these leaks, they may apply bad patterns to fix them, for example:
- Make a commit “just to remove the change” (useless, because the Git history still contains the data)
- Make the project private (bad, because users won't be able to get it)
- Delete the repository (useless if there are forks of it)
One tool can be useful here: Gitleaks.
Note that Gitleaks looks both in the file tree of the project and in the Git history. That's one reason why we must not rely on such a “fix commit”: the history keeps traces of what we did and tried to hide.
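To see why a “fix commit” is not enough, here is a small sketch you can run in a throwaway directory. The file name `config.env` and the fake key are made up for the demo, and it only assumes that `git` is installed.

```shell
# Create a throwaway repository (hypothetical demo, nothing here is a real secret).
demo="$(mktemp -d)"
git init -q "$demo"

# Small wrapper so the commits work even without a global Git identity.
demo_git() {
    git -C "$demo" -c user.name="Demo" -c user.email="demo@example.com" "$@"
}

# 1. Commit a fake secret.
echo 'API_KEY=not-a-real-key-123' > "$demo/config.env"
demo_git add config.env
demo_git commit -q -m "Add config"

# 2. "Fix" the leak with a new commit that removes the file.
demo_git rm -q config.env
demo_git commit -q -m "Remove secret"

# The file is gone from the working tree...
[ -e "$demo/config.env" ] || echo "config.env is no longer in the working tree"

# ...but the key is still readable in the history.
demo_git log -p --all | grep 'not-a-real-key-123'
```

The last command still prints the key, because the first commit keeps the full content of the file. That is the data a history-aware scanner will find.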
So, I would like to share three useful and cool things:
- BFG Repo-Cleaner, which is awesome and allows you to get rid of committed files (like keystores or SSH keys...) and keep the history clean
- This page on GitHub about removing sensitive data
- And Gitleaks to look in your repositories for leaks
The command to run Gitleaks is very simple:
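As a rough sketch, assuming Gitleaks v8 is installed and on your PATH, a scan can be wrapped in a tiny helper like the one below. The function name `scan_for_leaks` is mine, and subcommands and flags have changed between Gitleaks releases, so check `gitleaks --help` for your version.

```shell
# Hypothetical helper, assuming the Gitleaks v8 command line.
scan_for_leaks() {
    # First argument: path of the repository to scan (defaults to the current directory).
    # --verbose prints the details of each finding.
    gitleaks detect --source "${1:-.}" --verbose
}

# Usage:
#   scan_for_leaks                 # scan the current directory
#   scan_for_leaks /path/to/repo   # scan another repository
```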
Beware: if you scan big repositories (like a fork or a project with a long history), Gitleaks will take a long time to run.
In addition, the Git configuration value diff.renameLimit may need to be raised on such repositories to allow Gitleaks to work.
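The setting can be changed with `git config`. Here is a minimal sketch that sets it for a single repository only. The helper name and the default of 10000 are illustrative assumptions, not recommended values, so pick a number that fits your repository.

```shell
# Hypothetical helper: set diff.renameLimit in one repository's local configuration.
raise_rename_limit() {
    repo="${1:-.}"        # repository path, defaults to the current directory
    limit="${2:-10000}"   # illustrative value, adjust to your repository
    git -C "$repo" config diff.renameLimit "$limit"
}

# Usage:
#   raise_rename_limit /path/to/repo 5000
#   git -C /path/to/repo config --get diff.renameLimit
```

Writing to the repository's local configuration, rather than using `--global`, keeps the change limited to the project being scanned.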