How to Remove Sensitive Data from Your GitHub Repository Safely and Completely

Accidentally committing an API key, password, or credential to a Git repository happens more often than we’d like to admit. Unfortunately, simply deleting the secret in a later commit doesn't remove it. Git keeps the full history. If that secret ends up on GitHub, it could be cached in pull requests, forks, or cloned copies.
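To see why a follow-up commit is not enough, here is a minimal sketch that builds a throwaway repository, commits a made-up secret, deletes it, and then searches the history. The file name, identity, and secret value are placeholders invented for the demo.

```shell
#!/usr/bin/env bash
# Demo in a scratch repository: a secret "deleted" in a later commit
# is still reachable through Git history.
set -euo pipefail

demo_dir="$(mktemp -d)"                     # scratch directory, safe to delete afterwards
cd "$demo_dir"
git init --quiet .
git config user.email "demo@example.com"    # placeholder identity for the demo
git config user.name "Demo User"

echo 'API_KEY=hypothetical-secret-123' > config.env
git add config.env
git commit --quiet -m "Add config"

git rm --quiet config.env
git commit --quiet -m "Remove config"

# The file is gone from the working tree...
[ ! -e config.env ] && echo "config.env no longer exists in the working tree"

# ...but the pickaxe search (-S) still lists the commits that added or removed the string.
leaked_commits="$(git log --all --oneline -S'hypothetical-secret-123')"
echo "$leaked_commits"
```

Anyone who can read those commits can recover the value with git show, which is why the rest of this post rewrites history instead of adding another commit.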

In this post, we’ll walk through how to fully remove sensitive data from a GitHub repository, using git-filter-repo, coordinating with collaborators, and implementing practices to prevent future incidents.

Step 1: Revoke or Rotate the Leaked Secret

Before you touch the repository’s history, immediately revoke or rotate the compromised secret. This minimizes the risk in case the secret has already been accessed.

Example actions:

  • Revoke the GitHub personal access token
  • Rotate AWS access keys
  • Change any exposed database credentials
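As one illustration of the rotation step, the sketch below wraps the AWS CLI's IAM access-key commands in a small function. The user name and key ID are placeholders, and the run helper with its DRY_RUN switch is an assumption added here so the plan can be previewed before anything is changed. Other providers have their own revocation flows.

```shell
#!/usr/bin/env bash
# Sketch: rotate a leaked AWS IAM access key.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'would run: %s\n' "$*"
  else
    "$@"
  fi
}

rotate_aws_key() {
  local user_name="$1" old_key_id="$2"
  # 1. Issue a replacement key (store its secret in a secret manager, not in Git).
  run aws iam create-access-key --user-name "$user_name"
  # 2. Disable the leaked key so it can no longer authenticate.
  run aws iam update-access-key --user-name "$user_name" \
      --access-key-id "$old_key_id" --status Inactive
  # 3. Delete it once nothing depends on it any more.
  run aws iam delete-access-key --user-name "$user_name" \
      --access-key-id "$old_key_id"
}
```

For example, rotate_aws_key deploy-bot AKIAOLDKEYEXAMPLE prints the three commands. Setting DRY_RUN=0 would execute them, assuming the AWS CLI is installed and authenticated.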

Step 2: Understand the Risks of Rewriting History

Rewriting Git history has significant consequences:

  • Changed commit hashes that can break CI/CD pipelines or linked PRs
  • Lost GPG commit/tag signatures
  • Invalidated or missing PR diffs
  • The possibility of reintroducing secrets via old local clones
  • Forks still containing the sensitive data
  • The need to force-push and temporarily disable branch protection

Coordinate with all collaborators: each one must re-clone or clean their local repository to avoid reintroducing the sensitive history.

Step 3: Use git-filter-repo to Remove Sensitive Files or Content

GitHub recommends git-filter-repo as the modern replacement for filter-branch and BFG Repo-Cleaner.

Install git-filter-repo

brew install git-filter-repo  # macOS
# or follow the installation instructions at https://github.com/newren/git-filter-repo

Clone the repository

git clone https://github.com/YOUR-ORG/YOUR-REPO.git
cd YOUR-REPO

Delete a specific file from all history

git filter-repo --sensitive-data-removal --invert-paths --path path/to/secret.txt        

If the file was renamed or moved at some point, repeat this command for each historical path.
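One way to collect those historical paths is to let Git follow the file across renames. This is a rough sketch: the function name is made up, it assumes you pass the most recent path the file had, and rename detection is heuristic, so treat the output as a starting point.

```shell
#!/usr/bin/env bash
# Sketch: list every path a file has had, following renames backwards in history.

historical_paths() {
  local latest_path="$1"
  # --follow tracks a single file across renames;
  # --name-only with an empty format prints just the path seen in each commit.
  git log --follow --name-only --pretty=format: -- "$latest_path" \
    | sed '/^$/d' \
    | sort -u
}
```

Running historical_paths path/to/secret.txt inside the clone prints one path per line, and each of them can be passed as its own --path argument. git filter-repo --analyze also writes a renames report under .git/filter-repo/analysis/ that can be cross-checked against this list.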

Replace specific sensitive strings

Create a file outside the repository's working tree (e.g., ../passwords.txt) containing one replacement rule per line:

my-secret-password==>REMOVED
hardcoded-api-key==>REMOVED        
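The rules file can be generated from a script so it never lands inside the working tree. The sketch below writes the two rules from above plus one regex: rule. The function name, the target path, and the AWS-style pattern are illustrative assumptions. A line without ==> falls back to git-filter-repo's default replacement text.

```shell
#!/usr/bin/env bash
# Sketch: write replacement rules to a file OUTSIDE the repository.

write_rules_file() {
  local rules_file="$1"
  # Quoted 'EOF' keeps the shell from expanding anything inside the rules.
  cat > "$rules_file" <<'EOF'
my-secret-password==>REMOVED
hardcoded-api-key==>REMOVED
regex:AKIA[0-9A-Z]{16}==>REMOVED_AWS_KEY_ID
EOF
  chmod 600 "$rules_file"   # readable only by you; delete the file after the rewrite
}
```

Calling write_rules_file ../passwords.txt from the repository root produces the file used by the command that follows.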

Then run:

git filter-repo --sensitive-data-removal --replace-text ../passwords.txt        
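After the rewrite it is worth confirming that the string is really gone before anything is pushed. The following sketch greps the contents of every commit reachable from any ref. The function name is invented here, and the loop can be slow on large repositories.

```shell
#!/usr/bin/env bash
# Sketch: report every commit:path whose contents still include a literal string.
# Empty output means no reachable commit contains it.

find_secret_in_history() {
  local needle="$1" rev
  git rev-list --all | while read -r rev; do
    # -F: treat the needle as a fixed string; -l: print "<commit>:<path>" only.
    # git grep exits non-zero when nothing matches, which is not an error here.
    git grep -F -l -e "$needle" "$rev" || true
  done
}
```

For instance, find_secret_in_history 'my-secret-password' should print nothing once the rewrite has succeeded.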

Step 4: Verify and Force Push to GitHub

Check which pull requests will be affected:

grep -c '^refs/pull/.*/head$' .git/filter-repo/changed-refs        
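If you also want the pull request numbers themselves, not just the count, the same file can be parsed with sed. This is a small sketch that assumes changed-refs lists one ref name per line, as the grep above implies. The function name is made up.

```shell
#!/usr/bin/env bash
# Sketch: print the numbers of pull requests whose head ref was rewritten.

affected_pr_numbers() {
  local refs_file="${1:-.git/filter-repo/changed-refs}"
  # Keep only refs/pull/<number>/head lines and print the captured <number>.
  sed -n 's#^refs/pull/\([0-9][0-9]*\)/head$#\1#p' "$refs_file" | sort -n
}
```

Running affected_pr_numbers with no argument reads the default file, and the resulting list can be pasted into the support ticket described in the next step.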

Push your cleaned repository:

git push --force --mirror origin        

You may need to disable branch protection rules temporarily for this push to succeed. GitHub will reject updates to refs under refs/pull/, because those refs are read-only; they are handled in the next step.

Step 5: Fully Remove the Data from GitHub Servers

GitHub may retain cached references to the sensitive data in closed pull requests or forks.

Open a GitHub Support ticket and include:

  • Your repository name
  • The number of affected pull requests (from the previous step)
  • The "First Changed Commit" output from git-filter-repo
  • Mention any orphaned LFS objects, if applicable

GitHub Support can:

  • Remove cached PR views
  • Dereference old pull requests
  • Run garbage collection
  • Purge LFS objects (if needed)

Note: GitHub will only help if the data is sensitive and cannot be safely mitigated by rotating credentials.

Step 6: Coordinate with Collaborators

Every developer with a clone of the repository before the cleanup must either:

  • Re-clone the repository OR
  • Follow the cleanup guide in the git-filter-repo manual

If they don’t, they risk reintroducing the sensitive data into the cleaned history.
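A collaborator can check whether a local clone still holds the old history by testing for a commit that existed only before the rewrite. The sketch below assumes OLD_SHA is such a commit ID, shared by whoever ran the cleanup. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: does this clone still store a given pre-rewrite commit?

clone_has_commit() {
  local sha="$1"
  # -e: report via exit status only; ^{commit} requires the object to be a commit.
  git cat-file -e "${sha}^{commit}" 2>/dev/null
}

# Usage inside an existing clone (OLD_SHA is a placeholder):
#   if clone_has_commit "$OLD_SHA"; then
#     echo "This clone still contains pre-rewrite history; re-clone before pushing."
#     git branch --all --contains "$OLD_SHA"   # branches that would reintroduce it
#   fi
```

A positive result only shows that the object is stored locally. It may be unreachable from any branch, but re-cloning remains the simplest way to be sure.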

Step 7: Prevent Future Leaks

Accidents happen, but you can reduce the risk:

  • Add secrets, environment files, and local config files to .gitignore
  • Use environment variables or a secret manager
  • Avoid hardcoding secrets in source code
  • Use secret-scanning tools, and set up pre-commit hooks to scan for secrets
  • Review staged changes with git diff --cached before committing
  • Enable GitHub Push Protection and Secret Scanning
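As a rough idea of what a pre-commit scan could look like without extra tooling, the sketch below checks the added lines of the staged diff against a few patterns. The patterns and function names are illustrative assumptions. They will miss many secrets and flag some harmless lines, so a dedicated scanner is the better long-term choice.

```shell
#!/usr/bin/env bash
# Sketch of a pre-commit check: refuse the commit if staged additions look like secrets.

# Illustrative patterns only: AWS-style key IDs, PEM private keys, obvious assignments.
secret_patterns='AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----|(password|api[_-]?key|secret)[[:space:]]*[:=]'

diff_has_secrets() {
  # Read a unified diff on stdin; inspect added lines ("+...") but not "+++ b/file" headers.
  grep -E '^\+' | grep -v '^+++ ' | grep -E -i -q -e "$secret_patterns"
}

pre_commit_check() {
  if git diff --cached -U0 | diff_has_secrets; then
    echo "Possible secret in staged changes; commit aborted." >&2
    return 1
  fi
}

# When installed as .git/hooks/pre-commit (and made executable), end the file with:
#   pre_commit_check
```

Git aborts the commit whenever the hook exits with a non-zero status, and git commit --no-verify skips it, so this complements server-side scanning and does not replace it.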

Summary

Leaking secrets into a Git repository isn’t the end of the world, but it does require swift and careful action. Here’s the process in short:

  1. Revoke the leaked secret immediately.
  2. Use git-filter-repo to rewrite history and remove the data.
  3. Force-push to GitHub, temporarily disabling branch protections if needed.
  4. Work with GitHub Support to purge cached references.
  5. Instruct collaborators to re-clone or clean their local repositories.
  6. Put processes in place to prevent future incidents.

Author: Renato Nascimento