Talked to a Director of Platform Engineering at an enterprise logistics company last week. Their GitHub Actions bill was $34K/month. I asked what percentage was test execution. He didn't know. So we looked.

72% of their CI minutes were test runs. Not builds. Not linting. Not deploys. Tests. Running on GitHub-hosted runners at $0.008/minute against infrastructure that looked nothing like production.

Here's the rule of thumb I keep seeing validated: if your test step takes longer than your build step, your CI tool is doing someone else's job.

GitHub Actions is excellent at orchestrating builds and deployments. But it was never designed to run 2,000 integration tests across 6 microservices with real database connections, service mesh routing, and network policies.

At Testkube, this is the pattern we see constantly: teams spending 60-70% of their CI budget on test execution that belongs inside the cluster, not in ephemeral runners.

That Director's team moved test execution into Kubernetes. Same tests, same assertions. CI minutes dropped 68%. Tests actually hit real infrastructure. Failures meant something.

Stop using your CI as a test lab. It wasn't built for it.

#Kubernetes #GitHubActions #DevOps #PlatformEngineering #CICD
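What "inside the cluster" can look like in practice, as a minimal sketch rather than Testkube's actual workflow: run the suite as a Kubernetes Job next to the real services, and have CI only trigger the run and collect the result. The namespace, image, and test command below are placeholders, not the team's real setup.

# Hedged sketch: run integration tests as an in-cluster Job instead of on the CI runner.
kubectl apply -n staging -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: integration-tests
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: tests
          image: registry.example.com/integration-tests:latest   # placeholder image
          command: ["pytest", "-m", "integration", "--maxfail=5"] # placeholder test command
EOF

# The CI job only waits for the result and pulls the logs; the runner does no test work itself
kubectl wait -n staging --for=condition=complete job/integration-tests --timeout=30m
kubectl logs -n staging job/integration-tests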
More Relevant Posts
How many commits have you made just to test if something works in the real environment? Push. Wait for the pipeline. It fails. Fix a config. Push again. Wait again.

This is what happens when local dev looks nothing like production. Every fix is a commit, every commit is a 10-minute wait, and none of it is feature work.

So I built a local dev platform where developers build and test on a real Kubernetes cluster that mirrors production. Same Dockerfile, same manifests, same ingress.

- tilt up — see changes in 1 second instead of pushing and waiting
- make ci-local — run the GitLab pipeline locally to catch failures before you push
- Push once and it works, not 15 "fix CI" commits

I wrote up how I built this. https://lnkd.in/dAQejEUU

#Kubernetes #PlatformEngineering #DevOps #Tilt #GitLab
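For context, a hypothetical sketch of that loop; the cluster name, job name, and the use of gitlab-runner exec are my assumptions, not the author's actual setup.

kind create cluster --name dev-mirror      # local cluster running the same manifests as prod (name is a placeholder)
tilt up                                    # rebuild the image and re-apply the manifests on every file save
gitlab-runner exec docker unit-tests       # run one pipeline job locally; exec support varies by runner version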
I built a GitHub Action that reviews pull requests before a human has to.

In most CI/CD workflows, a significant amount of time is spent reviewing pull requests that contain avoidable issues - unclear descriptions, missing tests, leftover debug code, or even risky patterns.

To address this, I developed truepr, a lightweight GitHub Action that automatically analyzes pull requests and provides a structured quality assessment.

It evaluates four key areas:
- The code diff (for security risks, bad practices, and missing tests)
- The pull request description (clarity, completeness, and intent)
- The linked issue (context, reproducibility, and quality)
- Contributor history (to provide additional context)

Based on this, it generates:
- A score from 0 to 100
- A grade (A to F)
- A clear recommendation (approve, review, request changes, or flag)

The goal is not to replace human review, but to reduce time spent on low-quality pull requests and help teams focus on meaningful feedback.

truepr runs entirely within GitHub Actions, requires no external services or API keys, and can be set up in minutes. This is particularly useful for teams and maintainers working with high pull request volumes, where early signal and consistency in review standards are critical.

I would welcome feedback from developers, maintainers, and DevOps professionals working in CI/CD environments.

Repository: https://lnkd.in/eWRdxEF7

I strongly believe in automation, and that even small, focused tools can significantly reduce friction and save valuable time.

#github #opensource #devops #cicd #softwareengineering
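For orientation, wiring a PR-review action into a workflow generally looks like the sketch below; the action reference, version, and permissions here are my assumptions, so check the truepr repository for the actual usage.

# Hypothetical wiring; the uses: reference and permissions are placeholders, not taken from the truepr docs
cat > .github/workflows/truepr.yml <<'EOF'
name: pr-quality-check
on:
  pull_request:
    types: [opened, edited, synchronize]
permissions:
  contents: read
  pull-requests: write   # only needed if the action posts its assessment as a PR comment
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: example-owner/truepr@v1   # placeholder reference
EOF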
Your Kubernetes cluster is lying to you. And you won't find out until prod breaks.

Here's a problem most platform engineers don't talk about enough: config drift across environments.

Everything looks identical — dev, staging, prod. Same Helm charts. Same GitOps repo. Same manifests. Then prod goes down. And you spend 3 hours figuring out why staging never caught it.

Here's what actually happened: someone patched a ConfigMap directly on the prod cluster with "kubectl edit" during last month's incident. Just a quick fix. "I'll raise a PR later." They didn't. Now prod is running a config that exists nowhere in Git.

Your GitOps tool (ArgoCD, Flux — doesn't matter) shows everything as Synced because drift detection only works if the live state diverges from what's currently in Git. But the patch was never in Git to begin with.

This is the gap nobody warns you about:
- GitOps doesn't protect you from changes that never entered Git
- kubectl diff only compares against what's applied, not what should exist
- Multi-cluster setups multiply this problem — 5 clusters, 5 different "versions of truth"
- The longer it goes undetected, the bigger the blast radius when it surfaces

The fix isn't just "don't use kubectl edit" — that battle is already lost in most orgs. The real fix is drift detection as a first-class concern:
- Enable ArgoCD's self-heal and prune flags so live state is continuously reconciled (minimal sketch below)
- Run kubectl diff in your CI pipeline before every deploy, not just locally
- Set up audit logging on your clusters — who ran kubectl commands, and when
- Tools like Kyverno or Datree can flag live state mismatches proactively
- Treat your cluster state like a database — no manual writes, ever

The hardest part isn't the tooling. It's the culture shift of making "I'll fix it in Git later" completely unacceptable. Because in a fast-moving team, "later" is when prod burns.

Been burned by config drift before? Drop it in the comments.

#Kubernetes #DevOps #PlatformEngineering #GitOps #K8s #SRE #CloudNative
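A minimal sketch of the first two fixes above, assuming an ArgoCD Application named payments and a manifests path of apps/payments; both names and the repo URL are placeholders.

kubectl apply -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments            # placeholder app name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/gitops.git   # placeholder repo
    targetRevision: main
    path: apps/payments
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true      # remove live resources that were deleted from Git
      selfHeal: true   # revert manual edits like the kubectl edit above
EOF

# In CI, surface drift before deploying (kubectl diff exits non-zero when live state differs)
kubectl diff -f apps/payments/ || echo "WARNING: live state differs from Git"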
🗓️ Day 27/100 — 100 Days of AWS & DevOps Challenge

Today's task: a bad commit was pushed to a shared repository. Undo it cleanly.

The instinct for many engineers, especially under pressure, is to reach for git reset --hard. That's the wrong tool the moment a commit has been pushed to a shared branch. Here's why.

git reset rewinds the branch pointer backward, effectively deleting commits from history. Locally, that looks clean. But the remote still has those commits. Now your local master and origin/master have diverged. Git rejects your push. You force push. And now every team member whose local clone was based on those commits has a broken repository.

git revert solves this correctly:

$ git revert --no-commit HEAD
$ git commit -m "revert games"
$ git push origin master

Instead of deleting the bad commit, it creates a new commit that contains the exact inverse of the bad commit's changes. The bad commit stays in history, it didn't disappear. But HEAD now points to a commit that cancels it out, and the working tree is back to the state before the bad commit was applied.

No history rewriting. No force push. No broken clones. Just an auditable record that says "we made a mistake, here's the correction, and when."

The --no-commit flag is important here because the task required a specific commit message - "revert games". Without it, Git auto-generates a message like Revert "some commit message". Using --no-commit stages the changes without committing, letting us then git commit -m "revert games" with full control over the message.

This exact workflow is what you'd run during a production rollback, and it's why every team's runbook should say git revert, not git reset.

Full breakdown on GitHub 👇
https://lnkd.in/gVY8q4u4

#DevOps #Git #VersionControl #GitOps #100DaysOfDevOps #KodeKloud #LearningInPublic #CloudEngineering #SRE #Rollback #Infrastructure
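One extra check worth adding to that runbook (my suggestion, not part of the original task): confirm before pushing that the branch is still a clean fast-forward of the remote.

$ git log --oneline origin/master..master   # should list only the new revert commit
$ git status                                # should report "ahead of 'origin/master' by 1 commit", with no divergence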
I was drinking coffee when GitHub announced stacked pull requests and I wondered whether this solves delivery problems or enables avoiding testing infrastructure #NeverEnoughCoffee.

Large pull requests sit in review for days, blocking feature delivery. Organisations struggle with review bottlenecks creating delays. Stacked PRs promise to solve this by breaking large changes into dependent smaller PRs.

But trunk-based development requires committing to main at least daily, with short-lived branches lasting hours. This requires automated testing infrastructure ensuring main stays stable. The question is whether stacked PRs fix delivery speed or enable teams to avoid building testing infrastructure.

Features are delayed when code stays in branches instead of main. Customers go to competitors who ship features faster. Market opportunities are missed whilst teams manage stacked PR dependencies. Revenue is lost during delays from poor delivery speed. Engineering time is wasted managing complex branch dependencies instead of shipping features.

Large PRs are symptoms of either poor work breakdown or a lack of testing infrastructure enabling frequent main commits. Without automated testing, teams cannot commit to main frequently because they cannot verify changes are safe. Stacked PRs treat symptoms by making large changes reviewable, but they do not fix root causes. Organisations need testing infrastructure enabling frequent commits, not tools for managing code that stays out of main longer.

I build testing infrastructure enabling teams to commit frequently without breaking builds. Automated testing validates changes are safe before they reach main. Proper testing infrastructure enables frequent integrations. Delivery delays cost revenue daily whilst competitors ship features faster.

Contact me if delivery speed matters to your business.

#AWS #DevOps #PlatformEngineering #Contractor
⭐ Most platform engineers I know use Cursor for autocomplete. That's like using an excavator to dig a hole with a teaspoon attachment.

I spent the last few weeks going deep on Cursor Agent — not the tab-complete, the actual agent mode — specifically for infrastructure and DevOps work. What I found changed how I think about the tool entirely.

The agent doesn't just edit files. It:
→ Queries your live Kubernetes cluster before making a change
→ Catches open PRs that would conflict with what you're about to do
→ Investigates a 5xx incident across GitHub, kubectl, and your deploy history — in one conversation
→ Runs terraform validate, reads the error, fixes it, runs again — without you typing a command

But the part nobody talks about: out of the box, it's generic. It doesn't know your naming conventions, your module patterns, your "never touch this file" rules. Once you configure it properly — 6 files, maybe 2 hours of setup — it's a different tool entirely.

I wrote the full breakdown. What MCP actually is, how the agent calls tools under the hood, every config file your team needs to replicate this, and 6 real use cases with exact prompts.

If you work in platform or DevOps, this one's worth the read.

Part 1 (link in the comment) and Part 2: https://lnkd.in/gpXdFjRU

#DevOps #PlatformEngineering #Kubernetes #Terraform #CursorAI #AITools #SRE
GitOps changed how I think about deployments. Here's the mental model:

Before GitOps:
❌ SSH into server → pull code → restart service → pray
❌ Jenkins pipeline pushes directly to cluster
❌ "Who deployed what?" — nobody knows

After GitOps:
✅ Git is the single source of truth
✅ ArgoCD watches the repo and syncs automatically
✅ Every deployment is a Git commit — auditable, reversible
✅ Multi-cluster? Just point ArgoCD at different directories

Key decisions I made:
1. Mono-repo for manifests (simpler than multi-repo for our scale)
2. ArgoCD for app deployments, FluxCD for infra components
3. Automated image tag updates via CI → Git commit → ArgoCD sync

If you're starting with GitOps, start with ArgoCD + a single cluster. Don't over-engineer day one.

Save this for later ♻️

#GitOps #ArgoCD #FluxCD #Kubernetes #DevOps #EKS #AWS #CICD #PlatformEngineering #Terraform #CloudEngineering #SRE #DevSecOps #BackstageIO #InfrastructureAsCode #GitHub #Docker #DevOpsCommunity #TechCareers #LearningInPublic #BuildInPublic
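Decision 3 in practice tends to be a small CI step like the sketch below; the repo URL, file path, image name, and ${COMMIT_SHA} variable are placeholders for whatever your CI and manifests layout actually use.

# Hypothetical CI step: bump the image tag in the manifests repo, then let ArgoCD sync it
git clone https://github.com/example-org/k8s-manifests.git && cd k8s-manifests   # placeholder repo
sed -i "s|image: registry.example.com/app:.*|image: registry.example.com/app:${COMMIT_SHA}|" \
  apps/app/deployment.yaml
git commit -am "ci: deploy app at ${COMMIT_SHA}"
git push origin main   # ArgoCD detects the new commit and rolls the Deployment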
🗓️ Day 28/100 — 100 Days of AWS & DevOps Challenge

Today's task: a developer has in-progress work on a feature branch, but one specific commit is ready and needs to go to master right now, without dragging the rest of the unfinished work along.

This is exactly what git cherry-pick is for.

# Find the commit hash on the feature branch
$ git log feature --oneline
# abc5678 Update info.txt ← this one

# Switch to master and cherry-pick it
$ git checkout master
$ git cherry-pick abc5678

# Push
$ git push origin master

One commit. Surgically applied. Feature branch untouched.

1. Why not just merge the feature branch?
The feature branch has in-progress commits: code that isn't tested, isn't ready, and would break things on master. git merge feature brings ALL of it over. Cherry-pick takes only what's ready.

2. When this pattern matters in production:
A critical bug fix lands on a development branch. You can't merge the whole branch; there are half-finished features alongside the fix. You cherry-pick the fix onto master and onto any active release branches. This is how security patches get backported across multiple versions in open source projects. Same concept, same tool.

The command to find a commit by message when you don't have the hash handy:

$ git log --all --oneline --grep="Update info.txt"

Saves time when the branch has many commits and you're looking for one specific one.

Full breakdown on GitHub 👇
https://lnkd.in/gVHV9qPc

#DevOps #Git #VersionControl #CherryPick #GitOps #100DaysOfDevOps #KodeKloud #LearningInPublic #CloudEngineering #CICD #Hotfix
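The backport case in point 2 is the same command repeated on each release branch; the branch name below is a placeholder, and -x is optional but leaves an audit trail.

# Hypothetical backport of the same fix onto an active release branch
$ git checkout release-1.2
$ git cherry-pick -x abc5678   # -x appends "(cherry picked from commit ...)" to the commit message
$ git push origin release-1.2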
🚀 Scaling on GitHub: From Script to Enterprise

Moving from a solo project to a large-scale enterprise environment isn’t just about more code—it’s about managing complexity. When hundreds of developers contribute, "git push" isn't enough. You need system-level governance.

Here are the 10 pillars of GitHub at scale:

1. Structure: Choose between a Monorepo (unified flow) or Polyrepos (team independence).
2. Storage: Use Git LFS to keep your repository slim by offloading large binary files.
3. Gates: Implement Branch Protection—no code hits production without passing CI/CD.
4. Ownership: Define a CODEOWNERS file to automate review assignments to the right experts.
5. Governance: Use Rulesets to apply security guardrails across every repo in your organization.
6. CI/CD: Scale your automation with Matrix Builds to test multiple OS versions simultaneously.
7. Performance: Deploy Self-hosted Runners for faster, more secure automation pipelines.
8. Security: Leverage Dependabot and Secret Scanning to catch vulnerabilities before they’re exploited.
9. Analysis: Use CodeQL to treat your code as data and find deep-seated logic flaws.
10. Roadmaps: Align your team with GitHub Projects (v2) for high-level visibility beyond just code.

The goal? Shift from "writing code" to "building systems." 🛠️

#DevOps #DevSecOps #GitHub #SoftwareEngineering #CloudNative #SystemDesign #OpenSource
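For the Ownership pillar, a CODEOWNERS file is only a few lines; the paths and team handles below are placeholders.

# Hypothetical CODEOWNERS; the last matching pattern wins, and matching owners are auto-requested for review
cat > .github/CODEOWNERS <<'EOF'
*                    @example-org/platform-team
/infra/terraform/    @example-org/infra-team
/.github/workflows/  @example-org/ci-maintainers
EOF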
I deleted a resource from my cluster and Flux put it right back.

That was the moment GitOps actually clicked for me. Here is what changed in how I think about infrastructure:

Before GitOps, everything was manual. I applied manifests one by one with kubectl, tweaked things directly in the cluster, and had no reliable record of what was actually running or why.

After GitOps, my Git repo is my cluster. Flux runs a constant reconciliation loop, checks what is in Git, and makes sure the cluster matches it exactly. Always.

The implications of that are huge.
✅ Delete something by accident, Flux restores it.
✅ Merge a bad change, git revert is your rollback.
✅ Want to know what changed and when, check the Git log.
✅ Switch to a new cluster, point Flux at the same repo and it rebuilds everything.

The config lives in Git, not in the cluster. That distinction sounds small. It is not.

Have you made the shift to GitOps yet? What finally made it click for you? 👇

Follow me if you are building toward a DevOps career the practical way.

#GitOps #Kubernetes #DevOps #FluxCD #CloudNative
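"Point Flux at the same repo" looks roughly like the sketch below; the owner, repository, and path values are placeholders for whatever the real fleet repo is.

# Hypothetical bootstrap of a fresh cluster from the existing Git repo (owner/repo/path are placeholders)
flux bootstrap github \
  --owner=example-org \
  --repository=fleet-config \
  --branch=main \
  --path=clusters/prod

# Confirm the reconciliation loop is watching the repo and applying it
flux get kustomizations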