GitHub Experiences 95 Incidents in 90 Days, 99.9% Uptime SLA Unmet

3w Edited

GitHub has had 95 incidents in the last 90 days. That is more than one incident a day.

Their SLA promises 99.9% uptime, but they have maintained less than 90%. That means even if you had shut down your product for one whole month last year, you would still have more uptime than them.

The reason is still not clear. But I suspect that, since code commit velocity has increased drastically in recent months, the cracks are emerging. Usually, with sufficient time between incidents, dev interventions can fix things. But at this scale of accelerated product usage, even the smallest cracks can become valleys if not addressed swiftly.

Any product's success depends on:
- infrastructure resilience
- code quality
- and, most importantly, proactive foresight while making micro tech decisions

But humans can't foresee everything. We are bound to make mistakes, which will be visible in our decisions, in our code, and in AI trained on our data. So it's not about avoiding mistakes. Maybe it's about fixing, learning, and not repeating.

What do you think?

yeah yeah just let me in🙂

#tech #coding #github #saas
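For anyone who wants to check the uptime comparison in the post, here is a minimal Python sketch of the arithmetic. The 99.9% and 90% figures are the post's own numbers and are not verified here; the function names are hypothetical and only for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_target: float, window_days: int) -> float:
    """Downtime budget, in minutes, implied by an uptime target over a window."""
    return (1.0 - uptime_target) * window_days * MINUTES_PER_DAY


def uptime_after_outage(outage_days: float, window_days: int) -> float:
    """Uptime fraction if the service is fully down for outage_days."""
    return 1.0 - outage_days / window_days


# Figures taken from the post: a 99.9% SLA versus "less than 90%" observed.
budget_sla = allowed_downtime_minutes(0.999, 90)       # roughly 130 minutes in 90 days
budget_observed = allowed_downtime_minutes(0.90, 90)   # roughly 9 full days in 90 days

# One whole month (30 days) offline in a 365-day year.
month_off = uptime_after_outage(30, 365)               # roughly 0.918

print(f"99.9% over 90 days allows {budget_sla:.1f} minutes of downtime")
print(f"90% over 90 days allows {budget_observed / MINUTES_PER_DAY:.1f} days of downtime")
print(f"A month offline per year still leaves {month_off:.1%} uptime")
```

Under these assumptions, a product that was offline for a full month would still sit slightly above the 90% mark, which is the comparison the post is making.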


More Relevant Posts

Taimir Alain Morales ROche
1w
I honestly thought platforms of GitHub's scale would have solved the core scaling riddles by now. Turns out, even they are battling fundamental architectural demons.

The article dives into GitHub's recent availability woes. We're talking cascading failures, tight coupling, and an inability to shed load effectively. Imagine your critical CI/CD pipelines suddenly grinding to a halt because an authentication database choked. That's the kind of disruption they're talking about.

This isn't just another post-mortem. It's a stark reminder that rapid growth amplifies every architectural shortcut. I realized that what we often chalk up to "bad luck" in our own systems is often a predictable outcome of underlying structural choices.

GitHub's candidness about "tight coupling" and "insufficient isolation" really hit home. In digital financial services, I've seen how a seemingly minor dependency, like a third-party fraud detection API, can ripple through an entire transaction flow if it is not properly isolated and rate-limited. We often talk about microservices as the panacea, but if they're still tightly coupled at the data or authentication layer, you haven't solved the problem; you've just distributed it.

Their challenge with effective "backpressure mechanisms" and load shedding, especially as AI-driven tooling piles on, highlights that just throwing more compute at it isn't enough. You need smart circuit breakers and adaptive traffic management, not just reactive scaling. The article also points out the disparity between official status pages and real-world developer experience, which resonates deeply with anyone who's ever debugged a "green" system.

How do you proactively build resilience against these kinds of systemic failures when your system is growing at warp speed? What's your secret sauce for true architectural isolation?

https://lnkd.in/e3EmWfjz

#SystemArchitecture #Scalability #ReliabilityEngineering #DevOps
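To make the "circuit breaker" idea in the post concrete, here is a minimal Python sketch of the general pattern. It is not GitHub's implementation or anything described in the linked article; the class name, thresholds, and the injectable `clock` parameter are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is shed without being tried."""


class CircuitBreaker:
    """Stops calling a failing dependency for a while instead of piling on load."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("dependency is failing; call shed")
            # Timeout elapsed: half-open, so one trial call is allowed through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

In the fraud-detection example from the post, each call to the third-party API would go through `breaker.call(...)`, and the transaction flow would catch `CircuitOpenError` and fall back to a degraded path rather than waiting on a dependency that is already struggling.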
Uzair shekhani
3w
Everyone jokes about rm -rf *… until it actually happens.

A while back, a GitHub engineer accidentally ran a destructive command on the wrong repository. Not a fork. Not a personal project. The company’s main GitHub repo.

Within seconds… pipelines failed. Services broke. Data disappeared. Panic kicked in. And this wasn’t a small startup. This was at the scale where even minutes of downtime matter.

But here’s the part no one talks about 👇

The system came back. Why? Because great engineering isn’t about never making mistakes. It’s about designing systems that survive mistakes.

-> GitHub backups saved them
-> Branch protections prevented even worse disasters
-> Teams jumped in and fixed things fast

Within hours, everything was restored.

💡 The lesson? If you’ve ever broken something in code, accidentally deleted a branch, or messed up production… you’re not alone. Even the best engineers have done it. The difference isn’t perfection. The difference is how fast you recover and what you learn.

So next time you make a mistake… don’t panic. Improve your system. Because in tech, mistakes are not the end. They’re part of the process.

#github #programming #softwareengineering #devlife #learning #growth #tech
Can Y.
2d
GitHub's update on availability is a compelling read. They're scaling for an incredible 30x growth by 2026, primarily driven by the explosion of "agentic development workflows." This really highlights how AI-driven dev is already pushing infrastructure to its limits. Good to hear their detailed plans for reliability and transparency after recent incidents. 🚀 #GitHub #AIDev
Jonathan Cardoso
2d
🚨 Is GitHub's reliability hurting your team? I've been talking with many customers recently, and a common theme keeps coming up — frustration with GitHub's service health. Outages, degraded performance, and uncertainty around uptime are slowing teams down. If that sounds familiar, there's a path forward. In 3 days, I'll be running a free workshop walking through how to migrate from GitHub to GitLab — step by step, no guesswork. You'll leave with a clear migration plan, practical tips, and confidence to make the switch. 👉 Interested? Join us here: https://lnkd.in/d-ckV-9G Quinten Dismukes, Colin Stevenson, Thiago Magro, Adrian Tigert #GitLab #GitHub #DevOps #Migration #Workshop
NoShip

1w
Easter Sunday is not the time to find out your deploy pipeline is still open. We've all seen it. A PR gets merged late Friday "just to get it in." By Sunday someone's getting paged. The on-call engineer is not happy. NoShip lets you set a recurring freeze that kicks in automatically every holiday weekend. Define the window once, and GitHub enforces it. No Slack reminders. No honor system. No "I thought someone else handled it." Set it. Forget it. Enjoy the long weekend. #DevOps #GitHub #CodeFreeze #SRE #PlatformEngineering #DeploymentSafety #Easter
Chandan Kumar
5d
🚨 90% of developers don’t know this… #GitHub is great, but it’s not always the best choice. If you want better privacy, self-hosting, or more flexible DevOps features, check out these powerful alternatives 👇
🔗 GitLab: https://gitlab.com
🔗 Bitbucket: https://bitbucket.org
🔗 Gitea: https://gitea.io
🔗 SourceHut: https://sourcehut.org
#GitHub #Developers #OpenSource #DevTools #Coding
Raihane Modhaffer
4w
A lot of developers rely on GitHub every single day, but the moment you ask them how it truly differs from GitLab, the answers often get blurry. And honestly, I understand why: on the surface they look similar, yet they don’t serve the same vision at all.

GitHub has become the place where the world writes code together. Backed by Microsoft and fueled by a massive open-source community, it’s built for speed, simplicity, and collaboration. Actions, Codespaces, Dependabot… everything is designed to help teams move quickly and stay focused on building.

GitLab, on the other hand, follows a completely different philosophy. It’s not just a code platform, it’s a full DevSecOps environment. CI/CD is built-in, security tools are native, governance is centralized, and you can even self-host it with the open-source edition. Many companies choose it because they want one platform to manage everything from planning to deployment.

So the question isn’t really “which one is better?”. It’s more like “which vision matches the way you work?”. One focuses on velocity and massive adoption. The other focuses on deep integration and full end-to-end control.

If you’ve used either platform in your projects, I’d really love to hear your experience. What actually makes a difference in your daily workflow? And what would you pick again if you had to start from scratch? Your insights will definitely help others who are still trying to choose the right tool.

#GitHub #GitLab #DevOps #DevSecOps
Athreya Sharma Josyula
1mo Edited
6 months ago, deploying our app felt like defusing a bomb. Someone had to be online. Someone had to watch the logs. And if something broke at 2am... well, good luck.

I got tired of it. So I rebuilt the entire deployment process from scratch using GitHub Actions. Here's what I actually did:

First, I forced every commit to link to a JIRA ticket. Sounds annoying, but after one incident where nobody knew what broke what, it became non-negotiable. No ticket reference = PR blocked.

Second, our linting was scanning the entire codebase on every PR. Slow as hell. I changed it to only check the files that actually changed. Same quality, way faster feedback.

Third, we were using :latest tags everywhere, which meant rollbacks were basically guesswork. I switched to tagging every image with the git SHA + a timestamp. Now you can roll back to any exact version in seconds.

The last part is what I'm most proud of. Most pipelines just deploy and hope. Mine waits for the pod to actually start, then reads the application logs to confirm the app is running clean. No errors, no exceptions. If something looks wrong, it automatically rolls back, cleans up the broken image from ECR, and fires off an email to the team. Zero humans needed in the middle of the night.

Results after we shipped this:
- Deployments got 50% faster
- Latency dropped 60%
- We've been at 99.9% uptime since
- I haven't manually rolled back a single deployment

The thing I didn't expect? The team started deploying more often. When people aren't scared of breaking things, they ship more.

If you're still doing manual deployments, seriously, start with just the rollback part. That alone will change how your team feels about releasing.

#DevOps #AWS #Kubernetes #GitHubActions #Docker #CICD #CloudComputing #Terraform #Python #SoftwareEngineering #DevOpsEngineer #CloudEngineer #TechCareers #ilumina ilumina Health
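Three of the steps described above (the ticket check, the SHA-plus-timestamp image tag, and the post-deploy log check) can be sketched as small Python helpers. This is a minimal illustration, not the author's pipeline: the ticket format, the tag layout, the error markers, and all function names are assumptions, and the wiring into GitHub Actions, Kubernetes, and ECR is left out.

```python
import re
from datetime import datetime, timezone
from typing import Sequence

# Assumed ticket format: project key, hyphen, number (for example "OPS-123").
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def has_ticket_reference(commit_message: str) -> bool:
    """True if the commit message mentions something that looks like a ticket key."""
    return TICKET_PATTERN.search(commit_message) is not None


def image_tag(git_sha: str, built_at: datetime) -> str:
    """Build an immutable tag from a short SHA and a UTC timestamp.

    built_at is expected to be timezone-aware.
    """
    stamp = built_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{git_sha[:12]}-{stamp}"


def logs_look_clean(
    log_text: str,
    markers: Sequence[str] = ("ERROR", "Exception", "Traceback"),
) -> bool:
    """Crude health check: fail if any assumed error marker appears in the logs."""
    return not any(marker in log_text for marker in markers)


if __name__ == "__main__":
    print(has_ticket_reference("OPS-123: fix rollback step"))
    print(image_tag("0123456789abcdef0123", datetime.now(timezone.utc)))
    print(logs_look_clean("server started on port 8080"))
```

A pipeline job could block the PR when `has_ticket_reference` is false for any commit, push the image under `image_tag(...)` instead of `:latest`, and trigger the rollback path when `logs_look_clean` returns false for the new pod's logs.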
Syed Umair Ali
1w
GitHub quietly suspended new individual subscriptions for Pro, Pro+, and Student plans around April 19, 2026. On the surface, it looks like a simple capacity decision. In reality, it exposes a deeper shift in how modern developer tools are being used and stressed.

The root cause is not just growth. It is behavior. Agentic workflows, where developers rely on AI to iteratively generate, test, and refine code, are consuming far more compute than traditional usage patterns ever did. What used to be a few API calls is now continuous interaction with high-end models, often running in loops. That kind of demand compounds fast.

This pressure is not isolated to GitHub. It ties directly into broader infrastructure constraints, especially on Microsoft Azure, where a large portion of this compute is provisioned. When capacity tightens at that level, product decisions upstream start to change quickly.

New users cannot subscribe to higher-tier plans for now. Free trials were already paused due to abuse. Usage limits are becoming stricter, with token-based throttling and session caps becoming more visible to end users.

We are entering a phase where demand for AI-assisted development is outpacing the supply of compute needed to support it. That gap forces trade-offs. Limits, pricing shifts, and access controls are not edge cases anymore. They are becoming the norm.

If you rely heavily on these tools, it is time to think a step ahead. Assume constraints will tighten, not loosen, and plan your workflows accordingly.

#GitHub #AI #DeveloperTools #SoftwareEngineering #CloudComputing #Azure #AIDevelopment #DevOps #TechNews #GenerativeAI #LLMs #BuildInPublic #Productivity #Coding #FutureOfWork #ScalableSystems #Engineering
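The post does not say how the throttling is implemented. As one common way to picture a usage limit of this kind, here is a minimal Python sketch of a token-bucket budget, where each request spends part of a budget that refills over time. The class name, the numbers, and the injectable `clock` parameter are assumptions for illustration only.

```python
import time
from typing import Callable


class TokenBudget:
    """Token-bucket limiter: a capped budget that refills at a fixed rate."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._available = capacity
        self._last = clock()

    def _refill(self) -> None:
        # Credit the budget for the time elapsed, never exceeding capacity.
        now = self._clock()
        earned = (now - self._last) * self.refill_per_second
        self._available = min(self.capacity, self._available + earned)
        self._last = now

    def try_spend(self, cost: float) -> bool:
        """Spend cost units if the budget allows it; otherwise refuse the request."""
        self._refill()
        if cost > self._available:
            return False
        self._available -= cost
        return True
```

An agentic loop that calls `try_spend` before every model request would see refusals as soon as it burns through the budget faster than it refills, which is one way session caps can become visible to heavy users.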
