GitHub Experiences 95 Incidents in 90 Days, 99.9% Uptime SLA Unmet

GitHub has had 95 incidents in the last 90 days. That is more than one incident per day on average.

Their SLA promises 99.9% uptime, but they have managed to maintain less than 90%. That means even if you had shut down your product for one whole month last year, you would still have more uptime than them.

The reason is still not clear. But I suspect that, since code commit velocity has increased drastically in recent months, the cracks are emerging. Usually, with sufficient time between incidents, dev interventions can fix things. But at this scale of accelerated product usage, even the smallest cracks can become valleys if not addressed swiftly.

Any product's success depends on:
- infrastructure resilience
- code quality
- and, most importantly, proactive foresight while making micro tech decisions.

But humans can't foresee everything. We are bound to make mistakes, which will be visible in our decisions, in our code, and in AI trained on our data.

So it's not about avoiding mistakes. Maybe it's about fixing, learning, and not repeating.

What do you think?

Yeah yeah, just let me in 🙂

#tech #coding #github #saas
