Web stack system breach incident report (Postmortem)
May 02, 2022
The following is the incident report for the Web stack system breach that occurred on April 04, 2022.
Issue Summary
At 2:00 AM CAT, the on-duty engineer noticed that the web stack was misbehaving: the server returned 500 error responses whenever he tried to access data. In addition, some users could not gain access to their accounts, and attempts to log in as the superuser through the privileged account failed. The root cause of these failures was an attacker who managed to breach the system.
Timeline (all times CAT)
- 2:00 AM – On-duty engineer notices 500 error responses and failed superuser logins.
- 2:10 AM – Root session left exposed after a machine enters hibernation without a logout; the attacker gains access.
- 2:16 AM – Monitoring systems alert the on-duty engineer, who escalates the issue.
- 2:54 AM – Audit of every user account's authentication configuration begins.
- 4:15 AM – User authentication keys renewed and new passwords configured.
- 4:19 AM – All servers manually rebooted; roughly 60% of the lost data recovered.
- 4:58 AM – System back online with completely new authentication credentials.
Root Cause
At 2:10 AM, the root user was left vulnerable after one of the machines was put into hibernation without the user logging out first. The root session kept running in the background with all of its privileges exposed, which made it easy for the attacker to use its authentication key to reach the main server. After gaining full access, the attacker changed the authentication keys of several user accounts, making it impossible for those users to log in remotely. In addition, 40% of the company's data was erased from the database through an SQL injection carried out during the attack. The servers then began shutting themselves down to prevent the attacker from going any further.
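To illustrate the class of vulnerability involved, here is a minimal sketch (not our actual application code) of how building SQL statements from raw user input allows injection, and how a parameterized query prevents it. The table, column, and input values are hypothetical.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

    # Hypothetical malicious input of the kind used in SQL injection.
    user_input = "alice'; DROP TABLE users; --"

    # Vulnerable pattern: the input is concatenated into the statement,
    # so a crafted value can change the meaning of the query itself.
    unsafe_query = "SELECT email FROM users WHERE name = '%s'" % user_input

    # Safe pattern: the driver treats the value strictly as data.
    safe_query = "SELECT email FROM users WHERE name = ?"
    rows = conn.execute(safe_query, (user_input,)).fetchall()
    print(rows)  # [] -- the malicious string matches no user; no table is dropped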
Resolution and Recovery
At 2:16 AM CAT, the monitoring systems alerted our on-duty engineer, who investigated and quickly escalated the issue. The incident response team identified the breach, removed the attacker from the system, and cleaned out the virus.
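As an illustration of the kind of check behind such an alert, here is a minimal health-probe sketch, assuming a Python environment with the requests library; the URL and timeout are hypothetical, not our actual monitoring configuration.

    import sys
    import requests

    HEALTH_URL = "https://example.com/health"  # hypothetical endpoint

    def probe(url: str) -> bool:
        """Return True if the endpoint answers with a non-5xx status."""
        try:
            response = requests.get(url, timeout=5)
        except requests.RequestException as exc:
            print(f"ALERT: {url} unreachable: {exc}")
            return False
        if response.status_code >= 500:
            print(f"ALERT: {url} returned {response.status_code}")
            return False
        return True

    if __name__ == "__main__":
        sys.exit(0 if probe(HEALTH_URL) else 1)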
At 2:54 AM, we started checking every user account to verify that its authentication configuration still worked. This was only partly successful, since some private keys had been changed during the attack. The user authentication keys were renewed and new passwords were configured by 4:15 AM.
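A minimal sketch of what such a per-account check could look like, assuming key-based SSH access from a recovery host; the host name, user list, and accounts are hypothetical.

    import subprocess

    SERVER = "web-01.example.com"   # hypothetical host
    USERS = ["alice", "bob"]        # hypothetical account list

    def key_auth_works(user: str, host: str) -> bool:
        """Attempt a non-interactive SSH login. BatchMode forbids password
        prompts, so this fails unless the account's key still authenticates."""
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
             f"{user}@{host}", "true"],
            capture_output=True,
        )
        return result.returncode == 0

    for user in USERS:
        status = "ok" if key_auth_works(user, SERVER) else "NEEDS NEW KEY"
        print(f"{user}@{SERVER}: {status}")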
To help with the recovery, we turned off some of our monitoring systems that were triggering the virus. At 4:19 AM, all the servers were rebooted once more, manually this time. Some of the data was recovered, roughly 60% of the lost archives. By 4:58 AM, we managed to bring the system back online with completely new authentication credentials.
Corrective and Preventive Measures
Over the past three weeks, we have conducted a full system review and analysis of the breach. The following measures have been put in place to address the underlying causes of the attack, prevent recurrence, and improve response times:
- Mandatory logout, with automatic session termination, before any machine enters hibernation, so privileged sessions are never left running unattended.
- Rotation of all server and user authentication keys, with stronger password requirements.
- Parameterized database queries and input validation across the web stack to close the SQL injection vector.
- Review of the monitoring rules so that alerts fire on the first anomalous 500 responses without interfering with recovery work.
The response team is committed to continually and quickly improving the system infrastructure and operational processes to prevent outages.