Using data and machine learning to improve your organisation’s phishing resilience and security education program

Bianca Wirth

Published Jan 14, 2021

During COVID, the 2020 Harvey Nash KPMG CIO survey found that 40% of organisations had experienced an increase in phishing and malware. Phishing is nothing new, but it remains one of the most common security threats today. Having the right information at the right time is critical to combating phishing threats before they turn into a costly ransomware event or mass data exfiltration from your organisation.

Organisational phishing resilience comes down to a couple of factors. The first is timely reporting of phishing threats inside an organisation, so the source can be blocked from further proliferation. Technology is not in the starring role here – people are. Your people need to understand how to recognise a phishing threat and how to report it. That requires ongoing and engaging security education and awareness that turns the dial on a risk-based culture. Technology plays the supporting role by providing a simple and easy reporting method, such as an add-in in Outlook that gives your people a 1-click method to report phishing attempts. The key words are ‘simple and easy’ – asking people to add an attachment to another email and send it to an email address will inevitably result in phishing reporting being relegated to the 'Urgh - too hard' basket.

The second is understanding the factors involved in phishing susceptibility through both quantitative and qualitative analysis. To achieve this, you need to gather data on your phishing susceptibility from both phishing simulations and real phishing reports submitted by your people and analysed by your security operations team.

By collecting a large amount of this data, you can use machine learning to uncover behavioural patterns that will optimise your awareness campaigns and help you make informed changes to bring about real cultural change. It also provides you with the ability to measure and justify the impact of your awareness program and allows you to tailor reporting to different stakeholders.
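As a minimal sketch of the measurement idea above, the snippet below compares click rate and report rate across two phishing simulation campaigns. The campaign names, field names (`clicked`, `reported`) and all figures are hypothetical and exist only for illustration. Real data would come from your simulation platform and reporting tool.

```python
def rates(events):
    """Return (click_rate, report_rate) for a list of recipient outcomes."""
    total = len(events)
    if total == 0:
        return 0.0, 0.0
    clicks = sum(1 for e in events if e["clicked"])
    reports = sum(1 for e in events if e["reported"])
    return clicks / total, reports / total


# Hypothetical outcomes: one dict per recipient in each simulation.
campaigns = {
    "baseline": [
        {"clicked": True, "reported": False},
        {"clicked": True, "reported": False},
        {"clicked": False, "reported": True},
        {"clicked": False, "reported": False},
    ],
    "follow_up": [
        {"clicked": True, "reported": False},
        {"clicked": False, "reported": True},
        {"clicked": False, "reported": True},
        {"clicked": False, "reported": True},
    ],
}

# Summarise each campaign so the trend can be shown to stakeholders.
summary = {name: rates(events) for name, events in campaigns.items()}

for name, (click_rate, report_rate) in summary.items():
    print(f"{name}: click rate {click_rate:.0%}, report rate {report_rate:.0%}")
```

Tracking both numbers matters: a falling click rate on its own says little about whether people are reporting threats quickly enough for the source to be blocked.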

A few different machine learning models can be used for this kind of data analysis, such as KNN classifiers and regression. For phishing, however, decision tree analysis works well in helping to identify potential factors behind why people in your organisation are susceptible. The following model shows an example of how decision tree analysis can be used to analyse phishing susceptibility. In this simplified example, you need to collect data on whether the person was using a mobile device when they clicked, and whether they are classed as a contractor. This comes down to correlating the data collected from the phishing simulation system with your HR data.
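To make the idea concrete, here is a small, hedged sketch of the core step in decision tree analysis: choosing which factor best separates people who clicked from people who did not. It is not the model from the presentation. It is a hand-rolled illustration using Gini impurity, and every record, field name (`employee_id`, `on_mobile`, `is_contractor`, `clicked`) and helper function is a hypothetical stand-in. In practice you would more likely use a library such as scikit-learn and far more data.

```python
from collections import Counter

# Hypothetical phishing simulation results: one row per recipient.
simulation_results = [
    {"employee_id": 1, "on_mobile": True, "clicked": True},
    {"employee_id": 2, "on_mobile": True, "clicked": True},
    {"employee_id": 3, "on_mobile": True, "clicked": True},
    {"employee_id": 4, "on_mobile": True, "clicked": False},
    {"employee_id": 5, "on_mobile": False, "clicked": False},
    {"employee_id": 6, "on_mobile": False, "clicked": False},
    {"employee_id": 7, "on_mobile": False, "clicked": False},
    {"employee_id": 8, "on_mobile": False, "clicked": True},
]

# Hypothetical HR extract keyed by employee_id.
hr_records = {
    1: {"is_contractor": True},
    2: {"is_contractor": False},
    3: {"is_contractor": False},
    4: {"is_contractor": True},
    5: {"is_contractor": False},
    6: {"is_contractor": True},
    7: {"is_contractor": False},
    8: {"is_contractor": True},
}


def join_with_hr(results, hr):
    """Correlate simulation rows with HR attributes on employee_id."""
    return [{**r, **hr[r["employee_id"]]} for r in results if r["employee_id"] in hr]


def gini(labels):
    """Gini impurity of a list of labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def gini_reduction(rows, feature, target="clicked"):
    """How much splitting on a yes/no feature reduces impurity of the target."""
    parent = gini([r[target] for r in rows])
    yes = [r[target] for r in rows if r[feature]]
    no = [r[target] for r in rows if not r[feature]]
    weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
    return parent - weighted


rows = join_with_hr(simulation_results, hr_records)
scores = {f: gini_reduction(rows, f) for f in ("on_mobile", "is_contractor")}
best_feature = max(scores, key=scores.get)

print(scores)
print("Best first split:", best_feature)
```

On this toy data the mobile factor separates clickers from non-clickers better than contractor status, so a tree would split on it first. A full decision tree repeats the same comparison within each resulting group.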

Challenges remain, however, including data quality and richness, as well as the stability and trustworthiness of machine learning models. Data analysis and machine learning are not the solution by themselves, and this is where contextual understanding comes in. There is no substitute for both understanding the common susceptibility factors and gathering supplementary data about your people’s phishing response practices (for example, through interviews, feedback and surveys).

Ultimately, people click on phishing for a broad range of reasons that cannot all be easily explained by looking at data alone. These include:

Our inherent cognitive biases
Relationships with people who we digitally communicate with
The design of phishing communications
Time and timing of phishing delivery
Curiosity and boredom
A person’s individual risk propensity
Their knowledge of how to recognise phishing (and what to do with it regardless of whether they have clicked or not)

Understanding the complex nature of human interactions is the first step in helping your organisation combat people-related security threats, and the data helps explain, confirm and communicate your resilience.

Acknowledgement: Thank you to Dr Yenni Tim, Senior Lecturer at University of NSW, for her collaboration with me on this article, as part of our presentation at AusCERT 2020.

Yenni Tim 5y

Great article Bianca! Always love your engaging writing!

Nicholas P. 5y

The data size does not always correlate with time periods. 2 years of data in one company might be enough and too little in another. We had our ML experiment published which was a predictor of student exam results and that used 2 iterations of data and yielded good results. You want to look at the complexity of the problem at hand and the best model for the job, some require more data than others. Whilst not ideal, you can create synthetic data also if you have enough original data to base it on also.
