A Machine Learning Primer for Auditors

David Poisson

Published Feb 11, 2020

If you’re an internal auditor and not yet familiar with machine learning (ML), then you’re missing out. ML is likely being piloted or even used at your organization, and it may already influence your organization’s internal control structure, so you should understand it. Perhaps more importantly, ML is a tool that can help you be a better internal auditor.

Visual Risk IQ helps finance and audit teams get up both the learning curve and the doing curve with data analytics and visual reporting. This post explains some of the basic concepts relating to ML, breaks down its major categories, and offers potential uses for ML analytics for your audit team. In our next post, we’ll dive more deeply into a specific application of machine learning.

First, a warning: This post is a bit jargon- and concept-heavy. If you are only interested in applications of machine learning, then check out our future posts. If you’re an aspiring data nerd and sometimes feel lost in all the vocabulary around machine learning, then you’re in the right place.

What is Machine Learning vs Artificial Intelligence?

Machine learning is a subset of artificial intelligence. If we think of artificial intelligence as machines “thinking” the way humans think, then we can think of machine learning as machines drawing conclusions from information in the same way humans do: by taking in information, identifying relationships and patterns in the data, and developing a model of how the world works. Machine learning happens when a machine produces a predictive model. If you’ve watched any dystopian science-fiction movies (e.g. War Games, Minority Report, Eagle Eye), then you know the importance of testing these models. Just like with people, the expectations machines have based on these models can be good (i.e. useful, predictive) or bad (i.e. misleading, caused by spurious correlations).

In other words, machine learning means that the computer is taking in data and using one algorithm (a machine learning algorithm) to produce another algorithm (the model). That resulting model is usually used to tell the computer how to handle future, similar information.
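To make the idea of one algorithm producing another concrete, here is a minimal sketch in Python. The function name learn_line and the sample figures are hypothetical, chosen only for illustration, and the learning algorithm shown is ordinary least squares on a single input field, one of the simplest possible choices.

```python
def learn_line(xs, ys):
    """Learning algorithm: ordinary least squares with one input field.

    Takes historical data and returns a model (another function).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x):
        # The model: a formula for handling future, similar information.
        return intercept + slope * x

    return model


# Hypothetical history: hours billed vs. invoice amount.
hours = [1, 2, 3, 4]
amounts = [150, 250, 350, 450]

invoice_model = learn_line(hours, amounts)
predicted = invoice_model(5)  # expected amount for a 5-hour invoice
```

Here learn_line is the machine learning algorithm and invoice_model is the model it produces. The data in this sketch follows a perfect straight line, so the model predicts 550 for five hours; real data would be noisier.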

Key Concept: Machine learning means the machine is drawing conclusions about how information is related. Machine learning is evidenced by the machine producing a predictive model.

Types of Machine Learning: Supervised vs Unsupervised

Two main types of machine learning are supervised and unsupervised learning. It is a common mistake to think that unsupervised learning is a more sophisticated method of achieving the same goals as supervised learning, but this is not the case.

Supervised Learning

In supervised learning we give the computer a goal. We tell the computer, “I care about the value in this field and I want you to take all the other information I’ve given you to help me predict the value in this field.” That field is called the target variable. Supervised learning algorithms take in: 1) a data set, and 2) instructions on which field is the target variable. The learning algorithm then produces a model (algorithm / formula) to predict the value of the target variable.

Within supervised learning, there are two sub-types: regression and classification. The distinction relates to whether the target variable is numerical (regression) or categorical (classification).

Regression: If you ask the computer, what does a normal salary look like for a group of employees based on title, department, location, etc., then you are using regression. The target variable in this case is the salary. Regression algorithms are used when the target variable could be any numerical value. Auditors use regression to help uncover biases and outliers.
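As a rough illustration of the salary example, the sketch below learns an expected salary for each title and department and flags employees who fall far from it. It uses a group-mean baseline, which is about the simplest regression-style model there is, rather than a full regression algorithm. The records, the function names and the 20% tolerance are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical employee records: (employee_id, title, department, salary).
employees = [
    ("E1", "Analyst", "Finance", 60000),
    ("E2", "Analyst", "Finance", 62000),
    ("E3", "Analyst", "Finance", 61000),
    ("E4", "Analyst", "Finance", 95000),
    ("E5", "Manager", "Finance", 98000),
    ("E6", "Manager", "Finance", 102000),
]


def fit_group_means(records):
    """Learn the expected salary for each (title, department) group."""
    groups = defaultdict(list)
    for _, title, department, salary in records:
        groups[(title, department)].append(salary)
    return {key: mean(values) for key, values in groups.items()}


def flag_outliers(records, expected, tolerance=0.20):
    """Return IDs whose salary is more than `tolerance` away from expected."""
    flagged = []
    for emp_id, title, department, salary in records:
        predicted = expected[(title, department)]
        if abs(salary - predicted) / predicted > tolerance:
            flagged.append(emp_id)
    return flagged


expected_salary = fit_group_means(employees)
outliers = flag_outliers(employees, expected_salary)
```

With this made-up data, only E4 is flagged. Note that the outlier itself pulls the group average upward, which is one reason a real analysis might prefer a median or a fitted regression model over a simple mean.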

Classification: If instead you ask the computer whether, based on what we know about our historical timecards, overtime, and payroll adjustments, this disbursement is likely to be an improper payment (yes or no), then you are using classification. Classification techniques are helpful for diagnosing problems, automating selections and stratifying populations. Using the improper disbursement example, an auditor could develop a model that scans disbursements and assigns them to multiple risk categories such as High/Medium/Low or Full Review/Sample Test/No Review Required. Another use of classification algorithms is diagnostic analytics. Instead of using the model to predict future values, the target variable flags problems and the model is used to uncover the cause of those problems.
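The improper disbursement example could be sketched as follows. This is only one possible approach: it uses a k-nearest-neighbors classifier, which labels a new item by looking at the most similar items in the labeled history. The data, the two input fields, the function names and the risk thresholds are hypothetical, and the inputs are left unscaled for simplicity.

```python
import math

# Hypothetical labeled history:
# (overtime_hours, payroll_adjustments, was_improper).
history = [
    (2, 0, False),
    (3, 1, False),
    (4, 0, False),
    (5, 1, False),
    (18, 4, True),
    (20, 5, True),
    (22, 6, True),
    (6, 3, True),
]


def improper_share(overtime, adjustments, labeled, k=3):
    """Share of the k most similar past disbursements that were improper."""
    by_distance = sorted(
        labeled,
        key=lambda row: math.dist((overtime, adjustments), (row[0], row[1])),
    )
    nearest = by_distance[:k]
    return sum(1 for row in nearest if row[2]) / k


def risk_category(share):
    """Map the improper share to a review category (illustrative cutoffs)."""
    if share >= 2 / 3:
        return "High"
    if share >= 1 / 3:
        return "Medium"
    return "Low"


low = risk_category(improper_share(3, 0, history))
medium = risk_category(improper_share(6, 2, history))
high = risk_category(improper_share(21, 5, history))
```

The target variable here is was_improper. The three sample disbursements land in the Low, Medium and High categories respectively, which an audit team could map to No Review Required, Sample Test and Full Review.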

Key Concept: Regression algorithms are used when the target variable could be any numerical value. Classification algorithms are used when the target variable is categorical or Boolean (true/false).

Unsupervised Learning

Unsupervised machine learning simply means that no target or goal is given to the machine. This means it has no objective. Unsupervised learning algorithms generally serve one of two purposes: 1) To group things that are similar, or 2) To reduce the number of fields you need to look at while working with your data. These two purposes are commonly called clustering and dimensionality reduction. These techniques, particularly clustering, can be used by auditors in risk assessments, for example to segment populations of entities (such as store locations, vendors, or customers).
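To show what clustering looks like in practice, here is a minimal sketch that segments store locations with k-means, a common clustering algorithm. No target variable is supplied; the algorithm only groups stores that look alike. The store figures, the choice of two groups and the starting points are assumptions made for this example.

```python
import math

# Hypothetical store data: (annual_sales_in_millions, refund_rate_percent).
stores = {
    "S1": (1.0, 2.0),
    "S2": (1.2, 2.5),
    "S3": (0.8, 1.5),
    "S4": (5.0, 9.0),
    "S5": (5.5, 8.0),
    "S6": (4.8, 9.5),
}


def k_means(points, centroids, iterations=10):
    """Group points around the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for name, point in points.items():
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(name)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(
                sum(points[n][d] for n in members) / len(members)
                for d in range(2)
            )
            if members
            else centroid
            for members, centroid in zip(clusters, centroids)
        ]
    return clusters


# Start from two arbitrary stores; real tools choose starting points
# more carefully and usually rescale the fields first.
groups = k_means(stores, centroids=[stores["S1"], stores["S2"]])
```

With this data the stores separate into a low-sales, low-refund group (S1, S2, S3) and a high-sales, high-refund group (S4, S5, S6). Deciding whether either group deserves audit attention is still left to the auditor.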

Key Concept: Supervised learning means an objective is provided to the machine by a human. Unsupervised learning means no such objective is set.

Conclusion

This post helps explain some of the vocabulary associated with machine learning and offers some ideas of where and how to get started. Unsupervised learning is useful in the right situations, but for auditors just beginning to incorporate machine learning into their work, we recommend supervised learning.

Supervised learning is easier to conceptualize, so auditors are quicker to recognize potential use cases. We’ll discuss use cases for both supervised and unsupervised learning in future posts.
