Evaluating the value of a machine learning prospectivity map
Mineral potential maps (or prospectivity maps) have been used in the industry for several decades to help select the most prospective ground for exploration. These maps are the best way to integrate all available data and help decide where to go next. In the last decade, the use of data-driven and machine learning approaches to mineral potential mapping has become mainstream. Companies and consultants (including myself) propose machine learning prospectivity analyses to their clients, and research papers are published every month on the subject.
Open source tools and public data allow everyone to try and test machine learning for mineral potential mapping. This is a great opportunity for all of us to improve current practices and propose new solutions and approaches to the mineral exploration community. However, not everyone has experience with machine learning, and not everyone is aware of the pitfalls of training an algorithm. Machine learning is complex and requires rigorous approaches. Published examples, both from industry and from academia, show workflows that would scare a machine-learning scientist, with results that nonetheless look plausible to a geoscientist.
A specific concern I want to address in this article is models evaluated on training data. A machine learning algorithm learns a model from existing training deposits. We then try to use this model to predict where we are most likely to discover new deposits. It is common practice to estimate the model's performance by looking at how many known deposits have been predicted by the model. This evaluation should NEVER be done on the deposits used to train the model. An algorithm can be extremely good at (re-)discovering its training examples, yet poor at predicting new deposits. This is called model overfitting. As an example, I trained an algorithm on some Australian public data and estimated its performance both on the training deposits and on validation deposits that were not used as input to the algorithm.
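To make the evaluation pattern concrete, here is a minimal sketch assuming a scikit-learn workflow on synthetic data. The features, model, and numbers are illustrative stand-ins, not the setup used for the Australian example:

```python
# Minimal sketch: synthetic "evidence layers" and a random forest stand in
# for a real prospectivity workflow. The key pattern is holding deposits out
# BEFORE training and scoring on them, never only on the training deposits.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 8))  # hypothetical evidence layers, one row per cell
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=2.0, size=2000) > 1.5).astype(int)

# Hold out validation cells before any training happens.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=300, random_state=0)
model.fit(X_train, y_train)

print("accuracy on training data:  ", accuracy_score(y_train, model.predict(X_train)))
print("accuracy on validation data:", accuracy_score(y_val, model.predict(X_val)))
# A large gap between these two numbers is the overfitting described above.
```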
I obtained an accuracy score of 0.96/1 on the training deposits. Amazing! But performance on the validation data was only 0.58/1... Not that good. The figure below illustrates the problem well: all training deposits fall within the 5% most prospective ground, while it takes 20% of the ground to capture just 90% of the validation deposits.
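For anyone who wants to run this kind of check on their own map, here is one way to compute the quantity behind the figure: the fraction of deposits captured within the top x% most prospective ground. The function name and toy inputs below are illustrative, not the exact code used for this analysis:

```python
# Rank all cells by predicted prospectivity, then ask what fraction of known
# deposits falls inside the top x% of ground. Compare this curve for
# training deposits and held-out validation deposits.
import numpy as np

def capture_fraction(scores, is_deposit, top_ground_fraction):
    """Fraction of deposits inside the top `top_ground_fraction` of cells,
    ranked by prospectivity score (highest first)."""
    order = np.argsort(scores)[::-1]
    n_top = int(np.ceil(top_ground_fraction * len(scores)))
    top_cells = order[:n_top]
    return is_deposit[top_cells].sum() / is_deposit.sum()

# Toy example: per-cell model scores and known deposit locations.
scores = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05])
is_deposit = np.array([1, 0, 1, 0, 0, 1, 0, 0])
print(capture_fraction(scores, is_deposit, 0.25))  # deposits in top 25% of ground
```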
Evaluating a model's performance on training data will misguide all subsequent decision-making. The wrong model will be chosen to produce a mineral potential map, and decision-makers will be over-confident in the map's ability to outline prospective ground. Make sure to always estimate model performance on held-out validation data, and if you are the client, always request such a validation. You can even keep a few deposits hidden and ask your consultant to try and predict them once the model is trained. Otherwise, you might end up making the wrong exploration decisions and losing money instead of better focusing your exploration.
Great work Antoine!
Machine learning prospectivity mapping is a tool. It isn't a solution. Like all tools it will have application in certain areas and will fail dismally in others. Its dependence on data and the current paradigms is its biggest downfall. Relying on ML while simultaneously trying to be the first mover into an area will be a difficult fit, as often the first mover is also the first to acquire the data the ML needs to make its assessment; by the time the ML has the data density it needs for meaningful results, it's too late. Also, how many deposits have been found that don't fit the accepted paradigm of the day and that require a whole new round of research to understand how they formed? ML will never predict a new exploration paradigm. So it's a tool, not the answer.
Antoine, great article, keep it up. Another potential trap in applying ML to exploration is that cells or blocks are never truly independent. Depending on how you set up your model, neighboring cells will have almost exactly the same raw data, because you may have interpolated data to fill cells that are incomplete (exploration data is messy and clustered). So if you selected your training set as a "random" selection, you are still learning and predicting from cells that may contain essentially the same raw data; thus your prediction of success will be biased (the smoother your interpolation, the higher your "success"). So many traps; common sense and good geology will always be necessary.
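To see this comment's point in action, here is a minimal sketch, assuming a scikit-learn setup: the features are deliberately unrelated to the labels, but each sparse measurement is "interpolated" into several near-identical cells. A random split rewards memorization of those duplicates; splitting by cluster (e.g., with GroupKFold) exposes it. All names and numbers are illustrative:

```python
# Minimal sketch of spatial leakage: each "measurement" is smeared into a
# cluster of neighboring cells by interpolation, so those cells share
# essentially the same raw data. There is deliberately NO real signal
# between features and labels, so any accuracy above ~0.5 is leakage.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

rng = np.random.default_rng(0)
n_sources, cells_per_source = 200, 8   # 200 sparse measurements, interpolated

source_X = rng.normal(size=(n_sources, 6))     # raw data at measured sites
source_y = rng.integers(0, 2, size=n_sources)  # labels, independent of X

# "Interpolation": neighboring cells copy the nearest measurement with jitter.
X = np.repeat(source_X, cells_per_source, axis=0)
X = X + rng.normal(scale=0.01, size=X.shape)
y = np.repeat(source_y, cells_per_source)
groups = np.repeat(np.arange(n_sources), cells_per_source)  # cluster ids

model = RandomForestClassifier(n_estimators=200, random_state=0)
random_cv = cross_val_score(model, X, y,
                            cv=KFold(n_splits=5, shuffle=True, random_state=0))
grouped_cv = cross_val_score(model, X, y, cv=GroupKFold(n_splits=5),
                             groups=groups)
print("random split accuracy: ", random_cv.mean())   # inflated by leakage
print("grouped split accuracy:", grouped_cv.mean())  # ~0.5: no real signal
```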