FORCE Fault Mapping Competition - exploring the data

Ben Lasscock

Published Sep 4, 2020

This is an update to our earlier LinkedIn article on the FORCE Machine Learning Contest for seismic fault mapping, which runs through 16 October. In the earlier post we loaded the competition's SEG-Y data and made it conveniently available using Zarr. In this update we examine the "character" of the synthetic data compared with the real datasets, and develop ideas for seismic-focused "data augmentation" to help train models.

We've loaded the competition data and are sharing access to it. The provided notebooks give you access to the algorithms and data as-is; you don't need to download anything, just hack away.

The 2020 FORCE Machine Learning Contest provides two synthetic datasets for developing fault classification algorithms, one from Equinor and one from Schlumberger. Synthetics are useful because they provide an unambiguous "ground truth" (see below): by construction, a complex of faults is encoded in the data, so the performance of a predictive algorithm can be measured objectively. Keep in mind, however, that seismic interpretation is subjective work, and the results of the competition will be judged subjectively.
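To make "objectively measured" concrete, here is a minimal sketch of scoring a predicted fault mask against the ground-truth labels. It assumes both are boolean arrays of the same shape; the function name `fault_scores` and the toy 4x4 section are ours for illustration, and these generic metrics are not the competition's scoring, which is subjective.

```python
import numpy as np

def fault_scores(pred, truth):
    """Precision, recall and IoU for two binary fault masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)   # fault samples correctly flagged
    fp = np.count_nonzero(pred & ~truth)  # flagged, but not a fault
    fn = np.count_nonzero(~pred & truth)  # fault samples missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, iou

# Toy 4x4 section: the true fault is a vertical line in column 1.
truth = np.zeros((4, 4), dtype=bool)
truth[:, 1] = True

# A prediction that finds the top half of the fault plus one false alarm.
pred = np.zeros((4, 4), dtype=bool)
pred[:2, 1] = True
pred[3, 3] = True

precision, recall, iou = fault_scores(pred, truth)
print(precision, recall, iou)
```

Faults occupy a small fraction of a seismic volume, so overlap measures like these tend to be more informative than plain accuracy, which a model can inflate by predicting "no fault" everywhere.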

Figure: acoustic impedance of the Equinor synthetic model, with fault labels overlaid.

Our data exploration does two things. First, we load and render the Equinor synthetic data with fault labels overlaid. Then we derive some metrics to quantitatively compare the character of the synthetic data with that of the Ichthys3D data. The ideas we explore may also be useful to anyone who wants to "augment" their training data (see example below) to help statistical models generalize.
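As a rough illustration of what such metrics could look like, the sketch below computes a mean amplitude spectrum, a simple bandwidth estimate, and the excess kurtosis of the amplitudes. These are our assumptions about reasonable measures, not necessarily the ones used in the notebooks; the function names, the 4 ms sample interval, and the toy inputs are all hypothetical. Traces are assumed to be arrays with time on the last axis.

```python
import numpy as np

def mean_amplitude_spectrum(traces, dt):
    """Average amplitude spectrum over all traces; the last axis is time."""
    traces = np.asarray(traces, dtype=float)
    n = traces.shape[-1]
    taper = np.hanning(n)  # taper the ends to reduce spectral leakage
    spec = np.abs(np.fft.rfft(traces * taper, axis=-1))
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, spec.reshape(-1, spec.shape[-1]).mean(axis=0)

def bandwidth(freqs, amp, level=0.5):
    """Lowest and highest frequency where amp is at least level * peak."""
    keep = freqs[amp >= level * amp.max()]
    return keep.min(), keep.max()

def excess_kurtosis(x):
    """Excess kurtosis of the amplitudes; roughly 0 for Gaussian values."""
    x = np.asarray(x, dtype=float).ravel()
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

# Toy check of the spectrum: a single 30 Hz sine sampled at 4 ms for 1 s.
dt = 0.004
t = np.arange(250) * dt
freqs, amp = mean_amplitude_spectrum(np.sin(2 * np.pi * 30 * t)[None, :], dt)
peak_hz = freqs[np.argmax(amp)]
f_lo, f_hi = bandwidth(freqs, amp)

# Toy check of the kurtosis: a sparse spike series is far heavier-tailed
# than dense Gaussian noise.
rng = np.random.default_rng(0)
dense = rng.normal(size=10_000)
sparse = np.where(rng.random(10_000) < 0.05, rng.normal(size=10_000), 0.0)
k_dense, k_sparse = excess_kurtosis(dense), excess_kurtosis(sparse)
```

Running the same functions on a block of synthetic traces and a block of Ichthys3D traces would give numbers that can be compared side by side. Note that the kurtosis here describes the seismic amplitudes, which is only a proxy for the underlying reflectivity distribution.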

Figure: a section of the original synthetic, a "spike convolutional" equivalent, and a new higher-bandwidth version, compared with the Ichthys3D data.
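One common way to build a "spike convolutional" trace is the convolutional model: a sparse reflectivity series convolved with a wavelet. The sketch below assumes a Ricker wavelet and random Gaussian spikes; the function names, the spike density, the 4 ms sample interval, and the 15 Hz and 40 Hz peak frequencies are illustrative choices of ours, not values taken from the competition data or the notebooks.

```python
import numpy as np

def ricker(peak_hz, dt, half_length=0.1):
    """Ricker wavelet with the given peak frequency, sampled every dt seconds."""
    n = int(round(half_length / dt))
    t = np.arange(-n, n + 1) * dt
    a = (np.pi * peak_hz * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def sparse_reflectivity(n_samples, density=0.05, seed=0):
    """Random spike series: mostly zeros, with a few Gaussian-amplitude spikes."""
    rng = np.random.default_rng(seed)
    spikes = rng.random(n_samples) < density
    return np.where(spikes, rng.normal(size=n_samples), 0.0)

def convolve_trace(reflectivity, wavelet):
    """Convolutional model: reflectivity convolved with the wavelet, same length."""
    return np.convolve(reflectivity, wavelet, mode="same")

dt = 0.004                                  # assumed 4 ms sampling
r = sparse_reflectivity(500)                # one shared reflectivity series
low = convolve_trace(r, ricker(15.0, dt))   # narrower-band version
high = convolve_trace(r, ricker(40.0, dt))  # higher-bandwidth version
```

Because both traces share the same reflectivity, the only difference between them is the wavelet, which makes this a simple knob for augmentation: varying the peak frequency changes the bandwidth of the training examples without moving the reflectors.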

Conclusions

We explored the synthetic data and compared it with the real data. The good news is that the data is loaded and ready to use through our Zarr repos, which you can access. However, at least qualitatively, the synthetics and the real data seem to be very different in terms of bandwidth, seismic wavelet, and the distribution of reflectivity. Whether that will prove problematic for creating generalizable fault models remains to be seen; the ideas we present for augmenting the synthetics will hopefully prove helpful.

Next time, Mason Dykstra is going to level the playing field a little by providing some expertly labeled faults on Ichthys3D, and I will demonstrate a simple fault classification model.
