FORCE Fault Mapping Competition - exploring the data
Examining the power-law distribution of the reflector amplitude; synthetic versus real data.

Get the code! (MIT License)

This is an update to our earlier LinkedIn article on the FORCE Machine Learning Contest for seismic fault mapping, which runs through 16 October. In the earlier post we loaded the competition's SEG-Y data and made it conveniently available using Zarr. In this update we examine the "character" of the synthetic data compared with the real datasets, and develop ideas for seismic-focused "data augmentation" to help train models.

We have loaded the competition data and are sharing access to it. The provided notebooks let you use the algorithms and data as-is; you don't need to download anything, just hack away.

The 2020 FORCE Machine Learning Contest provides two synthetic datasets for developing fault classification algorithms, one from Equinor and one from Schlumberger. Synthetics are useful because they provide an unambiguous "ground truth" (see below): by construction, a complex of faults is encoded in the data, so the performance of a predictive algorithm can be measured objectively. Keep in mind, however, that seismic interpretation is subjective, and the results of the competition will be judged subjectively.
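Because the synthetic labels are known by construction, a predicted fault mask can be scored voxel by voxel. The following is a minimal sketch of one way to do that, assuming predictions and labels are boolean arrays of the same shape. The function name `fault_scores` and the choice of precision, recall and intersection-over-union are our illustration, not the competition's scoring rule.

```python
import numpy as np


def fault_scores(predicted, truth):
    """Precision, recall and IoU for two boolean fault masks of equal shape."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if predicted.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    tp = np.count_nonzero(predicted & truth)    # fault predicted, fault present
    fp = np.count_nonzero(predicted & ~truth)   # fault predicted, none present
    fn = np.count_nonzero(~predicted & truth)   # fault missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "iou": iou}


# Toy example (made-up masks, not competition data).
truth = np.zeros((4, 4), dtype=bool)
truth[:, 1] = True            # a vertical "fault" down column 1

predicted = np.zeros((4, 4), dtype=bool)
predicted[:2, 1] = True       # top half of the fault is found
predicted[0, 3] = True        # one false alarm

scores = fault_scores(predicted, truth)
print(scores)
```

Faults occupy a small fraction of any volume, so an overlap score such as IoU tends to be more informative than plain accuracy, which a model could inflate by predicting "no fault" everywhere.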

Figure: acoustic impedance of the Equinor synthetic model, with fault labels overlaid.

Our data exploration does two things. First, we load and render the Equinor synthetic data with fault labels overlaid. Then we derive some metrics to quantitatively compare the character of the synthetic data with that of the real Ichthys3D data. The ideas we explore may also be useful to anyone who wants to "augment" their training data (see example below) to help statistical models generalize.
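One candidate metric for the "character" of a dataset is how heavy the tail of its amplitude distribution is, in the spirit of the power-law comparison mentioned in the subtitle. The sketch below uses the Hill estimator, a standard tail-index estimate. It is our choice for illustration and not necessarily the metric used in the notebooks. The function name `hill_tail_index`, the `tail_fraction` parameter and the random test samples are all assumptions; the samples stand in for reflectivity values and are not competition data.

```python
import numpy as np


def hill_tail_index(samples, tail_fraction=0.05):
    """Hill estimate of alpha, assuming P(|x| > t) ~ t**(-alpha) in the tail.

    Only the largest `tail_fraction` of the absolute values is used.
    """
    magnitudes = np.abs(np.asarray(samples, dtype=float)).ravel()
    magnitudes = np.sort(magnitudes[magnitudes > 0])
    if magnitudes.size < 3:
        raise ValueError("need at least three non-zero samples")

    # Number of tail samples, kept within a usable range.
    k = int(tail_fraction * magnitudes.size)
    k = min(max(k, 2), magnitudes.size - 1)

    threshold = magnitudes[-k - 1]   # largest value just below the tail
    tail = magnitudes[-k:]           # the k largest values
    return k / np.sum(np.log(tail / threshold))


# Stand-in samples: a true power law (alpha = 3) and a Gaussian.
rng = np.random.default_rng(0)
heavy = rng.pareto(3.0, size=200_000) + 1.0   # classical Pareto, x_min = 1
light = rng.standard_normal(200_000)

alpha_heavy = hill_tail_index(heavy)   # should land near 3
alpha_light = hill_tail_index(light)   # thinner tail, so a larger estimate
print(alpha_heavy, alpha_light)
```

A smaller estimate means a heavier tail, that is, relatively more strong reflectors. Running the same function on the synthetic and the real amplitudes gives one number per dataset to compare, though the result depends on the chosen `tail_fraction`.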

Figure: a section of the original synthetic, a "spike convolutional" equivalent, and a new higher-bandwidth version, compared with the Ichthys3D data.
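A "spike convolutional" section of the kind shown in the figure can be sketched as follows: turn acoustic impedance into normal-incidence reflection coefficients, then convolve each trace with a wavelet. Raising the wavelet's peak frequency gives a higher-bandwidth version. The article does not say which wavelet was used, so the Ricker wavelet, the function names, the 4 ms sample interval and the random impedance model below are all assumptions made for illustration.

```python
import numpy as np


def ricker(peak_hz, dt, length=0.128):
    """Zero-phase Ricker wavelet sampled every `dt` seconds."""
    half = int(round(length / dt)) // 2
    t = np.arange(-half, half + 1) * dt
    arg = (np.pi * peak_hz * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)


def reflectivity(impedance):
    """Normal-incidence reflection coefficients along the last (time) axis."""
    z = np.asarray(impedance, dtype=float)
    upper, lower = z[..., :-1], z[..., 1:]
    return (lower - upper) / (lower + upper)


def spike_convolve(impedance, peak_hz, dt):
    """Convolve each trace's reflectivity spikes with a Ricker wavelet.

    Traces must be longer than the wavelet for mode="same" to keep their length.
    """
    spikes = reflectivity(impedance)
    wavelet = ricker(peak_hz, dt)
    return np.apply_along_axis(
        lambda trace: np.convolve(trace, wavelet, mode="same"), -1, spikes
    )


def mean_frequency(trace, dt):
    """Amplitude-weighted mean frequency of one trace, in Hz."""
    spectrum = np.abs(np.fft.rfft(trace))
    freqs = np.fft.rfftfreq(len(trace), d=dt)
    return np.sum(freqs * spectrum) / np.sum(spectrum)


# Made-up layered impedance model: 8 traces, 501 samples, always positive.
dt = 0.004
rng = np.random.default_rng(1)
impedance = 3000.0 * np.exp(np.cumsum(0.02 * rng.standard_normal((8, 501)), axis=-1))

narrow = spike_convolve(impedance, peak_hz=25.0, dt=dt)   # lower bandwidth
broad = spike_convolve(impedance, peak_hz=60.0, dt=dt)    # higher bandwidth
print(mean_frequency(narrow[0], dt), mean_frequency(broad[0], dt))
```

Varying `peak_hz` over a range of values is one simple way to generate augmented copies of a synthetic whose bandwidth sits closer to that of the real data.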

Conclusions

We explored the synthetic data and compared it with the real data. The good news is that the data is loaded and ready to use through our Zarr repos, which you can access. However, at least qualitatively, the synthetics and the real data seem to differ substantially in bandwidth, seismic wavelet, and the distribution of the reflectivity. Whether that will prove problematic for creating generalizable fault models remains to be seen; we hope the ideas we present for augmenting the synthetics will prove helpful.

Next time, Mason Dykstra is going to level the playing field a little by providing some expertly labeled faults on Ichthys3D, and I will demonstrate a simple fault classification model.

More articles by Ben Lasscock