Geoscience and Machine Learning – EAGE 2017 Workshop
Image: Machine learning takeaway word cloud on Mentimeter, by Matt Hall from Agile* (CC BY 4.0)

EAGE 2017 has started, and I took part in a Monday workshop. These are my main takeaways. This article originally appeared on "The Way of the Geophysicist" here: http://the-geophysicist.com/geoscience-machine-learning-eage-2017-workshop

Is it data science? Machine learning? Big Data? Deep learning? Fancy math? Or just statistics dressed up with enough data?

This Monday marked the start of EAGE 2017 in Paris for me. If you have read this blog or the EAGE Student Newsletter before, you may have seen that I am something of a proponent of machine learning. It was therefore natural for me to go to the workshop named "Geoscience and Data Sciences", which Matt Hall from Agile* usually called "the machine learning workshop".

There were two keynotes, several good talks and two poster sessions, where some good discussion was had. Although the EAGE has eased its PowerPoint-only constraint a little, so it was possible to show some websites in Chrome, the format still seemed fairly limited in scope. But, listening closely, I can give you my main takeaways:

The hackathon from the weekend before was mentioned several times. It seems to be catching on with the crowd, although an ML hackathon and an ML workshop naturally attract overlapping audiences. I will blog about the amazing hackathon later.

Personally, I was amazed that Shell let on that they are using scikit-learn, a popular machine learning library in Python. Total even hinted at contributing to open source projects. Every talk made clear, however, that open data will not be happening. Possibly ever.
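No code was shown on stage, so to give you a flavour of why scikit-learn keeps coming up in these talks, here is a minimal sketch of my own. The data is entirely synthetic, and the stand-in feature names are my invention; a real application might use well-log curves to predict lithofacies.

```python
# A minimal scikit-learn workflow on synthetic data (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))                   # stand-ins for e.g. GR, RHOB, NPHI, DT
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic two-class labels

# Hold out a test set, fit a classifier, and score it in a few lines.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(round(clf.score(X_test, y_test), 2))
```

The appeal is exactly this brevity: swapping the random forest for another estimator is a one-line change, which is why the library lowers the barrier to experimenting with ML on geoscience data.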

Teradata talked about data lakes, making clear that something has to happen in companies to change their relationship with data. SEG-Y and LAS are great transfer formats but sub-par to terrible to work on. More sophisticated formats like HDF5 are a good way to move your seismic data in a better direction; Globe Claritas is using this standard, which was developed for storing data for astronomical research.
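To make the contrast concrete, here is a sketch of what storing a seismic cube in HDF5 looks like with h5py. The dataset name, chunking, and attributes are my own illustration, not Globe Claritas's actual schema; the point is that slices come back without parsing trace headers.

```python
# Sketch: a seismic volume in HDF5 via h5py (layout is illustrative only).
import numpy as np
import h5py

volume = np.random.rand(10, 50, 100).astype("float32")  # inlines x xlines x samples

with h5py.File("survey.h5", "w") as f:
    dset = f.create_dataset("seismic/amplitude", data=volume,
                            chunks=(1, 50, 100), compression="gzip")
    dset.attrs["sample_rate_ms"] = 4.0  # metadata lives next to the data

# Unlike SEG-Y, a slice reads back directly, no trace-header bookkeeping:
with h5py.File("survey.h5", "r") as f:
    inline = f["seismic/amplitude"][3]  # one inline, read lazily
    print(inline.shape)                 # (50, 100)
```

Chunked storage also means you only pull the bytes you touch, which matters once volumes stop fitting in memory.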

One very good talk was about unstructured data. I had heard parts of the talk at the hackathon before, where the team won the award for "Originality". Unstructured data is somewhat feared in data science. It's much nicer when you can query a database and get your information; having to sift through badly scanned PDF files is a whole other dimension of complexity. However, the presenter showed some great results using object recognition and segmentation from computer vision to analyze and classify legacy documents with ML. Interestingly, they even built QC directly into the workflow.

My personal favorite was Matt Hall's keynote. He talked about actual machine learning and how our industry has to adapt to the fast pace of open source development in the deep learning community. He freshened things up by using menti.com for audience participation and asked some tough questions.

Generally, I would love to see some deep learning in the next workshop and more bleeding edge research on this. Copenhagen is the next chance for this. I hope I can put my writing where my mouth is and maybe you can join me!

Please subscribe here to follow any updates: http://the-geophysicist.com/subscribe


Nice write-up, Jesper! One of my key learnings was that PowerPoint is the wrong tool for presenting about data science. Next year I hope it will be in the minority for that task, now that the EAGE has loosened its constraints on how a presentation can be delivered. Being able to see live code and interactive visualizations is a very powerful mode of communication.

The name of the game is maximising economic recovery; read about it here: https://www.ogauthority.co.uk/media/3229/mer-uk-strategy.pdf. It's a legal obligation.

Jesper, thank you for sharing this, a very useful write-up. I was intrigued by your comment that "SEG-Y is a great transfer data type but sub-par to terrible to work on." We recently used a cloud-based seismic data lake to collaborate with a colleague whilst they were in the air, flying to their next meeting. It would be great to understand what is perceived as a blocker. Could you elaborate on which workflows people were finding challenging?

Daniel and Simon, I think I was unclear on that point. My bad. The way I understood the discussion, companies are quite protective of their data and would prefer not to share that wealth of information. I do see that more and more data is being made available, and the role of national data repositories is becoming more and more important. I'm looking forward to the first time-lapse seismic dataset with a permissive license, though.

Jesper, Dan is right. Open data is, where legally permissible, being promoted by the Oil and Gas Authority. There is interest from other regulators too. As basins become more mature, this becomes even more important. The right assets in the right hands. The right data in the right hands. This means data democratisation.
