Question your data before querying it
I discovered a new concept today: survivorship bias. I was reading an article posted on a Facebook group about model airplanes (showed up in my feed…not sure how well the recommender AI was working there!).
The article explained how engineers in WWII were trying to figure out how the armor on their airplanes could be improved. They gathered data from aircraft that survived combat missions, mapping the areas on the fuselage and wings that had taken the most damage, and posited that those were the areas needing better protection. Long story short, they eventually discovered that they were not looking at the right data. The most relevant data was not from the planes that had survived, but from the planes that had not! Better armor was required in the areas hit on the planes that went down, not on those that made it back. Since the downed planes could not be inspected, those areas had to be inferred from where the returning planes showed little or no damage.
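To convince myself of the effect, I put together a toy calculation. It is only a sketch under made-up assumptions: every plane takes a single hit, the hit is equally likely to land in any section, and the loss probabilities per section are hypothetical numbers I picked for illustration, not historical data.

```python
# Hypothetical numbers, for illustration only (not historical data).
# Assumption: each plane takes one hit, equally likely in any section;
# the value is the probability that a hit there brings the plane down.
loss_probability = {"engine": 0.8, "cockpit": 0.6, "fuselage": 0.1, "wings": 0.1}


def observed_hit_share(loss_probability):
    """Share of hits per section as seen on planes that made it back."""
    # With hits spread evenly, the share seen on survivors is
    # proportional to the chance of surviving a hit in that section.
    survival = {section: 1 - p for section, p in loss_probability.items()}
    total = sum(survival.values())
    return {section: s / total for section, s in survival.items()}


shares = observed_hit_share(loss_probability)
for section, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{section:8s} {share:.0%} of hits seen on returning planes")
```

In this toy setup the most dangerous section shows up least often in the survivor data, which is exactly the trap: the damage you can count is concentrated where a hit matters least.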
Coincidentally, I was also deep into chapter three of The Digital Mindset by Paul Leonardi and Tsedal Neeley, a must-read that was suggested by Hassan el Bouhali in one of his posts. Data gains power as we move from mere description (classification) to prediction (e.g. machine learning) and finally to prescription (strategy). If you are not mindful of survivorship bias, you risk starting off with the wrong data, and the error compounds at each step, leading to far worse strategies.
I brainstormed with myself (there should be a term for that) and went looking for situations where survivorship bias might come into play within my enterprise:
· Quality issues (post-delivery): Should we include data from customer complaints only, or also include data from products still on the shelf?
· Innovation: Should we focus only on our highest margin products for innovative ideas, or include our entire product line? How about revisiting our retired products?
· Service levels: Should we survey only employees who used our services, or should we also include those who did not?
The takeaway here is that you should not just settle for the data you are given to work with. You must think critically: question the source, understand the context, and create more data if needed, so that you avoid survivorship bias and do not take off in the wrong direction.
Do you have any examples of survivorship bias that you have come across in your work? Do you have any tips and tricks for avoiding it? What kinds of questions should we ask?
Cover image photo by Denys Nevozhai on Unsplash