“So what?” : The Last Mile Problem in Data Science

The last mile problem is a common phenomenon in telecommunications, internet, and the electric power industry, and can even be generalized to petroleum, natural gas, or other industries. In essence, the problem refers to the relative difficulty of the very last step in delivering the product to the customer. In these industries, efficiencies can be created by concentrating production in a small number of large-scale activities (think a single power plant that can provide electricity for hundreds of thousands of homes) and by restricting interactions to a relatively small number of technical experts (think transmission line engineers talking to substation engineers about transmission-to-substation connectivity).

Once these problems have been taken care of, however, we are left with the problem of delivering the final product to a large, heterogeneous consumer base that just wants its energy (or internet) - they want it cheap and they want it now. The 5 years in permitting and regulatory negotiations, the 4 years in construction, the countless hours negotiating and ultimately settling with environmentalists and myriad other lobby groups, the cost overruns, changing interest rates, the contractor and subcontractor disputes - all meaningless. “So what? Where’s my damn electricity (or internet or water or cell service)?”

In data science - in particular, analysis for humans, though I’m sure there are similar problems associated with delivering analysis for machines - there is a similar last mile problem. We begin by spending weeks identifying a promising business question to attack, then spend weeks wading through terabyte upon terabyte of raw data, followed by additional weeks analyzing the results, and then iterating. If we’re (really) lucky, by this point we’ve got some solid results and have developed a really solid intuition for the problem. Now, all that’s left is to summarize the results in a deck and that’s it. Right? Well, sort of.

The hundreds of hours spent generating and distilling your results certainly matter. After all, we want our conclusions to be solid - and reproducible. But when it comes to communicating the results to the final customer - product manager, engineer, VP, or CXO - they don’t want to hear about those hours. This last 1% of work - the “so what?” - is 99% of what they care about. For busy people with business problems to solve, hearing about MapReduce mayhem, messy data, heteroskedasticity boogeymen, and incomplete information is a very poor way to spend 30 minutes.

Now, all of this seems perfectly intuitive. “Of course the Chief Revenue Officer only cares about how this proposal will boost revenue in Q4!” Unfortunately, it’s a trap we all fall into - over and over again. There are a number of reasons for this, not least among them being that we as data scientists are really excited about technology and algorithms and data and love to geek out about all of them. On that count, the solution seems to be no more than a healthy dose of self-restraint. But there is another reason for this, a reason which I believe represents an important and really tricky trade-off for anybody in R&D: How do we not only demonstrate value, but also market ourselves to justify the expenditures on our all-too-hidden efforts?

In general, I’ve found that data scientists are terrible marketers. Personally, I’ve always intuitively believed in what I call a “res ipsa loquitur” (Latin for “the thing speaks for itself”) approach to analysis. Unfortunately, this is a naive and dangerous mindset against which I’m constantly battling. And when we do attempt to market, too often we find ourselves verbosely recapitulating the problem rather than succinctly summarizing the solution.

The best solution I’ve heard for this to date is one I picked up recently at a LinkedIn training seminar: the elevator pitch. The technique, as the name implies, is to have a description of the product so succinct that you could describe it to your CEO in the elevator (in a few floors, of course). Even better, we should always frame the pitch around the following formula: “For <target audience>, I will provide <product> that will deliver <value proposition>.” Short and tight, you can nail it in 2 floors or less.

What does this mean for your deck? Less is more - in words and in slides. Since what we do tends to be very data heavy, we can and should have additional slides - I’ve found that a 10:1 ratio of “extra/pocket” slides to main-deck slides is absolutely reasonable. Keep the main deck as tight and economical as possible so you can deliver what is absolutely essential, and then use the appendix to answer all of the questions you hope they’ll ask as they try to poke holes in your masterpiece. I’m still coming to terms with this myself, but I have found that your presentation is successful precisely when they do engage and ask questions. It is a sign of failure, not of success, when your audience has nothing to ask. Brevity, not verbosity, covers the last mile.

If you’ve any comments or tips on how to make better data science presentations - or how to market data science projects in general - please share!

Very well written! Asking and answering the hard question of "so what" would guide us to better analysis, greater impact and a stronger brand.

I love the last mile comparison! Great visualization.
