Data Scientists, they make clothes for Emperors right?
OK, so that's a bit mean but hear me out...
I've just spent three days in London at the Gartner Data & Analytics Summit. It's been a real mix of sessions and workshops. As tends to be the case with these things, you see a fair amount of banner waving, buzzwords and predictions of the next big thing. Two years ago "big data" was BIG, and everyone was talking about it but no one was sure who was actually doing it (cue questionable jokes etc). This year it's all about deep learning, predictive analytics and the Internet of Things. You have to have a "data lake", although some would prefer a data reservoir - either way it's gotta be big and it's got to be really fast.
Analytics and deep learning are said to be very important, but tangible solutions to problems demonstrating real ROI were thin on the ground. Maybe next year real examples and case studies will be in much greater supply. Right now there are a lot of ideas and good intentions, but I couldn't help getting the impression it's still a bit of a solution looking for a problem.
What else? Well, apparently CIOs are passé. What you need now is a CDO (not to be confused with the CDO of "The Big Short"), which turns out to be a "Chief Data Officer".
One of the best sessions on the 3rd day of the summit was "How a Chief Data Officer can lead high performance teams". The session was good - not because it really answered the title - but because it was a great discussion about the role of CDOs and "data scientists".
Now I know I'm not always the fastest to pick things up, but I hadn't heard of a data scientist until 18 months ago, when I attended a breakfast organized by BathSpark. Not one to miss an opportunity to put my foot in it, I had to ask "What's a data scientist? How are they different from, say, a mathematician, a statistician or a computer scientist?" Refreshingly, I got quite an honest response: "Er, er, they're basically the same thing... I think".
So now, 18 months on, we must have a better idea about what a data scientist is? So I asked the same question again in the round table discussion. We're all supposed to have one, but what are they? What do they do? And are they different from mathematicians, statisticians and computer science graduates?
Well I'm glad I asked. It turns out that it is still not a daft question. And whilst Gartner have come up with a definition (of course that's their job), the rest of the room didn’t seem so sure.
So I thought I’d try one more time. Does anyone know what a data scientist is?
Please comment and let me know what you think. And while you're at it, if you have any idea what a CDO does, I'd love to know.
Personally, I think it's essentially down to the art of making use of the information. There are usually two key aspects to analytics - having the data in the first place, and then using it. Despite that seeming like a simple concept, there's always an underestimated step involved in the quality of that process. One of the first terms I heard in IT, way back in the mid 90's, was GIGO (Garbage In, Garbage Out). That is, if you are harvesting bad data, you're running the risk of producing bad analytics out of it. Will a statistician be aware, a computer scientist care, or a mathematician notice?

With that in mind, a Data Scientist should have responsibility for the whole process, perhaps summarised by their responsibilities:

- Collect: Know what data is and isn't required.
- Validate: Identify and filter out bad data.
- Translate: Calculate, Condense and Concatenate where appropriate and required.
- Organise: Present the cleansed Data in an effective and useful manner.

Is a Data Scientist a role where data integrity intersects data utilisation? Perhaps the question shouldn't be "What is a Data Scientist", but rather "How else can you ensure robust, functional and accurate data without a clearly defined role and process?".
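
For anyone who prefers to see it rather than read it, the four responsibilities above could be sketched roughly as follows. This is only a minimal illustration in Python: the sample records, the field names (`region`, `sales`, `notes`) and the function names are all invented for the example, and a real pipeline would need far more careful validation rules than "is it a number?".

```python
from statistics import mean

# Hypothetical raw records; field names and values are invented for illustration.
RAW = [
    {"region": "north", "sales": "120.5", "notes": "ok"},
    {"region": "north", "sales": "n/a", "notes": "bad reading"},
    {"region": "south", "sales": "80", "notes": ""},
    {"region": "", "sales": "15", "notes": "missing region"},
    {"region": "south", "sales": "100", "notes": ""},
]

# Assumption: only these two fields are needed for the analysis.
REQUIRED = ("region", "sales")


def collect(records):
    """Collect: keep only the fields that are actually required."""
    return [{key: rec.get(key, "") for key in REQUIRED} for rec in records]


def is_valid(record):
    """A record is usable if it names a region and its sales value parses as a number."""
    if not record["region"]:
        return False
    try:
        float(record["sales"])
    except ValueError:
        return False
    return True


def validate(records):
    """Validate: filter out bad data (the 'garbage in' of GIGO)."""
    return [rec for rec in records if is_valid(rec)]


def translate(records):
    """Translate: condense the clean records into average sales per region."""
    by_region = {}
    for rec in records:
        by_region.setdefault(rec["region"], []).append(float(rec["sales"]))
    return {region: mean(values) for region, values in by_region.items()}


def organise(summary):
    """Organise: present the result as sorted, readable lines."""
    return [f"{region}: {avg:.2f}" for region, avg in sorted(summary.items())]


def run_pipeline(records):
    """Chain the four steps in the order listed above."""
    return organise(translate(validate(collect(records))))


if __name__ == "__main__":
    for line in run_pipeline(RAW):
        print(line)
```

The point of the sketch is the ordering rather than the detail: skip the validate step and the "n/a" reading stops the averaging dead, while the record with no region quietly ends up as a nameless group in the summary, which is GIGO in miniature.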