Data Scientists Assemble

With the release of the new Captain America: Civil War film, a regular debate has reared its head in my team: “who’s your favorite Avenger?*” I know, I know – we really smash that geeky stereotype. This generally leads to hours of debate as to why Iron Man would beat Thor but not The Hulk.

There seems to be a similar underlying appetite in the Data Science community to pit ‘thing A’ against ‘thing B’ in a fight to the death, much like a Marvel Battle Royale. Lines are drawn, arguments and counter-arguments are marshalled, and then battle commences until there is only one side standing, breathing heavily in the dirt over the vanquished foe… until next time.

An example of this can be found in the analytic software a Data Scientist chooses to analyze data. One of the first big debates was open source vs proprietary software. The proponents of open source champion its flexibility, low cost and ease of use. ‘Ah’, the proprietary champions cry, ‘but what about the lack of support and documentation, or the scalability issues?’ Assuming you go for open source tools, further civil war breaks out: Python vs R vs Vowpal Wabbit vs… This has now reached the next level in my team with the rise of two factions within the R language: those who prefer data manipulation using the dplyr package versus those who espouse the value of data.table.

Stop it!

My frustration with all of this is that we run the risk of missing the point – we’re doing this analysis for a reason. As Data Scientists we need to be relentlessly focused on solving the business problem, drawing on our wide knowledge of different tools and techniques and applying the right tool to the right job. It is definitely important to keep trying new approaches and innovating, but not to the detriment of what makes the job family so precious – going from data to insight.

 * The answer by the way is Hawkeye ;-)

I think the worst internal war in the data science community is this idea of a "fake data scientist", which is usually applied to those with a different background than the accuser. There aren't any "fake biologists" or "fake electrical engineers" so why has "fake data scientist" become such a popular phrase? This does no good except provide fodder to those who claim that data science is nothing more than BI or stats dressed up in fancy new clothes.

Great article, Dan! (and you meant Ant Man, right??)

I agree, Dan. Lots of technological arm wrestling going on at the moment. Better bin my planned next article on The Silver Surfer (Flink) vs The Flash (Spark)...

I love the artwork...do you have similar art for all the Avengers, and can I get that framed? ;-)
