Data science and deployment

Sujith S.

Published Feb 24, 2019

Back when I was a mainframe programmer, I always knew how the end user would see the product I was coding. Over the years, programmers have mostly known and understood how their effort would translate into a product and how that product would end up in the hands of the end user.

Enter #DataScience. The current crop of data scientists talks mostly of Jupyter notebooks and Python in an IDE, and creates models that start on a laptop and stay there. These models generate results, which are then packed off to the requesting clients or customers. How long will this continue?

Can we call data scientists programmers? After all, all that the customer needs is a working product, and the definition of a product in the software industry has long been a working piece of software that provides results to help the business flourish. Hence, I do think that data scientists are programmers in a different dimension. That dimension is quickly maturing, with its own set of processes starting to set in. Data gathering must follow a strict process (GDPR and similar regulations); models must not carry an inherent bias picked up from the underlying data (for example, a loan default prediction program that mostly predicts a default for one particular precinct); and results must be reproducible in the hands of the customer.
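To make the bias point concrete, here is a minimal sketch of the kind of check I have in mind: compare how often a model predicts a default for each precinct. The function name, the precinct labels and the predictions are all hypothetical toy values, not results from any real model, and a large gap is only a prompt to look closer, not proof of bias.

```python
from collections import defaultdict


def predicted_default_rate_by_group(groups, predictions):
    """Return the share of records predicted as default (1) for each group."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        defaults[group] += pred
    return {group: defaults[group] / totals[group] for group in totals}


# Hypothetical toy data: precinct labels and a model's 0/1 default predictions.
precincts = ["A", "A", "A", "A", "B", "B", "B", "B"]
predictions = [1, 1, 1, 0, 0, 0, 1, 0]

rates = predicted_default_rate_by_group(precincts, predictions)
# Difference between the most and least frequently flagged precincts.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A check like this is cheap enough to run every time a model is retrained, which is the sort of process discipline programmers already take for granted.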

Finally, the critical step in making that software reach the user is deployment. How long can data scientists remain aloof from the deployment conditions they must adhere to? We must start to emphasize and inculcate these in the recruitment processes of every organization that makes data-driven decisions. Granted, AWS and Azure take the headache out of creating services for your models, which you can then embed in products. But we as programmers must surely know this in more depth, and care to uncover that last layer that converts our models into a usable product.
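As a rough illustration of that last layer, here is a minimal sketch of a model wrapped as an HTTP service, using only the Python standard library. Everything in it is an assumption made for illustration: `predict` is a stand-in with hand-set weights rather than a trained model, and the feature names, `handle_request_body` and `make_server` are hypothetical. A managed service on AWS or Azure hides most of this plumbing, but the shape is similar: parse a request, call the model, return a response.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Stand-in for a trained model: a hypothetical hand-set linear score."""
    weights = {"debt_ratio": 2.0, "late_payments": 0.5}
    score = sum(weights[name] * features[name] for name in weights)
    return {"default_risk_score": score}


def handle_request_body(body):
    """Turn a raw JSON request body into (status_code, response_dict)."""
    try:
        features = json.loads(body)
        return 200, predict(features)
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON, missing features, or wrong value types.
        return 400, {"error": f"bad request: {exc!r}"}


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST requests with the model's prediction as JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request_body(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(host="127.0.0.1", port=8000):
    """Build the server; call .serve_forever() on the result to start it."""
    return HTTPServer((host, port), PredictHandler)
```

Calling `make_server().serve_forever()` starts the service locally, after which a product can POST a JSON body such as `{"debt_ratio": 0.25, "late_payments": 2}` to it. Keeping the request handling in a plain function, separate from the HTTP handler, means the same logic can be tested on a laptop and then moved behind whatever hosting the organization uses.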

The time to think of data science as a standalone cog is over. It must integrate with the processes that come with a product-driven software services mentality.

#MLdeployment #Datascience #Datascienceproducts
