Python - The New Found Love

The two data analysis tools I use most extensively are SAS and Python (alongside R and SPSS). I come largely from the banking domain, where SAS is still the software of choice because most banks have their core systems built on it.

Over the last two years, working with telecom clients, I have explored Python a lot, and it is now my go-to tool for any client work. Of course, the fact that Python is free while SAS carries a huge license cost makes Python more popular in industries such as retail, telecom, e-commerce and healthcare.

SAS is closed-source software, so its functionality is not transparent. In contrast, Python is fully open source, which lets us read the code and understand what goes on behind each function.

SAS is the clear preferred tool for handling large datasets with moderate computing resources. Since R and Python hold the data in RAM, they were not able to analyze the full mortality data. SAS instead uses hard drive space to hold and process the data. This is generally slower, because a hard drive has slower access speeds than RAM, but it means the data does not have to fit in memory.

For data analysis tasks such as creating additional variables or dummy variables, aggregating data, and joining or merging tables, you need to write many more lines of code in SAS than in Python, because Python has built-in or library functionality for almost everything. The amount of code executed in the background, and the time required, may be very similar, but Python can finish a data analysis step in fewer lines of code.
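To make the point about brevity concrete, here is a minimal sketch of those steps (a derived variable, an aggregation, a join and dummy variables) using the pandas library. The tables, column names and the threshold of 100 are all invented for illustration and are not taken from any client data.

```python
import pandas as pd

# Hypothetical customer and transaction tables, invented for illustration.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "telecom", "retail"],
})
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3],
    "amount": [100.0, 50.0, 200.0, 30.0, 70.0],
})

# Derived variable: flag transactions of 100 or more (arbitrary threshold).
transactions["is_large"] = transactions["amount"] >= 100

# Aggregation: total spend and number of transactions per customer.
spend = (
    transactions.groupby("customer_id", as_index=False)
    .agg(total_amount=("amount", "sum"), n_transactions=("amount", "count"))
)

# Join: attach the aggregates to the customer table.
merged = customers.merge(spend, on="customer_id", how="left")

# Dummy variables: one 0/1 column per segment value.
result = pd.get_dummies(merged, columns=["segment"], dtype=int)
print(result)
```

Each step is a single statement here. The equivalent SAS program would typically need separate DATA and PROC steps, though the exact line count depends on how it is written.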

Python's reliance on a dozen different open source packages demonstrates the strength of the Python community in extending the language's capabilities. Although I have not explored this here, Python being a general-purpose tool encourages participation from users outside the data science community, which enhances package availability.

In SAS, if you want to access advanced features, you need to buy advanced add-on tools separately, whereas in Python even advanced functionality is available in libraries. In terms of graphical representation, SAS has better output than Python, so if you work with senior leadership and need to give presentations, SAS is much better.

Personally, over the last two years I have come to prefer Python, as it not only exposes me to advanced features but also, as a data scientist, gives me access to newer algorithms to try out on the data.

Excited to try all the new Python packages and libraries, especially the ones on time series forecasting, CGE modeling, survival modeling and panel modeling.




Kavita, thanks for sharing!

JavaScript is the must-have skill for all data scientists, am I right?

I agree and love #python, but as with investments, in self-#investment too, don’t put all your #Eggs in one basket! Kavita Dwivedi
