Polars or pandas for dataframes?

I recently asked one of the developers, and this is what I found:

🖥️ From a technical perspective, there is little reason to remain with pandas:
👉 Polars is significantly ahead. It has addressed many of the long-standing issues pandas has struggled with, while offering a clearer API and much faster performance.
👉 Pandas is unlikely to change dramatically, while Polars is evolving quickly. That means the performance gap between the two libraries will continue to widen.

In practice:
👉 Few people move from Polars to pandas, while many users are transitioning from pandas to Polars.
👉 Still, pandas is huge compared to Polars. In fact, if you check the summary made by MLcontests about the data science competitions in 2025, you'll notice that pandas is still the go-to library for dataframe manipulation, used in 61 competitions vs 5 using Polars.

💡 Pandas' popularity will not change overnight, which means that pandas will likely remain widely used and, for a long time, more popular overall.

So, which library should you use? In short:
👉 New to Python and dataframes? Then learn Polars.
👉 Working with legacy code? You are not alone, and pandas is here to stay for many years, so what you have learned will not be wasted.

Which library do you use? Let me know in the comments 👇

#machinelearning #ml #dataframes #polars #pandas #mlonline #mlcourse #trainindata #datascience #datascientist #dataengineer #dataengineering #mleducation #mlcareer #ai #python
I've posted this a number of times: I learnt pandas using Felix Zumstein's book on "Python & Excel", which was a perfect grounding, and used pandas for 2 years. I transitioned to Polars a year ago and have never looked back. Polars is a great advance on pandas, albeit with a few fixes and changes still needed. I have bought and read enough pandas and Polars books to know both inside out, but pandas has long been on the shelf gathering dust because of Polars. Even for small datasets I see no need to use pandas. I use Polars heavily and recommend it to everyone. DuckDB and the like are decent, but Polars is the best by far!
I've moved almost all my projects from pandas to polars. I plan to use Claude Code to rewrite the last project from pandas to polars. Moving forward, it's hard to imagine I would ever start a project in pandas again.
I have used pandas for so many years that writing its (heavily criticized and awkward) API feels like second nature to me. At the same time, I recognize that the way Polars is designed just makes more sense, and it's way faster (even if newer pandas releases now support PyArrow as a backend too), so I am slowly incorporating it into my workflow and going 100% with it for new projects. That being said, with the vast amount of pandas code in training data, I am sure the rise of agentic coding favors pandas over Polars. I would love to see a proper dataset on that.
Polars for new projects, pandas for anything touching existing pipelines. The ecosystem is what keeps pandas ahead, not the performance. That argument is settled. The adoption argument is still open. Curious, does Polars offer the same level of ML library integration?
I'll add that if you are coming from R then polars translates better.
It is very clear to me that version 3 of Pandas was strongly influenced by Polars making inroads into Pandas users.
Pandas for standard EDA and Python scripts, Polars for relatively big data
Polars any day. Changed the way I saw python
Before going all the way to Polars, IMHO I would try adding Dask into the mix. I find that tools like feature-engine are the heroes of the day, any given day, and FE works with Dask. Since Dask works in lazy mode, you have to design with that in mind.