Two types of data scientists: analysts and LLM users

5mo

There are (broadly) two types of people called 'data scientist' working today.

1. Those that perform analysis or run models on data, using many languages.
2. Those that try to get LLMs to deliver responses in the way they need them to, mostly using Python.

Number 2 is starting to look a lot like 'Software Engineer'. Meanwhile, a lot of what I hear from Number 1 is that Generative AI has ruined the fun of their work.

#analytics #rstats #python #datascience #peopleanalytics #ai #technology

27 Comments

Marcelo C.

Applied ML Researcher | Time Series & Physics-Informed Modeling | End to End Systems | xAI

5mo

From my experience working on operational atmospheric ML models, I sometimes feel there’s a “third type” that isn’t mentioned here. The people who have to model real-world systems — weather, aviation, energy, biology — where data alone is not enough, and the key challenge is understanding the underlying physics or dynamics of the process. LLMs are incredibly useful as accelerators, but they don’t replace the need to frame the problem properly or to translate a physical system into a mathematical one. In that sense, the fun hasn’t disappeared. If anything, it’s increasing — because tools are faster, but the reasoning still matters.

2 Reactions

John Mataya 5mo

This has been somewhat true for years - the trend of software engineering merging with or absorbing data science was there prior to LLMs - it’s only gained momentum with the proliferation of GenAI. The unicorns are those who can do both well and combine them in creative ways.

1 Reaction

Martin Ingram 5mo

Can’t agree that the fun in 1 is spoilt. I can think more about how to approach a problem and then get help coding up some of the more boring parts of it. It’s a challenge to find the balance between trusting the generated code too much (because it sometimes messes up) and not using it enough (because then you waste time), but overall it’s been a great boost for me.

3 Reactions

Gregory McConney 5mo

It was shocking at first to see how well generative AI can finish and alter SQL queries so easily, but since embracing it my life has become much easier. Frees up a lot of time to actually analyze the data, strategize about new projects, and provide proactive insights.

10 Reactions

Edward Leland 5mo

The power comes from 1 + 2. LLMs are incredibly useful and amazing for prototyping, helping to wire up analysis tools from other languages you might not be an expert in, to say nothing of unstructured data processing. Handmade models and analysis are still incredibly important for bespoke workflows or datasets that are not easily managed by an LLM. A great data scientist is someone who can be a jack of all trades across a data pipeline, from ingestion to deployed product. LLMs are just another tool on the belt.

Joseph Fresco 5mo

I actually think LLMs have opened up a whole new door for prototyping production ML models. For example, if I'm working on a time-to-fill model, I can quickly spin it up as a simple executable and hand it off for others to use like a calculator. It's really expanded how freely I can explore the back end and even opened up new opportunities. I can now 'vibe code' functional programs into production that we might otherwise have had to purchase from a vendor, which also gives us freedom of design control. In my experience, hallucinations are rarely an issue if you know how to read and debug the code.

11 Reactions


More Relevant Posts

Afnan Mehmood Awan
5mo
Augmented Analytics enabling non-experts to glean insights: In 2025, this area continues to evolve. As a data scientist, I'm excited to explore how it shapes our world. Python's ecosystem offers incredible tools to experiment and learn. What are your thoughts on this trend? #DataScience #MachineLearning #Python #AI
Chetan Chavan
6mo Edited
Fake News Detection using Machine Learning
I built a Fake News Detection model that classifies articles as Real or Fake using Python, Scikit-learn and TF-IDF Vectorizer.
– Data preprocessing & feature extraction using TF-IDF
– Logistic Regression for classification
– Achieved ~95% accuracy on test data
– Implemented in Google Colab and uploaded on GitHub
Project Link: https://lnkd.in/gEqUfWfc
#MachineLearning #AI #Python #DataScience #FakeNewsDetection #MLProjects #GitHub
1 Comment
Akshita Singh
6mo
Data analytics lays the foundation — mastering SQL, Python, and visualization teaches us how to interpret information. AI builds on that foundation — using machine learning and automation to make systems smarter and more adaptive. It’s fascinating how the same data that once told a story can now drive decisions on its own. That’s the true evolution — from analyzing patterns to building intelligence. #DataAnalytics #ArtificialIntelligence #MachineLearning #CareerGrowth #Python #DataScience #AI #Analytics #ContinuousLearning #TechTransformation
Shivam Saxena
5mo
Level up your AI stack in 2025: these Python tools cover everything from data pipelines to MLOps, so you can ship reliable models faster and prove impact. Prioritize niche expertise, add original takeaways, and spark discussion—the algorithm now rewards helpful insights, focused topics, and meaningful comments over generic virality. What’s the one tool here that 10x’d your workflow this year—and why? #AI #ArtificialIntelligence #Python #DataScience #MachineLearning #MLOps #GenerativeAI #Analytics #DataEngineering #LLM #dataanalysis #analysis #AI
Abhishek Mungoli
6mo
Want to code Logistic Regression from scratch without relying on libraries? In my latest video, I break down the math, derive the gradient descent update rules, and implement the entire model step by step in Python. Perfect for anyone looking to understand the core logic behind ML algorithms or preparing for interviews. Video Link: youtu.be/cT_U40djaww Channel Link: youtube.com/@datatrek #datatrek #datascience #machinelearning #statistics #deeplearning #ai
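As a rough illustration of what "from scratch" can mean here, the sketch below fits a logistic regression with batch gradient descent using only the Python standard library. It is not the code from the video: the function names, the toy data, the learning rate and the epoch count are all assumptions chosen for the example. The update rule it uses is the standard one for the mean log-loss, w ← w − lr · mean((p − y) · x) and b ← b − lr · mean(p − y), where p = sigmoid(w·x + b).

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss (hypothetical helper)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the logit
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the averaged gradient.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label 1 when the predicted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


# Made-up, linearly separable toy data: one feature, two classes.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
predictions = predict(X, w, b)
print("weights:", w, "bias:", b)
print("predictions:", predictions)
```

Plain lists keep every step of the arithmetic visible, which is the point of a from-scratch exercise; a vectorized version would be shorter but hides the per-sample gradient.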
Paul Hylenski
6mo
Unlock Predictive Modeling with Regression in Python
Did you know that over 70% of data science projects fail due to lack of foundational understanding? That’s right! Without a solid grasp of the basics, predictive modeling can feel like navigating a maze blindfolded.
If you're aspiring to build predictive models, here’s where you should start:
↳ Define your question clearly.
↳ Collect and clean your data using pandas.
↳ Split your data into training and testing sets.
↳ Fit a linear model using scikit-learn's LinearRegression.
↳ Check your metrics (R², MAE) and iterate your approach.
Master the fundamentals, and watch your confidence soar! Pick one dataset today and fit your first linear model—progress beats perfection.
#MachineLearning #DataScience #Python #PredictiveAnalytics #AI
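The split / fit / check-metrics steps above can be sketched end to end in a few lines. This is a dependency-free illustration only: it swaps pandas and scikit-learn's LinearRegression for plain standard-library code (a closed-form least-squares fit for one feature), and the data is synthetic, generated from an assumed relationship y = 3x + 2 plus noise. All helper names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic data (an assumption for the example): y = 3x + 2 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 2 for i in range(40)]
ys = [3.0 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]

# Shuffle the indices, then hold out the last 25% as a test set.
indices = list(range(len(xs)))
rng.shuffle(indices)
train_idx, test_idx = indices[:30], indices[30:]

slope, intercept = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])

# Evaluate only on data the model has not seen.
y_test = [ys[i] for i in test_idx]
y_pred = [slope * xs[i] + intercept for i in test_idx]
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f} MAE={mae:.3f}")
```

With real data, the same shape applies: load and clean first, split before fitting, and compute the metrics only on the held-out portion.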
2 Comments
Atish Basu
6mo
🤖💡 AI is changing data analytics — but does that mean Python is outdated? Not really.
AI can generate your code, build dashboards, and even suggest insights… But without Python, you can’t truly understand, verify, or customize what it creates.
Here’s the reality 👇
🔹 AI gives you speed, Python gives you control
🔹 AI automates tasks, Python helps you validate and scale
🔹 AI suggests, but Python lets you question and explore deeper
In my own workflow, I use AI to speed up repetitive Python tasks — but I rely on my Python knowledge to refine and trust the results. That’s how I keep the best of both worlds: AI for efficiency, Python for accuracy.
AI + Python = A smarter, more independent analyst. 🚀
#DataAnalytics #Python #AI #MachineLearning #DataScience #Upskilling #CareerGrowth #AIForAnalytics
Mohini Ganjare
6mo
🌟 Thrilled to dive into the Decision Tree Algorithm — one of ML’s most interpretable and versatile models! 🧠 In this practical, I explored Python 🐍 (Scikit-learn) implementations, experimenting with Gini vs. Entropy and tree depth 🌳 to see how they impact accuracy and predictions 📊. Hands-on experience like this really highlights how Decision Trees pick the most important features to make smart, data-driven decisions 💡. Huge thanks to Ashish Sawant Sir for the guidance! 🙏 🔗 GitHub: https://lnkd.in/ez_NstrZ 📁 Google Drive: https://lnkd.in/ezXFx_py #MachineLearning #DataScience #DecisionTree #Python #ScikitLearn #AI #DataDriven #MLPracticals #LearningByDoing #TechJourney
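For readers unfamiliar with the two split criteria mentioned above, here is a small standard-library sketch of the impurity measures themselves: Gini impurity is 1 − Σ p², and entropy is −Σ p · log2(p), where p ranges over the class proportions in a node. This is not the code from the linked practical, and the function names and example labels are made up for illustration.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of samples belonging to each class in a node."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: sum of -p * log2(p) over the classes."""
    return sum(-p * math.log2(p) for p in class_proportions(labels))


# Hypothetical nodes: an even split, a 3:1 split, and a pure node.
balanced = ["yes", "yes", "no", "no"]
skewed = ["yes", "yes", "yes", "no"]
pure = ["yes", "yes", "yes", "yes"]

for name, node in [("balanced", balanced), ("skewed", skewed), ("pure", pure)]:
    print(f"{name}: gini={gini(node):.3f} entropy={entropy(node):.3f}")
```

Both measures are zero for a pure node and largest for an even split, which is why trees grown with either criterion often look similar; they differ mainly in how strongly they penalize mixed nodes.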
HyperColab

51 followers
5mo
Understanding Data Science Made Simple! Data Science isn’t just about coding; it’s the perfect blend of Statistics, Math, Python, Machine Learning, and Domain Knowledge. Each step builds on the other, from Data Analytics to Machine Learning, and finally, to full-fledged Data Science. Keep learning, keep exploring, that’s how data turns into insight! #DataScience #MachineLearning #Python #AI #DataAnalytics #LearningJourney #HyperColab
Amarjeet Kumar
6mo
Week 5 of my AI & Data Science journey 🚀
This week, I explored Python Memory Management — a crucial concept for writing efficient and scalable programs.
Key learnings:
- Understanding how Python allocates and manages memory
- Exploring the heap, stack, and reference counting mechanism
- Working with the garbage collector (gc module)
- Analyzing memory leaks and optimization techniques for data-heavy applications
Efficient memory handling is key to ensuring ML models and data pipelines run smoothly — especially when working with large datasets.
📂 Notes & Assignments: https://lnkd.in/gPnQkhGY
#Python #DataScience #AI #MachineLearning #MemoryManagement #LearningJourney #CodeOptimization
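The difference between reference counting and the cyclic garbage collector can be shown in a short standard-library sketch. It assumes CPython, where an object is freed as soon as its reference count reaches zero; other interpreters may behave differently. The Node class and variable names are made up for the example and are not taken from the linked notes.

```python
import gc
import weakref


class Node:
    """Tiny object that can point at another Node (hypothetical example class)."""

    def __init__(self):
        self.other = None


gc.disable()  # make sure the cyclic collector only runs when we ask it to
try:
    # Case 1: no cycle. Reference counting alone frees the object (CPython).
    solo = Node()
    solo_probe = weakref.ref(solo)  # a weak reference does not keep it alive
    del solo
    freed_by_refcount = solo_probe() is None

    # Case 2: a reference cycle. Each node keeps the other's count above zero.
    a, b = Node(), Node()
    a.other, b.other = b, a
    cycle_probe = weakref.ref(a)
    del a, b
    alive_before_collect = cycle_probe() is not None

    # The gc module finds the unreachable cycle and frees it.
    gc.collect()
    alive_after_collect = cycle_probe() is not None
finally:
    gc.enable()

print("freed by refcount alone:", freed_by_refcount)
print("cycle alive before gc.collect():", alive_before_collect)
print("cycle alive after gc.collect():", alive_after_collect)
```

In data-heavy code, cycles like this (for example, parent and child objects that reference each other) are one common reason memory lingers longer than expected until the collector runs.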

76,307 followers


More from this author

A Somewhat Psychopathic Introduction to Bayesian Statistics

Can a Horse Do Math? The Story Of Clever Hans and a Statistics Problem He Inspired

A Fun Introduction to the Concept of Bayesian Statistics
