Here's a small thing that changed how I read code. Python naming convention: `lower_case_with_underscores`. PySpark: `groupBy`, `orderBy`, `printSchema`. For a while I just accepted it as "quirky." Then I learned PySpark is a Python API sitting on top of Apache Spark — a JVM engine built in Scala, where camelCase is standard. That "quirk" wasn't random. It was a signal. Abstractions leak. And when they do, the details they leave behind — naming conventions, error formats, edge case behaviors — are actually clues about the system underneath. Once you start seeing code this way, you stop being confused by inconsistencies. You start getting curious about them. What's leaking through tells you more than the documentation ever will. #Python #PySpark #Engineering #DataEngineering
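A minimal illustration of the leak: the camelCase methods on a PySpark DataFrame mirror the Scala API underneath, while the idiomatic Python around them stays snake_case. (This sketch is not from the post; the toy DataFrame is made up.)

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a", 1), ("b", 2)], ["label", "value"])

# camelCase, straight from the Scala/JVM side:
df.groupBy("label").count().orderBy("label").show()
df.printSchema()

# ...while the surrounding Python stays snake_case:
rows_as_dicts = [row.asDict() for row in df.collect()]
```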
Standardized date columns to yyyy-MM-dd

👉 Convert all date formats to a standard format:

```python
from pyspark.sql.functions import to_date, date_format

# Parse string dates into a DateType column:
df = df.withColumn("loan_date", to_date("loan_date", "yyyy-MM-dd"))

# If the column is already a date but you want formatted strings:
df = df.withColumn("loan_date", date_format("loan_date", "yyyy-MM-dd"))
```
Working on Real-World Data Problems Using Pure Python

Recently worked on a project focused on handling and analyzing structured data using core Python, without relying on libraries like NumPy or Pandas. The goal was to understand the logic from the ground up.

- Cleaned and structured raw JSON data
- Built logic for “People You May Know” (mutual connections; see the sketch below)
- Implemented “Pages You Might Like” recommendations
- Focused on problem-solving using basic data structures

This approach helped me strengthen my core data handling and logical thinking rather than depending on pre-built tools. Late nights after work, but worth it for the growth.

#Python #DataProcessing #DataScience #ProblemSolving #CorePython #Algorithms #NumPy #pandas
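A minimal sketch of the mutual-connections idea in pure Python; the data shape and function name are assumptions for illustration, not the project's actual code.

```python
# Hypothetical friendship map: user -> set of direct connections.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
}

def people_you_may_know(user):
    """Rank non-friends by how many connections they share with `user`."""
    mutual_counts = {}
    for friend in friends[user]:
        for candidate in friends[friend]:
            if candidate != user and candidate not in friends[user]:
                mutual_counts[candidate] = mutual_counts.get(candidate, 0) + 1
    # Candidates with the most mutual connections first.
    return sorted(mutual_counts, key=mutual_counts.get, reverse=True)

print(people_you_may_know("alice"))  # ['dave'] (shares bob and carol)
```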
Python Data Types — One Post Cheat Sheet

Understanding data types is fundamental to writing efficient Python code. Here’s a quick overview:

🔢 Numeric → int (10), float (10.5), complex (2+3j)
🔤 String (str) → ordered & immutable. Example: "Hello Python"
📋 List → ordered, mutable, allows duplicates. Example: [10, 20, 30]
📦 Tuple → ordered, immutable. Example: (10, 20, 30)
🔁 Set → unordered, no duplicates. Example: {10, 20, 30}
📖 Dictionary → key–value pairs, mutable. Example: {"name": "Maha", "age": 25}
🧠 Boolean → True / False, used in conditions
🔍 Check a type with type(variable)

Choosing the right data type improves performance, readability, and data handling.

#Python #DataTypes #PythonBasics #Programming #LearnPython #Coding #DataAnalytics #PythonForBeginners
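A quick sanity check of the cheat sheet, using the same example values; type() reports each built-in type.

```python
print(type(10))                           # <class 'int'>
print(type(10.5))                         # <class 'float'>
print(type(2 + 3j))                       # <class 'complex'>
print(type("Hello Python"))               # <class 'str'>
print(type([10, 20, 30]))                 # <class 'list'>
print(type((10, 20, 30)))                 # <class 'tuple'>
print(type({10, 20, 30}))                 # <class 'set'>
print(type({"name": "Maha", "age": 25}))  # <class 'dict'>
print(type(True))                         # <class 'bool'>
```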
📘 **Day 11 – File Handling in Python**

Today I learned about **File Handling in Python** 📂

👉 File handling allows us to **create, read, write, and update files**. It helps store data permanently instead of keeping it only in memory.

🔹 **Types of Modes:**

* `r` → read file
* `w` → write (overwrites file)
* `a` → append (adds data)
* `x` → create new file

🔹 **Basic Example:**

```python
# Writing to a file
file = open("example.txt", "w")
file.write("Hello, Python!")
file.close()

# Reading from a file
file = open("example.txt", "r")
print(file.read())
file.close()
```

💡 **Best Practice:** use a `with` statement, which closes the file automatically:

```python
with open("example.txt", "r") as file:
    data = file.read()
print(data)
```

✨ **Key Learning:** file handling is important for saving data like logs, user input, and reports.

🚀 Step by step becoming better in Python!

#Day11 #Python #CodingJourney #FileHandling #SkillCourse #DataAnalyst
Mastering Data Ingestion: Why NumPy is the Standard

For anyone working with numerical data in Python, the transition from built-in functions to NumPy is a game-changer. While Python’s open() function handles the basics, NumPy arrays offer a level of efficiency and speed that standard lists simply cannot match.

Why use NumPy for flat files?

- The industry standard: NumPy arrays are the backbone of the Python data ecosystem.
- Essential for ML: if you plan to use libraries like scikit-learn, your data needs to be in NumPy format.
- Built-in efficiency: functions like loadtxt() and genfromtxt() make importing arrays seamless.

Pro tips for np.loadtxt(): the real power lies in the customization arguments (see the sketch below).

- delimiter: the default is whitespace. For CSVs, always specify delimiter=','.
- skiprows: perfect for bypassing headers (e.g., skiprows=1) so string labels don't break your numerical array.
- usecols: optimization starts at ingestion. Grab only what you need by passing a list of indices, like usecols=[0, 2].
- dtype: control your data types from the start (e.g., dtype='str').

The catch: while loadtxt() is excellent for clean, uniform datasets, it hits a wall with mixed data types (like the Titanic dataset). When your columns vary between strings and floats, it’s time to level up to genfromtxt() or move into the world of Pandas.

#DataEngineering #python #Numpy #Learninginpublic
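A minimal sketch putting those arguments together; the file names and column indices are placeholders, not from the post.

```python
import numpy as np

# Clean, uniform numeric data: loadtxt with explicit arguments.
data = np.loadtxt(
    "measurements.csv",   # placeholder file name
    delimiter=",",        # default is whitespace; CSVs need an explicit comma
    skiprows=1,           # skip the header row so labels don't break the float array
    usecols=[0, 2],       # load only the columns you need
)

# Mixed string/float columns: genfromtxt can infer a dtype per column.
mixed = np.genfromtxt(
    "titanic.csv",        # placeholder file name
    delimiter=",",
    names=True,           # take field names from the header row
    dtype=None,           # let NumPy infer each column's type
    encoding="utf-8",
)
```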
New blog post on dynamic vertical scaling in Microsoft Fabric Python notebooks. It’s a nice trick that can be really useful, especially with unpredictable workloads. It isn’t new, but it wasn’t really documented; another reason why Fabric Pipelines are awesome. I tested it with 158 GB of CSV. https://lnkd.in/gpKysemw #onelake #python #Microsoftfabric #pipeline #polars #duckdb #lakesail
Most Python classes I've seen in DS projects do too much! They load data, clean it, transform it, run the model, and log results... all in one place. It feels efficient until you need to change one thing and have to re-test everything else. That's the cost of ignoring the Single Responsibility Principle. 🐍 In my latest article, I break down what SRP actually means for Python data pipelines: https://lnkd.in/esKz_ARk This is post 1 of 5 in a series on SOLID principles applied to Data Science code. What's the messiest class you've inherited on a DS project? 👇 #Python #DataScience #SoftwareEngineering #SOLID #DataEngineering
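A minimal sketch of the principle (class names and methods are illustrative, not from the linked article): each class has exactly one reason to change, so editing the cleaning logic never forces a re-test of loading or logging.

```python
import pandas as pd

class Loader:
    """Single responsibility: getting raw data in."""
    def __init__(self, path: str):
        self.path = path

    def load(self) -> pd.DataFrame:
        return pd.read_csv(self.path)

class Cleaner:
    """Single responsibility: making the data usable."""
    def clean(self, df: pd.DataFrame) -> pd.DataFrame:
        return df.dropna().drop_duplicates()

class ResultLogger:
    """Single responsibility: reporting what happened."""
    def log(self, df: pd.DataFrame) -> None:
        print(f"{len(df)} clean rows ready for modelling")

# Compose small pieces instead of one god-class doing everything.
df = Cleaner().clean(Loader("loans.csv").load())  # "loans.csv" is a placeholder
ResultLogger().log(df)
```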
Most beginners learn SQL and Python separately. Today, I connected both. I extracted real data from MySQL and processed it using Python (pandas) in VS Code. This is where data science actually starts: not just writing queries, but turning raw data into something useful. Still early in my journey, but now I’m focusing on building complete workflows instead of isolated skills. Next step: deeper analysis and real insights. #DataScience #SQL #MySQL #Python #Pandas #DataAnalytics #DataWorkflow #LearningByDoing #TechSkills #FutureDataScientist
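A minimal sketch of that MySQL-to-pandas handoff; the connection string, credentials, and table name are placeholders.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder credentials and database name.
engine = create_engine("mysql+pymysql://user:password@localhost:3306/shop")

# The query result lands directly in a DataFrame, ready for analysis.
df = pd.read_sql("SELECT * FROM orders", engine)
print(df.head())
print(df.describe())
```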
If you want to improve your Data Science and Python skills, this course is for you. You'll use popular Python libraries like Pandas, scikit-learn, and NumPy to extract and clean data, then analyze it. You'll also learn about grouping & aggregation functions, merging datasets, and using regex, plus some Machine Learning techniques, too. https://lnkd.in/gK3gfthg