Day 22/50: The Pagination That Tanked at Scale

The Setup: A report works fine with 10,000 rows. With 1M rows, the query times out.

The Problem: Deep pagination with LIMIT/OFFSET. The query had to scan and discard 999,990 rows just to reach page 100,000.

The Investigation: The query looked innocent:

```sql
-- Page 100,000 with 10 items per page
SELECT * FROM orders ORDER BY id LIMIT 10 OFFSET 999990;
```

The database scanned and discarded 999,990 rows before returning 10.

The Solution: Switched to keyset pagination (cursor-based):

```python
# Instead of an offset, track the last ID seen on the previous page
last_id = int(request.GET.get('last_id', 0))
results = Order.objects.filter(id__gt=last_id).order_by('id')[:10]
```

Time dropped from 12 seconds to 0.2 seconds.

The Lesson: LIMIT/OFFSET doesn't scale. Use cursor-based pagination with indexed lookups for large datasets.

Have you been burned by deep pagination?

#Day22 #50DaysOfDebugging #Database #Pagination #Performance #Scalability #SoftwareEngineering #Django #SQL
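The difference between the two approaches can be sketched end to end with SQLite as a stand-in for the production database (table and column names here are invented for the demo; the real gap only shows up at millions of rows):

```python
import sqlite3

# Small stand-in for the production `orders` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (id, total) VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(1, 101)])

PAGE_SIZE = 10

# Offset pagination: the engine must walk past every skipped row.
page_5_offset = conn.execute(
    "SELECT id FROM orders ORDER BY id LIMIT ? OFFSET ?",
    (PAGE_SIZE, 4 * PAGE_SIZE)).fetchall()

# Keyset pagination: seek directly past the last id from the previous page.
last_id = 40  # last id seen on page 4
page_5_keyset = conn.execute(
    "SELECT id FROM orders WHERE id > ? ORDER BY id LIMIT ?",
    (last_id, PAGE_SIZE)).fetchall()

# Same rows come back; only the amount of work differs at scale.
print(page_5_offset == page_5_keyset)
```

Because `id` is the primary key, the keyset query is a single indexed seek regardless of how deep the page is, while the offset query's cost grows linearly with the page number.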
Saiteja Singirikonda’s Post
More Relevant Posts
Ever wondered what __init__.py actually does?

While setting up my RAG project, my folder looked like this:

```
rag-retriever/
├── src/
│   ├── __init__.py
│   ├── data_loader.py
│   ├── chunk_splitter.py
│   └── semantic_split.py
└── main.py
```

At first, I kept getting import errors — until I understood the real role of __init__.py. It's simple yet powerful:

1) It tells Python that "this folder is a package."
2) It lets you import cleanly from one file to another.

So now I can:
- Inside src/: use `from .semantic_split import function_name`
- Outside (in main.py): use `from src import chunk_splitter`

(Since Python 3.3, plain imports can also work without it via namespace packages, but an explicit __init__.py keeps behavior predictable and gives you a place to define the package's public API.)

That tiny file makes the entire structure modular and production-ready 🚀

In short: __init__.py is your folder's identity card — it transforms random scripts into a real Python package 📦

#Python #LLM #RAG #MLOps #SoftwareEngineering #DataScience #CodingTips #LearningJourney
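The effect is easy to see by building a throwaway package on disk that mirrors the layout above (the `split` function and its behavior are made up for this demo):

```python
import sys
import tempfile
from pathlib import Path

# Create a temporary package: src/ with one module and an __init__.py.
root = Path(tempfile.mkdtemp())
pkg = root / "src"
pkg.mkdir()
(pkg / "chunk_splitter.py").write_text(
    "def split(text, size):\n"
    "    return [text[i:i + size] for i in range(0, len(text), size)]\n"
)
# __init__.py marks the folder as a package and re-exports a clean API.
(pkg / "__init__.py").write_text("from .chunk_splitter import split\n")

sys.path.insert(0, str(root))
from src import split  # the re-export in __init__.py makes this import work

print(split("hello world", 5))  # ['hello', ' worl', 'd']
```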
Speed up your Frappe queries with frappe.get_cached_value()

If you're repeatedly fetching the same field value from the database, don't call frappe.db.get_value() every time — it hits the database on each call. Instead, use frappe.get_cached_value(): it stores the result in memory (cache) and returns it faster on the next request.

When to use it:
- Fetching rarely-changing fields like settings, configs, and defaults
- You want better performance with fewer DB calls
- Accessing single fields from large DocTypes

When NOT to use it:
- When the data changes frequently
- When you need fresh DB values every time
- When fetching multiple rows — use frappe.get_all() or frappe.get_list() instead

Use caching smartly — small optimizations add up in big Frappe apps 💪

#Frappe #ERPNext #Backend #Performance #OpenSource #Python
Rushabh Mehta, Hussain Nagaria, Ejaaz Khan, Aditya Hase, Sherin K R, Ritvik Sardana, Manish Dipankar, Frappe, efeone
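The caching idea generalizes beyond Frappe. A minimal sketch of the same trade-off using the standard library's functools.lru_cache, with a simulated DB lookup standing in for frappe.db.get_value (the function names and values here are invented):

```python
from functools import lru_cache

CALLS = {"count": 0}

def fetch_from_db(doctype, name, field):
    """Simulated expensive database lookup."""
    CALLS["count"] += 1
    return f"{doctype}:{name}:{field}"

@lru_cache(maxsize=None)
def get_cached_value(doctype, name, field):
    # First call hits the "database"; repeated calls are served from memory.
    return fetch_from_db(doctype, name, field)

get_cached_value("System Settings", "System Settings", "time_zone")
get_cached_value("System Settings", "System Settings", "time_zone")
print(CALLS["count"])  # 1 — the simulated DB was hit only once
```

The same caveats apply as in Frappe: this only works for values that don't change while the cache is alive.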
Day 21/50: The Connection Pool That Slowly Died

The Setup: Application running smoothly. After 48 hours: "No more connections available in pool."

The Problem: Leaked connections. Database connections never got closed, accumulating until the pool was exhausted.

The Investigation:
- Enabled connection pool logging.
- Found database queries that never released their cursors or connections:

```python
# The culprit
cursor = db.cursor()
result = cursor.execute("SELECT * FROM users")
# No cleanup!
return result
```

- After 48 hours of traffic, all 20 pool connections were open and abandoned.

The Solution: Use context managers for automatic cleanup:

```python
with db.connection.cursor() as cursor:
    result = cursor.execute("SELECT * FROM users")
    return result  # Cursor auto-closes on exit
```

Or close explicitly in a finally block (create the cursor before the try, so it is always defined when finally runs):

```python
cursor = db.cursor()
try:
    result = cursor.execute("SELECT * FROM users")
finally:
    cursor.close()  # Always runs
```

The Lesson:
- Connection leaks are silent killers. Always use context managers or try/finally blocks.
- Monitor pool metrics regularly — don't wait for crashes.

Ever lost a pool to leaks? Tell your story.

#Day21 #50DaysOfDebugging #Python #Database #ConnectionPooling #SoftwareEngineering #Debugging #BestPractices
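For DB-API drivers whose cursors don't support `with` directly (sqlite3 is one), contextlib.closing gives the same guarantee. A runnable sketch using an in-memory SQLite database as a stand-in:

```python
import sqlite3
from contextlib import closing

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada')")

# closing() guarantees cursor.close() runs even if the query raises —
# the same protection a try/finally gives, with less boilerplate.
with closing(conn.cursor()) as cursor:
    rows = cursor.execute("SELECT * FROM users").fetchall()

print(rows)  # [(1, 'ada')]
```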
Handling Missing Values — Part 2

Deletion Strategies
Sometimes, removing rows or columns with missing values is the best approach.
- Completely empty rows: use `df.dropna(how='all')` to remove rows that are entirely blank.
- Critical columns: remove rows where any important column is missing: `df.dropna(subset=['column1', 'column2', 'column3'])`

Replacing Missing Values
The right replacement strategy depends on the data type.

For categorical/string columns:
- Mode — fill with the most frequent value: `df['gender'] = df['gender'].fillna(df['gender'].mode()[0])`
- Meaningful default — use a descriptive placeholder: `df['comments'] = df['comments'].fillna('No comment provided')`

For numeric columns:
- Median (less affected by outliers) — ideal for fields like age: `df['age'] = df['age'].fillna(df['age'].median())`
- Mean — often used for income: `df['income'] = df['income'].fillna(df['income'].mean())`
- Mode — works for discrete numeric categories like ratings: `df['customer_rating'] = df['customer_rating'].fillna(df['customer_rating'].mode()[0])`

#DataPreprocessing #MissingValues #DataScience #MachineLearning #Python #Pandas #DataCleaning
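The median and mode strategies above can be seen end to end on a tiny frame (column names and values are invented for the demo):

```python
import pandas as pd

# A small frame with gaps in a numeric and a categorical column.
df = pd.DataFrame({
    "age":    [25.0, None, 40.0, None, 35.0],
    "gender": ["F",  None, "M",  "F",  None],
})

# Median for the numeric column: median of 25, 40, 35 is 35.
df["age"] = df["age"].fillna(df["age"].median())

# Mode for the categorical column: "F" appears most often.
df["gender"] = df["gender"].fillna(df["gender"].mode()[0])

print(df["age"].tolist())     # [25.0, 35.0, 40.0, 35.0, 35.0]
print(df["gender"].tolist())  # ['F', 'F', 'M', 'F', 'F']
```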
Solving LeetCode 1045 — Customers Who Bought All Products

In SQL:

```sql
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(DISTINCT product_key) FROM Product);
```

In pandas:

```python
import pandas as pd

def customers_who_bought_all_products(customer: pd.DataFrame, product: pd.DataFrame) -> pd.DataFrame:
    total_products = product['product_key'].nunique()
    customer_counts = customer.groupby('customer_id')['product_key'].nunique().reset_index()
    result = customer_counts[customer_counts['product_key'] == total_products][['customer_id']]
    return result
```

#LeetCode #Python #Pandas #SQL #DataEngineering #CodingPractice #ProblemSolving #DataAnalysis #SQLtoPandas #googleanalytics
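A quick sanity check of the pandas logic on hypothetical sample data (the rows below are invented; only customers 1 and 3 have bought both products):

```python
import pandas as pd

customer = pd.DataFrame({"customer_id": [1, 2, 3, 3, 1],
                         "product_key": [5, 6, 5, 6, 6]})
product = pd.DataFrame({"product_key": [5, 6]})

# Same steps as the solution: count distinct products per customer,
# then keep customers whose count matches the catalog size.
total_products = product["product_key"].nunique()
counts = customer.groupby("customer_id")["product_key"].nunique().reset_index()
result = counts[counts["product_key"] == total_products][["customer_id"]]

print(result["customer_id"].tolist())  # [1, 3]
```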
...some cool NumPy:

```python
# Suppose we have a text file "data.txt" with rows of numbers like:
# 1.0 2.0 3.0
# 4.0 5.0 6.0
# 7.0 8.0 9.0
import numpy as np

def main():
    data = np.loadtxt("data.txt")
    print("data loaded from file:")
    print(data)
    print("shape:", data.shape)
    print("mean of all values:", np.mean(data))
    print("column sums:", np.sum(data, axis=0))
    col2 = data[:, 1]
    print("second column:", col2)

if __name__ == "__main__":
    main()
```

Output:

```
data loaded from file:
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
shape: (3, 3)
mean of all values: 5.0
column sums: [12. 15. 18.]
second column: [2. 5. 8.]
```
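If you want to try this without creating data.txt on disk, np.loadtxt also accepts any file-like object, so an in-memory StringIO works as a stand-in for the file:

```python
import io
import numpy as np

text = "1.0 2.0 3.0\n4.0 5.0 6.0\n7.0 8.0 9.0\n"
data = np.loadtxt(io.StringIO(text))  # same parsing as loading from disk

print(data.shape)            # (3, 3)
print(np.mean(data))         # 5.0
print(np.sum(data, axis=0))  # [12. 15. 18.]
```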
The one SQL feature I couldn't live without this week: window functions. 🤯

When you need to perform calculations across a set of table rows related to the current row (like running totals or ranks), nothing beats a powerful window function such as ROW_NUMBER(), RANK(), or LAG(). It saved me hours on a recent project where I needed to track user activity sequences, and it's a game-changer for time-series analysis.

If you're still doing this in Python/pandas after pulling the data, trust me: learn to do it in SQL first. Your database and your query speed will thank you!

What's your go-to advanced SQL function? Let me know!

#SQL #DataScience #Python #DataEngineering #TechnicalSkills #AnalyticsTools
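A minimal sketch of the activity-sequence use case, runnable with Python's bundled sqlite3 (window functions need SQLite 3.25+; the events table and timestamps are invented for the demo). LAG() pulls the previous row's timestamp within each user's ordered history, giving the gap between consecutive events without a self-join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, ts INTEGER);
    INSERT INTO events VALUES (1, 10), (1, 25), (1, 60), (2, 5), (2, 7);
""")

rows = conn.execute("""
    SELECT user_id, ts,
           ts - LAG(ts) OVER (PARTITION BY user_id ORDER BY ts) AS gap
    FROM events
    ORDER BY user_id, ts
""").fetchall()

for row in rows:
    print(row)  # gap is NULL (None) for each user's first event
```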
Step 15... towards Data Science and ML model creation

==== NumPy ====

1. NumPy is a Python library used for mathematical and logical operations.
2. NumPy operates on multidimensional arrays.

Addition:

```python
import numpy as np

# Sum of a single array
print(np.sum([10, 11, 11, 2, 0]))  # 34

# Adding two scalars
a = 10
b = 20
print(np.add(a, b))  # 30

# Sum of a range
print(np.sum(range(1, 10)))  # 45
```

Subtraction:

```python
import numpy as np

a = 10
b = 20
diff = np.subtract(a, b)
print(diff)  # -10
```

Multiplication:

```python
import numpy as np

a = 100
b = 20
mul = np.multiply(a, b)
print(mul)  # 2000
```

Division:

```python
import numpy as np

a = 100
b = 20
div = np.divide(a, b)
print(div)  # 5.0
```
`df = pd.read_csv('data.csv')`
👉 This reads a CSV file named data.csv and loads it into a DataFrame called df. Think of a DataFrame as a super-powered table — like an Excel sheet, but in Python!

`df[df['column_name'] == 123]`
👉 This filters the data to show only rows where the value in the column 'column_name' is equal to 123. Great for zooming in on specific data.

`df.groupby('column_name').mean()`
👉 This groups the data by values in 'column_name' and calculates the average (mean) for each group. Perfect for summarizing and finding trends.
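All three steps together on an inline CSV (StringIO stands in for data.csv; the column names and values are invented):

```python
import io
import pandas as pd

csv_text = "column_name,value\n123,10\n456,50\n123,30\n"
df = pd.read_csv(io.StringIO(csv_text))

# Filter: rows where column_name == 123
matches = df[df["column_name"] == 123]

# Group and average: mean of `value` per column_name
means = df.groupby("column_name")["value"].mean()

print(len(matches))    # 2
print(means.loc[123])  # 20.0
print(means.loc[456])  # 50.0
```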
Today I revised some of my SQL concepts and practiced a few Python loops to strengthen my logic-building skills. Here's a quick glimpse:

SQL practice:
- Created a view (vw_Category_Profit) using a CTE + subquery to calculate total revenue and total cost per category.
- Built another query using two CTEs to calculate total profit and extract the top 3 categories by profit.
- It's amazing how much clarity comes when you connect concepts like CTEs, views, and joins together!

Python practice:
- Nested for loops to print pattern combinations
- Practiced looping through two lists (colors & sizes)
- Wrote a while loop and a limited-attempt for loop to build simple user-input validation logic

Each day I try to connect what I know with what I learn new. SQL builds structure; Python builds logic — together, they're the backbone of data analytics.

#SQL #Python #DataAnalytics #LearningJourney #ContinuousLearning #CareerGrowth #LinkedInLearning #PracticeMakesPerfect #CTE #View #Loops
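The two Python exercises mentioned above might look something like this (the colors, sizes, and validation rules are invented; real input validation would read from input() instead of a list):

```python
colors = ["Red", "Blue"]
sizes = ["S", "M"]

# Nested loops: every color paired with every size.
combos = []
for color in colors:
    for size in sizes:
        combos.append(f"{color}-{size}")
print(combos)  # ['Red-S', 'Red-M', 'Blue-S', 'Blue-M']

# Limited-attempt validation: accept only values from a known list,
# giving up after max_attempts tries.
def validate(answers, valid, max_attempts=3):
    """Return the first valid answer within max_attempts, else None."""
    for answer in answers[:max_attempts]:
        if answer in valid:
            return answer
    return None

print(validate(["XL", "M"], sizes))         # 'M'  (second attempt succeeds)
print(validate(["XL", "XXL", "L"], sizes))  # None (all attempts invalid)
```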