1.2s → 85ms. The LATERAL join 90% of devs have never written.

Four nested Python loops. One slow endpoint. I replaced all of it with one Postgres LATERAL join. Most backend devs don't know it exists.

Full breakdown (7 min read) → https://lnkd.in/gyPiyjsQ

#PostgreSQL #Database #SQL #BackendEngineering #DataEngineering
Optimize Backend Performance with Postgres LATERAL Join
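A minimal sketch of the pattern, assuming a placeholder users/orders schema and local DSN (the article's actual tables are not shown here): one LATERAL join fetches each user's three most recent orders in a single round trip, instead of one query per user driven by nested Python loops.

import asyncio
import asyncpg

# Placeholder schema: LEFT JOIN LATERAL lets the subquery reference u.id,
# so "top 3 recent orders per user" happens in one statement.
QUERY = """
SELECT u.id, u.name, o.id AS order_id, o.total, o.created_at
FROM users AS u
LEFT JOIN LATERAL (
    SELECT id, total, created_at
    FROM orders
    WHERE orders.user_id = u.id
    ORDER BY created_at DESC
    LIMIT 3
) AS o ON TRUE;
"""

async def main() -> None:
    conn = await asyncpg.connect("postgresql://localhost/appdb")  # placeholder DSN
    try:
        for row in await conn.fetch(QUERY):
            print(dict(row))
    finally:
        await conn.close()

asyncio.run(main())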
One Python expression, 22+ SQL dialects, zero rewrites 🐍

Running queries across multiple databases often means rewriting the same logic for each backend's SQL dialect. A query that works in DuckDB may require syntax changes for PostgreSQL, and another rewrite for BigQuery. Ibis removes that friction by compiling Python expressions into each backend's native SQL. Swap the connection, and the same code runs across 22+ databases.

Key features:
• Write once, run on DuckDB, PostgreSQL, BigQuery, Snowflake, and 18+ more
• Lazy execution that builds and optimizes the query plan before sending it to the database
• Intuitive chaining syntax similar to Polars

🚀 Article comparing Ibis with other libraries: https://bit.ly/3MnsHs7

#Python #DataScience #SQL
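A quick sketch of the write-once idea, using a tiny in-memory table with made-up columns; swapping the DuckDB connection for, say, ibis.postgres.connect(...) would leave the expression itself untouched.

import ibis

con = ibis.duckdb.connect()  # swap the backend; the expression below stays the same

# Made-up demo data; in practice this would be a table on the backend.
orders = ibis.memtable(
    {"region": ["eu", "eu", "us"], "amount": [10.0, 20.0, 5.0]},
    name="orders",
)

# Lazy expression: nothing runs until execution, when it is compiled
# into the connected backend's native SQL dialect.
expr = (
    orders.filter(orders.amount > 1)
    .group_by("region")
    .aggregate(total=orders.amount.sum())
    .order_by("region")
)

print(con.execute(expr))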
myflames v1.2.0 — now with MariaDB support!

myflames is an open-source tool that visualizes MySQL EXPLAIN ANALYZE output as interactive flame graphs, bar charts, treemaps, diagrams, and execution trees. No dependencies, pure Python.

What's new in v1.2.0:
→ Full MariaDB 10.11 and 11.4 support
→ Auto-detects MySQL vs MariaDB JSON format — zero configuration
→ ANALYZE FORMAT=JSON, SHOW ANALYZE FOR, and SHOW EXPLAIN FOR
→ All 5 visualization types work with both databases
→ Works with any mariadb CLI flag combination (-e, -N, -r, -s)

https://lnkd.in/d9yEY34y

Enjoy! #mysql #mariadb #readyset #acedirector
I got tired of constantly copying and pasting similar code snippets, so I asked Gemini and Claude to help me organize everything—and ended up publishing it as a crate. For integration tests, you often just need a simple “connection” to Redis or PostgreSQL. But in practice, it gets tedious: isolating databases, setting up schemas, and repeating the same boilerplate over and over. It’s something I’ve implemented many times across Rust, Python, Go, and Java, using different frameworks and databases. This time, I decided to wrap it into a convenient helper for Rust. I hope it can be useful for the community! #rust #diesel #sqlx #redis #valkey #testcontainers #docker #tests Crate: https://lnkd.in/dX4Nn76a Repo: https://lnkd.in/d3cG2ZRY
I just published a piece on something I keep seeing in Python APIs: using SQLAlchemy by default — even when it's not needed.

After working more directly with PostgreSQL, I started questioning this habit. Because the database is not just storage — it's a core part of performance and system behavior.

In many APIs, especially simple or performance-critical ones, I've found that:
- ORM adds unnecessary abstraction
- raw SQL gives better control over query shape
- PostgreSQL features are easier to leverage directly
and in some cases, it actually improves performance due to lower overhead.

So I wrote about:
->> when ORM makes sense
->> when it becomes overengineering
->> and why I prefer asyncpg + raw SQL in many cases

Do you stick with ORM everywhere, or go raw SQL when performance matters?

https://lnkd.in/dzZ7xvCS

#python #postgresql #fastapi #backend #softwareengineering
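A small sketch of the asyncpg + raw SQL style argued for here; the table, columns, and DSN are placeholders, not the article's code.

import asyncio
import asyncpg

async def get_user(conn: asyncpg.Connection, user_id: int):
    # Parameterized raw SQL: no session or model layer, and $1 parameters are
    # sent to Postgres separately from the query text.
    return await conn.fetchrow(
        "SELECT id, email, created_at FROM users WHERE id = $1", user_id
    )

async def main() -> None:
    conn = await asyncpg.connect("postgresql://localhost/appdb")  # placeholder DSN
    try:
        row = await get_user(conn, 42)
        print(dict(row) if row else None)
    finally:
        await conn.close()

asyncio.run(main())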
Finally back after a long break.

I built an async worker job queue where heavy HTTP requests are processed in the background, so users don't have to wait for one request to finish and can work on multiple requests at once. This works because async workers run in the background: when a request comes in, a worker picks it up from the store (PostgreSQL, Redis, etc.) and processes it while the user is shown a pending status.

I also added retry logic with exponential back-off: a failing request is retried by the workers after exponentially increasing delays, up to 3 times; if it still hasn't completed, it is sent to a dead-letter queue, where the error message can be inspected manually in the database.

Full code: https://lnkd.in/gCEV3C7j

#Python #FastAPI #AsyncIO #BackendDevelopment #WebDevelopment
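Not the linked repository, just a minimal in-memory asyncio sketch of the described flow: a background worker picks up jobs, retries with exponential back-off up to 3 attempts, and dead-letters whatever still fails.

import asyncio
import random

MAX_RETRIES = 3

async def handle(job: dict) -> None:
    # Stand-in for the heavy HTTP call; fails randomly to exercise the retries.
    if random.random() < 0.5:
        raise RuntimeError("upstream error")
    print(f"done: {job['id']}")

async def worker(queue: asyncio.Queue, dead_letter: list) -> None:
    while True:
        job = await queue.get()
        try:
            await handle(job)
        except Exception as exc:
            job["attempts"] += 1
            if job["attempts"] >= MAX_RETRIES:
                dead_letter.append({**job, "error": str(exc)})  # dead-letter queue
            else:
                await asyncio.sleep(2 ** job["attempts"])  # exponential back-off
                await queue.put(job)
        finally:
            queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    dead_letter: list = []
    asyncio.create_task(worker(queue, dead_letter))
    for i in range(3):
        await queue.put({"id": i, "attempts": 0})  # caller sees "pending" here
    await queue.join()
    print("dead-lettered:", dead_letter)

asyncio.run(main())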
Spark Connect — I kept seeing this term but never really understood what problem it was solving. So I dug deeper.

Before Spark Connect, the client and the Spark driver were tightly coupled. Your PySpark script ran directly inside the driver process. This meant:
→ Heavy dependency overhead (matching Java, Scala, and Python versions)
→ Client crashes could take down the driver
→ Building non-JVM clients was difficult
→ PySpark relied on Py4J to bridge into the driver's JVM

Spark Connect changes all of this by clearly separating the client and the server. Here's the simplified flow:
1. The client converts your DataFrame or SQL query into an unresolved logical plan
2. That plan is serialized using Protocol Buffers
3. It is sent to the Spark server via gRPC
4. The server deserializes, optimizes, and executes it
5. Results come back as Apache Arrow record batches — streamed, not dumped all at once

The result? The client no longer needs a full Spark installation. The server can be updated independently. And since the entire communication stack (gRPC + Protobuf + Arrow) is language-agnostic, building Spark clients in Python, Go, or Rust becomes much simpler.

Check out my detailed write-up on Spark Connect on Medium 👇 https://lnkd.in/gmZegTXn

#ApacheSpark #PySpark #SparkConnect #DataEngineering
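A tiny client-side sketch (PySpark 3.4+); the sc:// endpoint is a placeholder. The point is that the client builds the plan locally and only ships it to the server when an action runs.

from pyspark.sql import SparkSession

# Placeholder endpoint; the client process needs no local JVM or full Spark
# installation, only the PySpark client package.
spark = (
    SparkSession.builder
    .remote("sc://spark-server.example.com:15002")
    .getOrCreate()
)

df = spark.range(1_000).selectExpr("id", "id % 10 AS bucket")
counts = df.groupBy("bucket").count()  # builds an unresolved logical plan locally
counts.show()  # plan is serialized (Protobuf), sent via gRPC, results stream back as Arrow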
UDFs vs. Native Functions: Don't Reinvent the Wheel (It Just Gets Slower)!

Are you a Python master who loves creating custom functions (UDFs) to apply to your PySpark DataFrames? 🛑 Stop right now!

Although UDFs may seem like the perfect solution, they are Spark performance's "kryptonite" because they force data serialization between the JVM (Java) and Python, killing speed.

The evolution of PySpark has brought hundreds of native functions in the pyspark.sql.functions module. Using native functions instead of Python UDFs can make your code 10 to 100 times faster, since they run directly on the JVM with the Catalyst Optimizer (Source: Spark Performance Benchmark, 2022).

Want a practical example? Instead of creating a UDF to convert strings to uppercase and add a value, look for native functions such as upper(), concat(), when(), col(). Almost everything you need already exists in an optimized form! Leave UDFs only for extremely complex business logic that cannot be solved with built-in functions. It's the difference between riding a bicycle and taking a rocket!

What was the most complex UDF you managed to replace with native functions? Tell us in the comments!

#DataEngineering #PySpark #Productivity
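A toy before/after with made-up column names, showing the same logic first as a Python UDF and then with native functions only.

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("alice", 10), ("bob", None)], ["name", "score"])

# Before: a UDF forcing JVM <-> Python serialization for every row.
shout_udf = F.udf(lambda s: s.upper() + "!" if s else "N/A", StringType())
slow = df.withColumn("label", shout_udf(F.col("name")))

# After: the same logic with native functions, which stay on the JVM under Catalyst.
fast = df.withColumn(
    "label",
    F.when(F.col("name").isNull(), F.lit("N/A"))
     .otherwise(F.concat(F.upper(F.col("name")), F.lit("!"))),
)
fast.show()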
I had 18,115 AWS API operation names in PascalCase that needed to become kebab-case. DescribeInstances to describe-instances. PutBucketAcl to put-bucket-acl. AWS's acronym casing is inconsistent across services, and I was not writing a custom Python converter for 18,000 edge cases.

DuckDB has a community extension for this:

INSTALL inflector FROM community;
LOAD inflector;
SELECT inflector_to_kebab_case('DescribeInstances');
-- describe-instances

All 18,115 operations in one SQL pass. It also does snake_case, camelCase, train-case, pluralization, and bulk column renaming on structs. I used it to keep the raw PascalCase botocore contract in parquet and transform at query time — no slow Python string manipulation.

https://lnkd.in/e8a_Aitd

#duckdb #dataengineering #platformengineering #aws
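The same one-pass idea driven from Python, for anyone scripting it; the parquet path and column name are placeholders, and the extension calls go through plain SQL strings.

import duckdb

con = duckdb.connect()
con.sql("INSTALL inflector FROM community")
con.sql("LOAD inflector")

# Placeholder file/column: transform at query time, keeping the raw
# PascalCase contract untouched on disk.
kebab = con.sql(
    """
    SELECT operation_name,
           inflector_to_kebab_case(operation_name) AS cli_name
    FROM read_parquet('botocore_operations.parquet')
    """
)
print(kebab.limit(5))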
PydanTable 1.17.0 has been released, and MongoDB is now officially part of the story.

This release introduces an optional MongoDB execution engine, allowing work to remain on the MongoDB database side when supported. This means you can materialize data only when it is actually needed in the application, rather than pulling full result sets into Python first.

Additionally, this version adds integration with Beanie, a popular Python ODM (object-document mapper) for MongoDB built on Pydantic. If your application already models MongoDB documents with Beanie, PydanTable can seamlessly integrate with that layer, ensuring your document models and typed, table-shaped workflow remain aligned without the need for a parallel schema.

For more details, check out the documentation and release notes:
- PyPI: https://lnkd.in/ez4NZMjT
- Documentation: https://lnkd.in/eV4RTqZQ
- Repository: https://lnkd.in/eVpjrcRX

#Python #Pydantic #MongoDB #DataEngineering #OpenSource
🚨 Faced an interesting SQL issue recently while working with AWS Batch and Python. Queries started taking longer, and parallel jobs were getting blocked — turned out to be due to a small setting: autocommit = false in pymssql.

Wrote a quick blog on how this caused table locking and how we fixed it 👇 https://lnkd.in/d2YTE7aP

Would love to hear if anyone faced something similar!

#SQLServer #Python #LearningInPublic
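A sketch of one common fix, assuming a polling-style read workload; server, credentials, and table below are placeholders, not the blog's actual setup. With pymssql's default autocommit=False, the first statement opens an implicit transaction that stays open until an explicit commit, which is how long-lived connections end up blocking parallel jobs.

import pymssql

conn = pymssql.connect(
    server="sql.example.internal",   # placeholder
    user="batch_user",
    password="***",
    database="jobs",
    autocommit=True,  # no lingering implicit transaction between statements
)

cur = conn.cursor(as_dict=True)
cur.execute(
    "SELECT TOP 10 id, status FROM job_status WHERE status = %s", ("PENDING",)
)
for row in cur:
    print(row)

conn.close()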