Chris Perry’s Post

Scalable queries aren’t just about working today. They’re about performing when your data grows 10x, 100x, or more.

Overall goal: minimize the ratio of rows processed to rows returned. Do that by:
- Selecting only the columns required
- Making use of partitions and indexes
- Including a temporal filter

Ask yourself, “Will the query that reads 100 rows today still work six months from now, when it is reading 100 million?” Write solutions that will scale into the future.

#sql #queryoptimization #learnsql #dataanalytics
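The points above can be sketched with Python’s built-in sqlite3 module. This is a minimal illustration, not from the post itself: the table, column, and index names (`events`, `event_date`, `idx_events_date`) are assumptions. `EXPLAIN QUERY PLAN` shows how a temporal filter on an indexed column bounds the rows processed, where an unbounded `SELECT *` scans everything.

```python
import sqlite3

# Hypothetical events table with an index on its temporal column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, event_date TEXT, payload TEXT)")
conn.execute("CREATE INDEX idx_events_date ON events (event_date)")

# Unbounded query: every row is processed, regardless of how few are needed.
plan_scan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events"
).fetchone()[3]

# Explicit columns plus a temporal filter: the index bounds the work.
plan_search = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id, payload FROM events "
    "WHERE event_date >= '2024-01-01'"
).fetchone()[3]

print(plan_scan)    # a full SCAN of the table
print(plan_search)  # a SEARCH using idx_events_date
```

The plan text itself is engine-specific, but the scan-versus-search distinction is the ratio the post describes: rows processed versus rows returned.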


To me, unless the query references a table tailored to a specific reporting or analytics purpose, the absence of a WHERE clause is a red flag. What sticks out to you?


I would also point out that the more complex a query needs to be, the more likely it is that we are storing the data differently from the way we use it, and we should review that design. I tend to look for things that are causing, or could cause, table scans - for example, calculations on columns in joins and WHERE clauses. Another thing to look for is the need for keywords like UNIQUE when a join is used - mainly to ask why the result would have unwanted duplicates in the first place; that is possibly a badly thought-out join.
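The “calculations on columns” point can be shown the same way with sqlite3. This is a hypothetical sketch - the `orders` table and `idx_orders_amount` index are assumptions - but it demonstrates the general rule: wrapping an indexed column in an expression makes the predicate non-sargable and forces a scan, while moving the arithmetic to the other side of the comparison lets the index be searched.

```python
import sqlite3

# Hypothetical orders table with an index on amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("CREATE INDEX idx_orders_amount ON orders (amount)")

# Calculation on the indexed column: the index cannot be used.
plan_calc = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders WHERE amount * 1.1 > 100"
).fetchone()[3]

# Same condition rewritten so the bare column stands alone.
plan_bare = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders WHERE amount > 100 / 1.1"
).fetchone()[3]

print(plan_calc)  # a full SCAN of orders
print(plan_bare)  # a SEARCH using idx_orders_amount
```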

SELECT * is an easy way out of doing the hard work you recommend. I take advantage of the “preview the data” feature (right-click on a table) in an IDE, which will generally build the query out with full field names. From there, delete the unnecessary ones.


