THIS QUERY LOOKS CORRECT. IT IS NOT. Most people think this query is correct. It runs. It returns results. But the logic is completely broken. Business problem: Find the latest product review for each customer based on their most recent completed order. Tables involved: orders - order_id - customer_id - order_date payments - payment_id - order_id - status - payment_date reviews - review_id - order_id - review_text - review_date At first glance, the logic seems simple: Join orders → payments → reviews and pick the latest order per customer. But real data doesn’t behave like that. - One order can have multiple payments - One order can have multiple reviews (edits / updates) - Joins create duplicate rows - “Latest” becomes ambiguous if not handled carefully So even if your query runs, you might be picking the wrong review. Think before answering: Are you selecting the latest order? Or the latest review? Or a random row created by joins? Fix the logic, not just the syntax. Comment your answer. Repost if this made you think. Follow Harish Chatla more real-world data problems. Subscribe to practice on our platform. #DataRejected #SQL #DataEngineering #DataAnalytics #DataScience #LearnByDoing #TechCareers #Analytics #CodingPractice
The query is pulling the last order from each customer, not the last review from each customer order.
The query is ranking orders instead of reviews, the join is causing duplication, valid orders aren't filtered out and the latest order isn't selected. SELECT o.customer_id, r.review_text FROM orders o JOIN payments p ON o.order_id = p.order_id AND p.status = 'completed' JOIN reviews r ON o.order_id = r.order_id QUALIFY ROW_NUMBER() OVER ( PARTITION BY o.customer_id ORDER BY o.order_date DESC ) = 1 AND ROW_NUMBER() OVER ( PARTITION BY o.order_id ORDER BY r.review_date DESC ) = 1;
So quite honestly the problem with this question seems to the design of this system itself. - There is no concept of customer on a review level - not great design. - There is no concept of product. bad design. How can one have a product review on an order level? - The fact that orders and reviews are both at a daily level - not great design. Timestamp level of granularity to allows for better time based analysis - and reduces duplication. - The fact that the payment status is in a seperate table is not great design, unless there are statuses that can be partial payments - in non completed/declined statuses Forcing users to create complex joins for a literal order level piece of information is added complexity for no reason. - The joins being done are simply at order level - order id can be duplicated - even across the same date - let alone the fact that this join is simply on order id. this is very problematic at both review and payment level. - Review are left when orders aren't completed? This seems like a huge issue for site credibility. Good analysts code around bad design. Great analysts help guide change that makes everyone's lives easier
I'd want to look at the data resulting from this query, but I would consider ordering by the r.order_id desc if the data reveals the order ids are sequential. But having said that, I would probably pop the query in copilot to see what that brings back, but I didn't do that. Fun
The query is not the latest review on that order as it is clearly mentioned that one order can have multiple revies. That is the piece missing from business requirement.
Why the order doesn’t have status attribute. Instead it’s only in payment? Dies it mean if the payment status is “complete” then the order is “complete”? Very strange!
In my view, the posted query will contain duplicates in the result because latest is not cleanly coded here as the task requires to return the latest review and order. One way to do is to first find the latest completed order in CTE. Then latest review and return the result.