Have you ever been stuck trying to write a SQL query that needs to perform a sub-calculation for each row of your main table? Maybe you needed to find the last 3 blog posts for every user, or the top 5 products for each category.
These “for-each” style problems can lead to complicated, slow, and hard-to-read queries. This is the exact problem the LATERAL
JOIN was designed to solve.
First, let’s clarify a critical point. While it’s common to hear “lateral join”, the true power comes from the LATERAL keyword itself. It isn’t a type of join like INNER or LEFT; it’s a modifier that fundamentally changes how a subquery in the FROM clause behaves.
Think of “lateral” as meaning “sideways.” A standard subquery in a FROM clause is executed in isolation and cannot reference other tables in the same clause. The LATERAL keyword breaks this rule. It gives a subquery permission to look sideways at columns from tables that appear before it in the FROM list.
This enables a powerful for-each loop pattern directly in your SQL.
Let’s create an e-commerce scenario with customers, products, and orders to demonstrate LATERAL joins in action using PostgreSQL.
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
amount INT,
items JSONB, -- To store a list of products in the order
tags VARCHAR[] -- To store tags like 'gift', 'rush_delivery'
);
-- Seed the tables with versatile data
INSERT INTO customers (customer_id, customer_name) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie');
INSERT INTO orders (order_id, customer_id, order_date, amount, items, tags) VALUES
(101, 1, '2025-01-15', 25, '[{"product": "Mouse", "price": 25.00}]', '{"sale"}'),
(102, 1, '2025-06-20', 120,'[{"product": "Webcam", "price": 45.00}, {"product": "Keyboard", "price": 75.00}]', '{"work", "rush_delivery"}'),
(103, 1, '2025-07-01', 15,'[{"product": "Mousepad", "price": 15.00}]', '{}'),
(104, 2, '2025-02-25', 250,'[{"product": "Monitor", "price": 250.00}]', '{}'),
(105, 2, '2025-07-10', 80,'[{"product": "Keyboard", "price": 80.00}]', '{"urgent"}'),
(106, 3, '2025-04-05', 120,'[{"product": "Headphones", "price": 120.00}]', '{"gift", "rush_delivery"}');
So, let’s solve some challenges:
1- Selecting Top N Rows per Group
Goal: Find the 2 most recent orders for each customer.
When you encounter this problem, you might think of some available options. Let’s examine each option, and finally, we will use the LATERAL option.
Option 1: The Window Function Approach
This is a common way to solve top-N problems, but it requires multiple logical steps.
WITH RankedOrders AS (
-- Step 1: Scan all orders and assign a rank to each one within its customer group
SELECT
c.customer_name,
o.order_date,
o.items,
ROW_NUMBER() OVER(PARTITION BY c.customer_id ORDER BY o.order_date DESC) as rn
FROM
customers c
JOIN
orders o ON c.customer_id = o.customer_id
)
-- Step 2: Select from the ranked results and filter for the ones you want
SELECT
customer_name,
order_date,
items
FROM
RankedOrders
WHERE
rn <= 2;
Why it’s less ideal: While it works perfectly, the logic is indirect. You have to create a temporary ranked dataset of all orders and then query it again.
Option 2: The Correlated Subquery Approach
A traditional correlated subquery is often used to fetch a single piece of related data.
-- This only works for a single value, not a full row.
SELECT
c.customer_name,
(SELECT o.order_date FROM orders o WHERE o.customer_id = c.customer_id ORDER BY o.order_date DESC LIMIT 1) AS latest_order_date
FROM
customers c;
Why it fails for this problem: The fundamental limitation of a scalar subquery (one used in a SELECT
list) is that it must return exactly one column and one row. It’s physically impossible to use this technique to fetch the two full order records (with multiple columns like order_date
and items
) that our problem requires. It’s the wrong tool for the job.
Option 3: The GROUP BY
Approach
GROUP BY
is built for aggregation, not for retrieving individual rows from a group.
-- This finds the latest date, but loses all other details of that order.
SELECT
c.customer_name,
MAX(o.order_date) AS latest_order_date
FROM
customers c
JOIN
orders o ON c.customer_id = o.customer_id
GROUP BY
c.customer_name;
Why it fails for this problem: The entire purpose of GROUP BY
is to collapse many rows into a single summary row. It can give you the latest date, the total count, or the average amount, but it discards the individual rows in the process. You cannot use it to select the “top 2” original rows from the group.
Option 4: The LATERAL
Solution
SELECT
c.customer_name,
recent_orders.order_date,
recent_orders.items
FROM
customers c
LEFT JOIN LATERAL (
SELECT
o.order_date,
o.items
FROM
orders o
WHERE
o.customer_id = c.customer_id
ORDER BY
o.order_date DESC
LIMIT 2
) AS recent_orders ON true;
As you see, it reads like a direct translation of our request: “For each customer, get their two most recent orders.”
Tip: The ON TRUE
clause is used since we’re not joining on specific columns but including the subquery’s results.
But why is it better than the Window Function approach? The LATERAL
approach feels more natural because it integrates the filtering (LIMIT 2
) directly into the “for-each-customer” step, avoiding the separate ranking-then-filtering process.
Result:
customer_name | order_date | items |
Alice | 2025-07-01 | [{"product": "Mousepad", "price": 15.00}] |
Alice | 2025-06-20 | [{"product": "Webcam", "price": 45.00}, {"product": "Keyboard", "price": 75.00}] |
Bob | 2025-07-10 | [{"product": "Keyboard", "price": 80.00}] |
Bob | 2025-02-25 | [{"product": "Monitor", "price": 250.00}] |
Charlie | 2025-04-05 | [{"product": "Headphones", "price": 120.00}] |
2- Unpacking JSONB Order Details
Goal: Find all customers who have ever ordered a ‘Keyboard’.
SELECT DISTINCT -- Use DISTINCT to only list each customer once
c.customer_name
FROM
orders o
JOIN
customers c ON o.customer_id = c.customer_id
-- For each order 'o', expand its items into a table called 'item_details'
JOIN LATERAL jsonb_to_recordset(o.items) AS item_details(product TEXT, price DECIMAL) ON true
WHERE
item_details.product = 'Keyboard';
The function jsonb_to_recordset
turns a JSON array into a virtual table. LATERAL
lets us run this function for every single order row.
Result:
customer_name |
Alice |
Bob |
💌 Enjoying this post? Subscribe to get more practical dev tips right in your inbox.
3- Searching Array Data With UNNEST
Goal: Find the names and order dates for all orders tagged with ‘rush_delivery’.
SELECT
c.customer_name,
o.order_date,
tag_value
FROM
orders o
JOIN
customers c ON o.customer_id = c.customer_id
-- For each order 'o', expand its tags array into a single-column table
, LATERAL unnest(o.tags) AS tag_value
WHERE
tag_value = 'rush_delivery';
The unnest
function expands an array into a set of rows. LATERAL
lets us do this for each order’s tags
array.
However, PostgreSQL allows you to omit LATERAL
in some cases implicitly. When you use a set-returning function like unnest()
in the FROM
clause with a table column as argument (e.g., o.tags
), PostgreSQL treats it as a LATERAL join implicitly. So, in this case you can remove LATERAL
and the query will work; but it is best practice to keep
Tip: Keep in mind that the ,
is a shorthand for CORSS JOIN
; which means it should be CROSS JOIN LATERAL
.
Result:
customer_name | order_date | tag_value |
Alice | 2025-06-20 | rush_delivery |
Charlie | 2025-04-05 | rush_delivery |
4- Using Functions and Calculations per Row
Goal: Apply a 10% discount if the order amount exceeds 100.
SELECT
c.customer_name,
o.amount,
d.discount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
CROSS JOIN LATERAL (
SELECT CASE
WHEN o.amount > 100 THEN o.amount * 0.1
ELSE 0
END AS discount
) d;
Result:
customer_name | amount | discount |
---|---|---|
Alice | 25 | 0 |
Alice | 120 | 12 |
Alice | 15 | 0 |
Bob | 250 | 25 |
Bob | 80 | 0 |
Charlie | 120 | 12 |
Correlated subqueries and LATERAL JOIN
s in SQL both allow a subquery to reference columns from a preceding table; however, they differ in their capabilities.
Correlated Subqueries:
- A traditional correlated subquery must return a single scalar value (one column, one row) for each row processed by the outer query.
- They are typically found in the
SELECT
list,WHERE
clause, orHAVING
clause. - It runs repeatedly for each row, which might lead to scanning the target table multiple times.
LATERAL JOIN:
- A
LATERAL JOIN
can return multiple columns and multiple rows from the subquery for each row of the outer query. - It is placed in the
FROM
clause, acting as a join between the outer table and the result of theLATERAL
subquery. - Allows PostgreSQL to plan the subquery execution more efficiently, potentially reducing table scans.
An expert knows when not to use a tool. LATERAL
is a specialized solution, and using it for simpler problems is inefficient and overly complex. Let’s introduce some use cases.
Case 1: For the “Top-1” Problem, Use DISTINCT ON
If you only need the single most recent record, DISTINCT ON
is purpose-built, more declarative, and often faster.
-- Better way to get ONLY the single most recent order
SELECT DISTINCT ON (c.customer_id)
c.customer_name,
o.order_date
FROM
customers c
JOIN
orders o ON c.customer_id = o.customer_id
ORDER BY
c.customer_id, o.order_date DESC;
Why it’s better: DISTINCT ON
is the idiomatic PostgreSQL solution for the “top-1” problem. Its intent is clearer, and the query planner can often optimize it more effectively.
Case 2: For Simple Aggregates, Use GROUP BY
If you only need a summary calculation (COUNT
, SUM
, AVG
) and not the details of the individual rows, GROUP BY
is the correct, standard tool.
-- Get the total number of orders for each customer
SELECT
c.customer_name,
COUNT(o.order_id) AS total_orders
FROM
customers c
JOIN
orders o ON c.customer_id = o.customer_id
GROUP BY
c.customer_name;
Why it’s better: This is the most fundamental and efficient way to perform aggregations. Using LATERAL
to count rows would be a convoluted anti-pattern.
LATERAL can improve performance over correlated subqueries because PostgreSQL’s query planner can optimize them better. However:
- Index Usage: Ensure indexes exist on join conditions (e.g., customer_id in the orders table).
- Data Volume: For large datasets, test performance to avoid excessive subquery executions.
- Query Complexity: Complex LATERAL subqueries may still be resource-intensive if not optimized.
Tip: Use EXPLAIN to analyze the query plan and ensure indexes are used effectively.
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
LATERAL joins in PostgreSQL are a game-changer for handling complex queries with clarity and efficiency. From extracting JSONB data to selecting top N rows per group, they offer flexibility that traditional joins and correlated subqueries often lack.
By understanding when to use LATERAL joins, optimizing performance with indexes, and avoiding overuse in simple scenarios, you can unlock their full potential.
Try experimenting with LATERAL in your next PostgreSQL project, and see how it simplifies your data challenges!
If you’re interested in exploring more, check out these articles.
- Postgres SQL FILTER: A Smarter CASE and CTE Alternative
- SQL COALESCE in Postgres: A Simple Guide
- How to Omit Multiple Keys in TypeScript
- Overloading vs. Overriding in TypeScript
Want more dev insights like this? Subscribe to get practical tips, tutorials, and tech deep dives delivered to your inbox. No spam, unsubscribe anytime.