Postgres SQL FILTER: A Smarter CASE and CTE Alternative

Unlock cleaner, faster, and more readable queries. Learn how the FILTER clause revolutionizes conditional aggregation in PostgreSQL and why it outperforms traditional methods like CASE and CTEs.

Introduction

When you build reports in SQL, you often need to count or average values based on a condition, like counting only completed orders or averaging amounts for canceled ones.

For years, SQL developers relied on CASE statements or complex Common Table Expressions (CTEs) to handle these scenarios. They work, but it gets messy fast.

There’s a better way.

PostgreSQL (and some other databases) support the FILTER clause, a clean and efficient feature for conditional aggregation. In this guide, you’ll learn how to use it to simplify your queries, boost performance, and avoid common mistakes.

The Business Problem: An E-commerce Dashboard

Imagine you run an online store. To understand your business’s health, you need a monthly report that shows:

The total revenue from all orders.
The revenue generated specifically from “completed” orders.
The revenue lost from “cancelled” orders.

Getting this data in one clean report requires us to aggregate sums based on the status of each order.

Setting Up Our Example

First, let’s create the tables we need. We’ll have a customers table and an orders table.

-- Create a table for our customers
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    join_date DATE
);

-- Create a table for orders
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    order_status VARCHAR(20), -- e.g., 'completed', 'pending', 'cancelled'
    amount DECIMAL(10, 2),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

-- Insert some sample data to work with
INSERT INTO customers (customer_id, customer_name, join_date) VALUES
(1, 'John Smith', '2023-01-15'),
(2, 'Jane Doe', '2023-02-20');

INSERT INTO orders (order_id, customer_id, order_date, order_status, amount) VALUES
(101, 1, '2024-05-05', 'completed', 150.00),
(102, 2, '2024-05-10', 'completed', 200.50),
(103, 1, '2024-05-12', 'cancelled', 75.25),
(104, 2, '2024-06-01', 'completed', 300.00),
(105, 1, '2024-06-03', 'pending', 50.00),
(106, 2, '2024-06-08', 'cancelled', 120.75),
(107, 1, '2024-06-15', 'completed', 250.00);

-- Create a table for our customers
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    join_date DATE
);

-- Create a table for orders
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    order_status VARCHAR(20), -- e.g., 'completed', 'pending', 'cancelled'
    amount DECIMAL(10, 2),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

-- Insert some sample data to work with
INSERT INTO customers (customer_id, customer_name, join_date) VALUES
(1, 'John Smith', '2023-01-15'),
(2, 'Jane Doe', '2023-02-20');

INSERT INTO orders (order_id, customer_id, order_date, order_status, amount) VALUES
(101, 1, '2024-05-05', 'completed', 150.00),
(102, 2, '2024-05-10', 'completed', 200.50),
(103, 1, '2024-05-12', 'cancelled', 75.25),
(104, 2, '2024-06-01', 'completed', 300.00),
(105, 1, '2024-06-03', 'pending', 50.00),
(106, 2, '2024-06-08', 'cancelled', 120.75),
(107, 1, '2024-06-15', 'completed', 250.00);

Our goal is to create the following monthly summary from this data.

month	total_revenue	completed_revenue	cancelled_revenue
2024-05	425.75	350.50	75.25
2024-06	720.75	550.00	120.75

Solution 1: Using the CASE Statement

The most common method for PostgreSQL conditional count and sum was the CASE statement inside an aggregate function.

-- This query uses CASE to conditionally sum the amounts.
-- It works, but it can get messy.
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS month,
    -- Sum all orders for total revenue
    SUM(amount) AS total_revenue,
    -- Only sum if the status is 'completed'
    SUM(CASE WHEN order_status = 'completed' THEN amount ELSE 0 END) AS completed_revenue,
    -- Only sum if the status is 'cancelled'
    SUM(CASE WHEN order_status = 'cancelled' THEN amount ELSE 0 END) AS cancelled_revenue
FROM
    orders
GROUP BY
    month
ORDER BY
    month;

-- This query uses CASE to conditionally sum the amounts.
-- It works, but it can get messy.
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS month,
    -- Sum all orders for total revenue
    SUM(amount) AS total_revenue,
    -- Only sum if the status is 'completed'
    SUM(CASE WHEN order_status = 'completed' THEN amount ELSE 0 END) AS completed_revenue,
    -- Only sum if the status is 'cancelled'
    SUM(CASE WHEN order_status = 'cancelled' THEN amount ELSE 0 END) AS cancelled_revenue
FROM
    orders
GROUP BY
    month
ORDER BY
    month;

This query gets the job done. However, for every conditional sum, you have a CASE...WHEN...THEN...ELSE...END block. If you have many conditions, the SELECT list becomes long, hard to read, and difficult to maintain.

Solution 2: Using Common Table Expressions (CTEs)

Another approach is to use CTEs to pre-aggregate each status separately and then join them.

-- This query uses CTEs, which is even more verbose.
-- It breaks the logic into steps, but is much longer.
WITH monthly_totals AS (
    SELECT
        TO_CHAR(order_date, 'YYYY-MM') AS month,
        SUM(amount) AS total_revenue
    FROM orders
    GROUP BY month
),
completed AS (
    SELECT
        TO_CHAR(order_date, 'YYYY-MM') AS month,
        SUM(amount) AS completed_revenue
    FROM orders
    WHERE order_status = 'completed'
    GROUP BY month
),
cancelled AS (
    SELECT
        TO_CHAR(order_date, 'YYYY-MM') AS month,
        SUM(amount) AS cancelled_revenue
    FROM orders
    WHERE order_status = 'cancelled'
    GROUP BY month
)
SELECT
    mt.month,
    mt.total_revenue,
    COALESCE(c.completed_revenue, 0) AS completed_revenue,
    COALESCE(cn.cancelled_revenue, 0) AS cancelled_revenue
FROM
    monthly_totals mt
LEFT JOIN completed c ON mt.month = c.month
LEFT JOIN cancelled cn ON mt.month = cn.month
ORDER BY
    mt.month;

-- This query uses CTEs, which is even more verbose.
-- It breaks the logic into steps, but is much longer.
WITH monthly_totals AS (
    SELECT
        TO_CHAR(order_date, 'YYYY-MM') AS month,
        SUM(amount) AS total_revenue
    FROM orders
    GROUP BY month
),
completed AS (
    SELECT
        TO_CHAR(order_date, 'YYYY-MM') AS month,
        SUM(amount) AS completed_revenue
    FROM orders
    WHERE order_status = 'completed'
    GROUP BY month
),
cancelled AS (
    SELECT
        TO_CHAR(order_date, 'YYYY-MM') AS month,
        SUM(amount) AS cancelled_revenue
    FROM orders
    WHERE order_status = 'cancelled'
    GROUP BY month
)
SELECT
    mt.month,
    mt.total_revenue,
    COALESCE(c.completed_revenue, 0) AS completed_revenue,
    COALESCE(cn.cancelled_revenue, 0) AS cancelled_revenue
FROM
    monthly_totals mt
LEFT JOIN completed c ON mt.month = c.month
LEFT JOIN cancelled cn ON mt.month = cn.month
ORDER BY
    mt.month;

This approach is extremely verbose. It requires multiple scans of the orders table (though the optimizer might fix this) and complex joins. It’s simply too much code for such a common task.

The Better Way: The FILTER Clause

Now, let’s solve the same problem using the FILTER clause.

-- This is the modern, clean, and efficient way.
-- Notice how much easier it is to read.
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS month,
    -- Sum all orders for total revenue
    SUM(amount) AS total_revenue,
    -- The FILTER clause applies a WHERE condition to the aggregation
    SUM(amount) FILTER (WHERE order_status = 'completed') AS completed_revenue,
    SUM(amount) FILTER (WHERE order_status = 'cancelled') AS cancelled_revenue
FROM
    orders
GROUP BY
    month
ORDER BY
    month;

-- This is the modern, clean, and efficient way.
-- Notice how much easier it is to read.
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS month,
    -- Sum all orders for total revenue
    SUM(amount) AS total_revenue,
    -- The FILTER clause applies a WHERE condition to the aggregation
    SUM(amount) FILTER (WHERE order_status = 'completed') AS completed_revenue,
    SUM(amount) FILTER (WHERE order_status = 'cancelled') AS cancelled_revenue
FROM
    orders
GROUP BY
    month
ORDER BY
    month;

The result is identical, but the query is dramatically cleaner. The FILTER (WHERE ...) syntax is attached directly to the SQL aggregate function FILTER, making the intent crystal clear: “Sum the amount, but only for rows that meet this condition.” This makes it perfect for creating SQL pivot with filter-like reports.

FILTER vs. CASE: A Closer Look

Readability is a huge win, but what about performance?

In the Postgres FILTER vs CASE debate, according to jOOQ’s benchmark, FILTER often has a performance edge. While logically similar, FILTER provides a more direct instruction to the PostgreSQL query planner.

Why? The CASE statement is a general-purpose expression that can be used anywhere, so the optimizer has to work harder to understand its context within an aggregate function. The FILTER clause, however, was designed specifically for this purpose, allowing for a more direct and optimized execution path.

FILTER vs. CTEs

When it comes to CTE vs FILTER, the choice is even clearer for conditional aggregation. The CTE approach requires multiple subqueries and joins, which adds significant overhead both in terms of code length and planner complexity. The FILTER clause accomplishes the same goal in a single, streamlined scan of the table, making it vastly more efficient and maintainable.

Using FILTER with Multiple Conditions

What if you need to check multiple criteria? The FILTER clause supports standard AND and OR logic inside its WHERE clause, making a PostgreSQL filter multiple conditions check straightforward.

-- Example of filtering on multiple conditions
SELECT
    COUNT(*) FILTER (WHERE order_status = 'completed' AND amount > 200) AS large_completed_orders,
    COUNT(*) FILTER (WHERE order_status = 'pending' OR order_status = 'shipped') AS in_progress_orders
FROM
    orders;

-- Example of filtering on multiple conditions
SELECT
    COUNT(*) FILTER (WHERE order_status = 'completed' AND amount > 200) AS large_completed_orders,
    COUNT(*) FILTER (WHERE order_status = 'pending' OR order_status = 'shipped') AS in_progress_orders
FROM
    orders;

Conclusion

The PostgreSQL FILTER clause is more than just syntactic sugar. It is a powerful tool for writing cleaner, more readable, and more performant SQL. By allowing you to apply specific conditions directly to aggregate functions, it solves the problem of PostgreSQL’s conditional aggregation in a way that is far superior to older methods.

For any data analyst or developer working with PostgreSQL, mastering the FILTER clause is essential. It simplifies complex reports, speeds up queries, and makes your SQL code a joy to read and maintain. The next time you find yourself writing a CASE statement inside a SUM() or COUNT(), give FILTER a try. You won’t look back.

Finally, if you found this article helpful, you can check more here:

Want more dev insights like this? Subscribe to get practical tips, tutorials, and tech deep dives delivered to your inbox. No spam, unsubscribe anytime.