How the FILTER function affects your report's performance

Introduction

Hey everyone! Today, let's research about the FILTER function in DAX. It's basically the "bread and butter" for anyone starting out with Power BI because it's just so intuitive. Most of us, when we first touch DAX, end up writing something like this:
CALCULATE(SUM(SALESTABLE[REVENUE]),FILTER(COUNTRY,COUNTRY = "USA"))
But did you know there are two other ways to get the same result that are much "cleaner"?
  • Using Boolean:
        CALCULATE(SUM(SALESTABLE[REVENUE]),COUNTRY = "USA")
  • Using KEEPFILTERS:
        CALCULATE(SUM(SALESTABLE[REVENUE]),KEEPFILTERS(COUNTRY = "USA"))
So, why do we have so many options? And what exactly is happening "under the hood" of Power BI when we use FILTER? Let's dive in and find out!

Keywords

BI - Power BI, Tuning tool - DAX Studio, Database - Databricks (DirectQuery), Function focus - FILTER

Methodology & Dataset

Methodology

To figure out which method is the performance king, I'm going to compare these 3 DAX formulas based on:
  1. Physical Query Plan in DAX Studio: to see how the engine "think"
  2. Number of queries sent back to Databricks: generally, fewer and lighter queries mean a snappier report.
  3. Time to complete the visual: the objective is fast and faster
  4. The complexity of queries: in this perspective, simple is the best
Note that, the table mode used in this research is DirectQuery.

Dataset

I'm using a sample dataset from the Databricks Free Edition with three tables:
dim: sales_customers
dim: sales_franchises
fact: sales_transactions
The relationship among 3 tables is described as the diagram belows:
And this table is our target result:

Performance Showdown (Tuning)

1. Using FILTER

DAX Formula:
USA Transaction = CALCULATE([Transaction],FILTER(sales_customers,sales_customers[country]="USA"))
How it works: Here, FILTER acts as an iterator. It literally goes through the sales_customers table row by row to check if the country is "USA."
Physical Query Plan:
Look at the line 10, we have #FieldCols=2 -> They are 2 columns: franchises name and city for out table visual. #ValueCols=1 -> It's the measure [USA Transaction] that we are tuning. In this query plan, it took 17 steps and retrieved 968 records.
SQL query sent to the database:
Result: It sends 1 query to the database, 17 steps, 968 rows retrieved, 1,265 ms
The Semijoin you see in the SQL trace is the tax you pay for using the FILTER function incorrectly. By forcing Power BI to create a virtual table (semijoin1), you're essentially telling the database to work twice as hard.

2. Using Boolean

DAX Formula:
USA Transaction = CALCULATE([Transaction],sales_customers[country]="USA")
How it works: Power BI is smart. It internally converts this into:
FILTER(ALL(sales_customers[country]), sales_customers[country]="USA"). Instead of scanning the whole table, it only looks at the "country" column.
Physical Query Plan:

SQL query sent to the database:
Result: It sends 1 query to the database, 11 steps, 501 rows retrieved, 854 ms
Switching to a Boolean filter allows the engine to perform Filter Pushdown, moving that 'USA' condition directly into the main WHERE clause and eliminating the need for that costly join entirely.

3. Using KEEPFILTERS

DAX Formula:
USA Transaction = CALCULATE([Transaction],KEEPFILTERS(sales_customers[country]="USA")
How it works: This one is the "polite" version. It doesn't overwrite existing filters from slicers; instead, it plays nice and combines (ANDs) with them.
Physical Query Plan:
SQL sent to the database:
Result: Similar to the Boolean approach.

Discussion: Which one should you use?

In summary, Boolean Filters/ KEEPFILTERS outshine the FILTER function in DirectQuery environments. While the FILTER function creates more execution steps, resource-heavy Virtual Tables, triggers complex SQL, higher CPU overhead and longer running time -> This makes the report load longer and increases costs.

-> Stop overusing FILTER on whole tables: The FILTER function is an Iterator. In a DirectQuery environment, it often forces the engine to create a "Virtual Table" before the final calculation. Unless you need to filter based on complex logic across multiple columns or need specific row-context calculations, avoid FILTER(Table, condition). It forces Power BI to "chew" through the entire table, which is exhausting for the engine!
-> Boolean is your best friend: For simple conditions (like Country = "USA"), just write it directly in CALCULATE. The code is cleaner, faster, simpler SQL query and Power BI can optimize by looking only at the necessary columns.
-> Use KEEPFILTERS to preserve context: If you want to make sure your results don't act weird when users start clicking on other slicers, KEEPFILTERS is a safe bet and still performs great the same as Boolean.
Hope this helps you speed up those heavy reports!