As my experience goes, there is one option in PowerBI, that people rarely use. But for some of us, especially with some database performance tuning background, this is quite an important topic. For one very important reason – turning it on can make your direct queries to the engine faster. Here are some important details before we continue:
- Assume referential integrity (ARI) is set on Relationship level (details here)
- ARI works only in DirectQuery mode
- ARI can potentially speed up queries by converting LEFT OUTER JOINs to INNER JOINs
- ARI can generate inconsistencies (details here)
- By default ARI is turned off on all relationships
The links above describe the HOW and what are the pros and cons about setting ARI. However, I do have several years of database and performance tuning background, so I wanted to see how this looks under the covers. So I fired up my management studio and started some tracing.
I am working with Adventure Works database and my model looks like this:
Respectively, if I create a visual, that takes the Sum of TaxAmt and slices it by Sales Territory Name and Customer Account Number, the query with ARI turned off looks like this:
The moment ARI is turned on, all queries, that involve tables, that have Active relationship between each other, automatically get INNER JOIN generated by PBI and sent to the database engine. In my model, when I turned ARI on, I automatically got some improvement in the cost of the query (this is the actual execution plan, where you can see the difference):
If you are interested in performance tuning, you can definitely dig inside and try to optimize things from the database end, however, there is not much you can do on how queries are generated. In any case, you must be aware of how ARI works, what are its complications and potential benefits. And be very careful about consistency of your results!