Introduction
In the world of optimizations, DirectQuery has limited amount of optimizations, we are going to talk about one method to optimize DirectQuery mode in this article. Power BI's automatic aggregations offer a game-changing solution, helping businesses optimize query speeds while maintaining data accuracy.What Are Automatic Aggregations?
Automatic aggregations in Power BI allow the system to intelligently create aggregated tables that improve query performance. This means that instead of scanning an entire dataset for each query, Power BI can pull summarized data from predefined aggregations, making dashboards load faster and reducing computational strain.Prerequisites
Supported data sources
Automatic aggregations are supported for the following data sources:
- Azure SQL Database
- Azure Synapse Dedicated SQL pool
- SQL Server 2019 or later
- Google BigQuery
- Snowflake
- Databricks
- Amazon Redshift
Permissions
To enable and configure automatic aggregations, you must be the Model owner. Workspace admins can take over as owner to configure automatic aggregations settings and the workspace should be in a premium capacity.
Why Should You Use Automatic Aggregations?
- Performance Optimization: Queries run significantly faster, improving the overall user experience.
- Lower Costs: Reduced computing power translates to savings in memory and processing expenses.
- Effortless Scaling: As datasets grow, automatic aggregations ensure smooth performance without manual intervention.
How Automatic Aggregations Work?
Power BI analyses the query workload and determines which aggregations will be most beneficial. It then creates summary tables based on commonly accessed query patterns. The system keeps the data updated, ensuring that users see relevant and accurate insights without needing to manually configure aggregation logic.When working with DirectQuery in Power BI, every time a user opens a report or interacts with a visualization, the system generates Data Analysis Expressions (DAX) queries, which then translate into SQL queries sent to the backend data source. This process requires the source to calculate and return results for each query, which can be time-consuming and resource-intensive compared to the in-memory efficiency of import mode models. However, enabling automatic aggregations for a DirectQuery model can significantly improve query performance by minimizing these frequent round trips to the data source. Instead of repeatedly fetching raw data, Power BI intelligently retrieves pre-aggregated results from an in-memory cache, allowing reports to load faster while reducing the computational burden on backend systems. Since this cache holds only a fraction of the original dataset's size, it smartly balances performance gains with storage efficiency. The beauty of this feature lies in its ability to seamlessly return results while ensuring that only queries requiring data beyond the cached aggregations get pushed to the source—creating a smoother, faster experience for users without compromising analytical depth
Setting Up Automatic Aggregations
To enable automatic aggregations in Power BI:
- Enable Aggregations: Configure the dataset for automatic aggregation in the dataset settings settings.
- Schedule refresh : Configure scheduled refresh on the dataset to train the model
How to improve the Aggregation process?
Adjusting the query coverage percentage
Increasing this percentage allows more queries to be answered from the cache, but it can lead to longer training and refresh times. By default, 75% of report queries in Power BI use the in-memory aggregations cache, improving performance by reducing trips to the backend data source. Lowering it speeds up these processes and reduces resource usage, but may slow down report visualization as more queries require direct access to the data source. To ensure optimal cache efficiency, allow multiple training/refresh iterations, so the system can analyse query patterns over time. If refreshes are scheduled daily, consider waiting a week to account for variations in user reporting behaviour. Adjustments should be carefully timed to balance speed, resource usage, and query responsiveness.
Estimate query performance impact
The Query Performance Impact Lift chart estimates report query run times based on the percentage of queries using the cached aggregations. Initially, it displays zero values until at least one training/refresh operation occurs. After this initial step, the chart helps identify whether adjusting the percentage of queries relying on the in-memory cache could further enhance performance. A threshold marker on the chart indicates the target query response time, enabling users to fine-tune the cache percentage to achieve the desired balance between speed and resource efficiency.Final Thoughts
Automatic aggregations in Power BI help businesses streamline operations, enhance performance, and optimize costs. Whether working with complex financial data, sales insights, or customer trends, automatic aggregations ensure that users get the information they need fast and efficiently.
for more information, refer to this link