Project performance, meaning the time required to render reports and dashboards, hinges on the design of the logical data model and metric complexity. Following the best practices described in this article, you will be able avoid the pitfalls most often responsible for performance degradation. This article will also discuss how to use the new LDM Designer in CloudConnect to apply the necessary model changes.
- Large tables
Size is determined by both column and row count
How to check: Create a simple metric counting rows in your data set [SELECT COUNT (connection point attribute)]. Use this metric in a report and apply random filters (this will prevent cache hits). If this trivial report exceeds reasonable runtimes, find a way to reduce the table size.
- Table joins
Joining one or more large data sets adds to query complexity
How to check: Create a simple metric counting rows in your data set [Select Count (connection_point_attribute)]. Use this metric in a two reports, one sliced by an attribute from the same data set and one sliced by an attribute from the joined data set. If the runtime of the report that requires the join is significantly longer than the one without, investigate methods for eliminating the join.
- Multi-query reports
Reports that display metrics computed from independent data sets require multiple queries
How to check: Create a report with two metrics based on facts in different data sets and slice by a shared attribute. Measure the report load time with both metrics and then rerun the report with only one of the metrics. If the report including both metrics requires a significantly longer run-time, try splitting metrics across multiple reports.
A unique subtotal type that requires re-computation of queries based on report attribute values
How to check: Identify a report that utilizes Rollups as a subtotal type and requires excessive runtime. Run the report with no Rollup to determine if runtime is reduced.
Reduce Data Volume
Change the table granularity by aggregating over time periods or over attributes. Table size is dependent on column and row count, so minimize size by excluding unnecessary fields or utilizing attribute labels, i.e. UserName for UserID. Volume reduction in fact tables is also possible by storing data for only a specific time period, i.e. the last 6 months or past year.
Minimize Joins by Denormalizing
Queries suffer performance penalties as a result of table joins. Furthermore, these penalties correlate with table size so the larger the tables, the longer the queries take to run. Denormalize, meaning merge related fields into the same data set, to expedite query execution.
Simplify Reports Multiple Queries
Reports designed to display multiple metrics calculated from different data sets will require multiple table scans. If report rendering times exceed desired times, troubleshoot by splitting the metrics into separate reports and running separately. Similarly, applying attribute filters to metrics calculated from different data sets will require multiple table scans.
Subtotal computations demand little with one exception, the Rollup. Other subtotal functions like average, sum, and minimum can be computed from a single query; however, rollups, which are typically used for non-additive facts, require recomputing metrics at the attribute level. This added complexity can require additional table scans and decrease rendering speeds.
Using the LDM Modeler
If you already use GoodData for your reporting needs and want to update your existing data model, Release 87 of CloudConnect provided this functionality. To perform edits, follow this protocol:
- Enter project in CloudConnect
- Create new logical data model
- Import data model from existing GoodData project
Click and drag attributes/facts to move between data sets
Ex: Moving Fact 2 from Data Set 1 to Data Set 2
- Add or remove facts, attributes, data sets or dates
- Select Publish model to server
Ex: Fact 2 Moved from Data Set 1 to Data Set 2 and Attribute E added to Data Set 2
- Review changes
Low Impact - Non-destructive changes, typically additions or name changes
High Impact - Model alterations that affect referential integrity or are destructive
DataPerforms updates without changing the project’s existing data.
Updates model without regarding data. Data discarded as needed.