Page tree
Skip to end of metadata
Go to start of metadata

Contents:

You can work with a dimensional model using the GoodData report building user interface. You have your data in a table (for example, an Excel spreadsheet, SQL result set, Salesforce entities etc), and an analytical engine turns your columns into facts and attributes. Thus facts can be summed and averaged, attributes can be counted, and all these measures can be sliced or filtered by the attribute columns.

Example:

The table contains support ticket data. It can be represented as a star schema with:

  • Fact columns for “Replies”, “Time to 1st Reply” and “Time to Resolve”
  • The “Ticket” connection point
  • Attributes for “User”, “Employee” and “Status”

GoodData uses a general term “data set” for a group of attributes and facts that were created from a single table. This article also uses the term “data set”, as well as “fact table” or “dimension” terms to indicate the data set characteristics.

Example:

You can connect one data set to another if the underlying data form a natural hierarchy.

You can add:

  • A new dimension with “User” connection point and “Account” attribute and connect it to “Ticket”.
  • Date dimensions. For example, two date dimensions, “Created” and “Closed”, which include attributes like “Week”, “Month”, “Year” and so on.

Many to Many Relations

You can also analyze data that do not form a hierarchy.

Example:

Tickets are organized using tags. A ticket can be tagged with multiple tags and multiple tickets can be tagged with a single tag. The resulting logical data model may look like this:

The facts are not referenced by tags directly and you cannot automatically slice and filter by the Tag attribute. For example, you cannot determine the time to resolve of just those tickets tagged urgent.

The ticket facts can be sliced by the Tag attribute with a special metric that computes the aggregation for every row in the Ticket Tag Association table. In other words, it treats the association table as a fact table with the ticket specific metric value (for example, average resolution time) duplicated for each ticket in that table. This solution is described in the Modeling Many-to-Many relationships in GoodData article.

To filter by a tag, you include only tickets that have at least one record in the association table that points to the selected tag.

This condition can be expressed by the following MAQL expression:

... WHERE (SELECT COUNT(Ticket Tag Association) 
	BY Ticket, ALL OTHER) > 0

For example, to show an average time to first reply and to reflect a dashboard filter based on the Tag attribute:

SELECT AVG(Time to 1st Reply) 
	WHERE (SELECT COUNT(Ticket Tag Association)
		BY Ticket, ALL OTHER) > 0

You can also implement this filtering rule by creating a separate metric such as SELECT COUNT(Ticket Tag Association) BY Ticket, ALL OTHER and use that metric in a report level numeric range filter. This solution allows an analyst to run ad hoc reports using the report building page without being limited by the pre-built dashboard reports.

To make this technique fully functional, ensure that every ticket has at least one tag, otherwise untagged tickets are excluded from all computations when no tag was selected - this is because the COUNT submetric always returns zero for untagged tickets.

To do this, add a special empty tag value before loading data into GoodData (i.e., during ETL) that associates untagged tickets to this extra tag. This ensures that untagged tickets are included in computations and end users are able to filter for untagged tickets just by selecting that empty item from the Tag filter drop-down.

Independent Fact Tables

The most typical example of a multi-hierarchical model is a situation when the reporting requirements lead to multiple independent fact tables. When analyzing a survey results, the model usually looks like this:


For a hotel customer satisfaction survey, the Choice Responses data set may include responses to questions such as “What was the purpose of stay?” with options such as “Business”, “Leisure”. Rating Responses may include a response to Questions like “What was your overall satisfaction?”

To create a report that compares the overall satisfaction between “Business” and “Leisure” customers:

SELECT AVG(Rating) WHERE Question = 'What was your overall satisfaction?' AND 
	(SELECT COUNT(Choice Responses)
		BY Survey Response, ALL OTHER
		WHERE Question = 'What was the purpose of stay?'
			AND Option = 'Business')
	> 0

For better readability, split the metric into a number of separate metrics, such as Overall Satisfaction and # Purpose of Stay Responses:

SELECT Overall Satisfaction
	WHERE (SELECT # Purpose of Stay BY Survey Response, ALL OTHER
			WHERE Option = 'Business') > 0

You can exclude the Option = ‘Business’ expression and leave the “Purpose of Stay” filter up to a dashboard filter.

Filtering by a sub metric is not specific to COUNT sub-metrics. You can use the same technique for reports that compute a number of unsatisfied business travelers.

For example, you can count the survey responses from those whose purpose of stay is business and whose overall satisfaction rating is below 7 out of 10:

SELECT COUNT(Survey Response, Choice Responses)
WHERE (SELECT AVG(Rating) BY Survey Response, 
ALL OTHER WHERE Question = 'What was your overall satisfaction?') < 7