Datasets in Logical Data Models

In a logical data model, a dataset is the basic organizational unit: a set of related facts, attributes, or both, stored together in the workspace.

Datasets are associated with each other through relationships. A relationship is a data model object that describes how one dataset is related to another. Relationships matter because they determine which data you can "slice" by which other data when you build your own metrics using MAQL (Analytical Query Language).

A relationship joins the two datasets through a single connection point. A connection point functions like a database primary key: it is the field in the originating dataset whose values uniquely identify the data in the other fields of that dataset.

When a relationship is made between an attribute dataset and a fact dataset, a foreign key field is inserted into the target dataset. This foreign key is populated with references to the primary key values in the dataset at the other end of the relationship.
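The connection point and foreign key behavior described above can be sketched in plain Python. This is only an illustration of the concept, not how GoodData stores data: the field names, sample rows, and both helper functions are hypothetical.

```python
# Hypothetical sample rows; field names are illustrative only.
customers = [
    {"customer_id": 1, "customer_name": "Alice"},
    {"customer_id": 2, "customer_name": "Bob"},
]

# Each order line carries a foreign key (customer_id) that references
# the connection point of the customers dataset.
order_lines = [
    {"order_line_id": 10, "customer_id": 1, "quantity": 3},
    {"order_line_id": 11, "customer_id": 2, "quantity": 4},
    {"order_line_id": 12, "customer_id": 1, "quantity": 2},
]


def is_connection_point(rows, field):
    """Return True if `field` uniquely identifies every row."""
    values = [row[field] for row in rows]
    return len(values) == len(set(values))


def dangling_references(child_rows, fk_field, parent_rows, pk_field):
    """Return foreign key values in child_rows that match no primary key."""
    primary_keys = {row[pk_field] for row in parent_rows}
    return sorted({row[fk_field] for row in child_rows} - primary_keys)
```

In this sketch, `customer_id` qualifies as a connection point for `customers` because its values are unique, while the same field in `order_lines` repeats and therefore acts only as a foreign key.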

GoodData also supports a special data model object for managing time-based data. The Date dataset can be added to your projects to manage attribute information and to enable aggregation at the day, week, month, quarter, and year level. To learn more about date datasets, see Dates in Logical Data Models.
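As a rough illustration of what aggregation at these levels means, the sketch below derives day, week, month, quarter, and year labels from a single date. The function name and label formats are hypothetical, and the use of ISO week numbering is an assumption; the Date dataset may define weeks differently.

```python
from datetime import date


def date_attributes(d):
    """Derive illustrative aggregation labels for one calendar date."""
    # Assumption: ISO 8601 week numbering (weeks start on Monday).
    iso_year, iso_week, _ = d.isocalendar()
    quarter = (d.month - 1) // 3 + 1
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{quarter}",
        "year": str(d.year),
    }


labels = date_attributes(date(2024, 5, 15))
```

Grouping facts by any one of these labels yields totals at that level of granularity.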

In the modeler, the colors help you focus on particular component types.

  • green
    A dataset that contains at least one fact, which is numerical data stored for purposes of creating metrics in the project.
  • orange
    A dataset that contains attributes, which are numerical or text-based fields used to slice data in the project.
  • blue
    A Date dataset, which is the special data model object for managing time-based data.

Example: Datasets in GoodData's LDM modeler

The following logical data model example, created in the LDM Modeler, displays six datasets with data from the GoodData Demo Workspace.

Fact datasets (green)

  • Order Lines
    primary key - Order Line ID
    references
  • Campaign/Channels
    primary key - Campaign Channel ID

Attribute datasets (orange)

  • Customers
    primary key - Customer ID
  • Products
    primary key - Product ID
  • Campaigns
    primary key - Campaign ID

Date dataset (blue)

  • Date

The direction of the arrow determines which dataset's data can be analyzed (sliced) by the data from the other dataset. For example, in the sample model, the relationship between the Customers and Order Lines datasets allows you to slice Quantity by Customer Name.
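Slicing Quantity by Customer Name amounts to following the relationship from each order line to its customer and summing per customer. A minimal Python sketch of that idea follows; the sample rows, field names, and function name are hypothetical, and the platform itself expresses this through metrics rather than code like this.

```python
from collections import defaultdict

# Hypothetical sample rows; field names are illustrative only.
customers = [
    {"customer_id": 1, "customer_name": "Alice"},
    {"customer_id": 2, "customer_name": "Bob"},
]
order_lines = [
    {"order_line_id": 10, "customer_id": 1, "quantity": 3},
    {"order_line_id": 11, "customer_id": 2, "quantity": 4},
    {"order_line_id": 12, "customer_id": 1, "quantity": 2},
]


def slice_quantity_by_customer_name(order_lines, customers):
    """Sum the quantity fact for each customer name via the foreign key."""
    name_by_id = {c["customer_id"]: c["customer_name"] for c in customers}
    totals = defaultdict(int)
    for line in order_lines:
        totals[name_by_id[line["customer_id"]]] += line["quantity"]
    return dict(totals)


totals = slice_quantity_by_customer_name(order_lines, customers)
```

The lookup only works in one direction: each order line points to exactly one customer, so the fact can be grouped by the customer's attributes.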
