Datasets in Logical Data Models
In a logical data model (LDM), a dataset is the basic organizational unit: a set of related facts, attributes, or both that are stored together in the workspace.
Datasets are associated with each other through relationships. A relationship describes how one dataset is related to another. Relationships are important because they determine which data you can “slice” by which other data when building your own metrics using MAQL - Analytical Query Language.
A relationship joins two datasets through a single connection point. A connection point functions like a database primary key: it is the field in the originating dataset whose values uniquely identify each record in that dataset.
When a relationship is made between an attribute and a fact dataset, a foreign key field is inserted into the target dataset. This foreign key is populated by references to the primary key values in the dataset at the other end of the relationship. For more information, see Connection Points in Logical Data Models.
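The primary key and foreign key behavior described above can be illustrated with a minimal Python sketch. It is not GoodData code: the row layout, the field names (`customer_id`, `order_line_id`, `quantity`), and both helper functions are hypothetical and exist only to show the idea of a unique connection point and of foreign key values that reference it.

```python
# Hypothetical sample rows; names and values are illustrative only.
customers = [  # attribute dataset, connection point: customer_id
    {"customer_id": 1, "customer_name": "Alice"},
    {"customer_id": 2, "customer_name": "Bob"},
]
order_lines = [  # fact dataset, carries the foreign key customer_id
    {"order_line_id": 10, "customer_id": 1, "quantity": 3},
    {"order_line_id": 11, "customer_id": 2, "quantity": 5},
]


def is_valid_connection_point(rows, key):
    """Return True if every row has a distinct value for `key`."""
    values = [row[key] for row in rows]
    return len(values) == len(set(values))


def dangling_references(source_rows, key, target_rows, foreign_key):
    """Return target rows whose foreign key matches no source key value."""
    known = {row[key] for row in source_rows}
    return [row for row in target_rows if row[foreign_key] not in known]


print(is_valid_connection_point(customers, "customer_id"))
print(dangling_references(customers, "customer_id", order_lines, "customer_id"))
```

A field with repeated values would fail the first check and could not serve as a connection point in this sketch. Rows returned by the second check would be facts that cannot be attributed to any record on the other end of the relationship.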
GoodData also supports a special data model object for managing time-based data. The Date dataset can be added to your workspaces to manage attribute information and to enable aggregation at the day, week, month, quarter, and year level. To learn more about date datasets, see Dates in Logical Data Models.
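To make the aggregation levels concrete, the following Python sketch derives day, week, month, quarter, and year labels from a single calendar date. It is an illustration only: the function name `date_attributes`, the label formats, and the use of ISO 8601 week numbering are assumptions, and the Date dataset in GoodData may define weeks and labels differently.

```python
from datetime import date


def date_attributes(d: date) -> dict:
    """Derive one label per aggregation level from a calendar date.

    Assumption: weeks follow ISO 8601 numbering (weeks start on Monday).
    """
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": str(d.year),
    }


print(date_attributes(date(2024, 5, 15)))
```

Each label can then act as an attribute value for grouping facts, which is the role the Date dataset plays in the model.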
In the LDM Modeler, datasets are color-coded to help you focus on particular component types:
- green: A dataset that contains at least one fact, which is numerical data stored for the purpose of creating metrics in the workspace.
- orange: A dataset that contains only attributes, which are numerical or text-based fields used to slice data in the workspace.
- blue: A Date dataset, the special data model object for managing time-based data. It manages date attribute information and enables aggregation at the day, week, month, quarter, and year level.
Example: Datasets in the GoodData LDM Modeler
The following logical data model example created in the LDM Modeler displays six datasets with data from the GoodData Demo Workspace.
Fact datasets (green)
- Order Lines primary key - Order Line ID references
- Campaign/Channels primary key - Campaign Channel ID
Attribute datasets (orange)
- Customers primary key - Customer ID
- Products primary key - Product ID
- Campaigns primary key - Campaign ID
Date dataset (blue)
- Date
The direction of the arrow determines which dataset’s data can be analyzed (sliced) by the data from the other dataset. For example, in the sample model, the relationship between the Customers and Order Lines datasets allows you to slice Quantity by Customer Name.
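As a rough illustration of what slicing Quantity by Customer Name means, the Python sketch below sums a fact per attribute value by following the foreign key from each Order Lines row to its Customers row. The sample rows, field names, and the function `slice_quantity_by_customer_name` are hypothetical. In a workspace, this aggregation is expressed as a metric rather than as application code.

```python
from collections import defaultdict

# Hypothetical sample rows; names and values are illustrative only.
customers = [
    {"customer_id": 1, "customer_name": "Alice"},
    {"customer_id": 2, "customer_name": "Bob"},
]
order_lines = [
    {"order_line_id": 1, "customer_id": 1, "quantity": 3},
    {"order_line_id": 2, "customer_id": 1, "quantity": 2},
    {"order_line_id": 3, "customer_id": 2, "quantity": 7},
]


def slice_quantity_by_customer_name(customers, order_lines):
    """Sum the quantity fact for each customer name via the foreign key."""
    name_by_id = {c["customer_id"]: c["customer_name"] for c in customers}
    totals = defaultdict(int)
    for line in order_lines:
        totals[name_by_id[line["customer_id"]]] += line["quantity"]
    return dict(totals)


print(slice_quantity_by_customer_name(customers, order_lines))
```

The lookup works in only one direction: every order line points to exactly one customer, so each quantity can be attributed to a single customer name.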