In a logical data model, a dataset is the basic organizational unit: a set of related facts, attributes, or both, stored together in the workspace.
Datasets are associated with each other through relations. A relation is a data model object that describes a relationship between one dataset and another. That relationship is important because it determines which data you can slice by which attributes when you build your own metrics using MAQL (Analytical Query Language).
A relation joins the two datasets through a single connection point. A connection point functions like a database primary key: it identifies the field in the originating dataset whose values uniquely identify each record in that dataset.
When a relation is made between an attribute dataset and a fact dataset, a foreign key field is inserted into the target dataset. This foreign key is populated with references to the primary key values in the dataset at the other end of the relation.
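The primary key and foreign key roles described above can be sketched in plain Python. This is not a GoodData API, only an illustration: the field names follow the sample model (Customer ID, Order Line ID, Quantity, Customer Name), while the row values and the helper `find_dangling_references` are hypothetical.

```python
# Illustrative only: plain Python structures, not a GoodData API.
# All row values below are made-up sample data.

# Attribute dataset: the connection point (Customer ID) acts as the primary key.
customers = {
    "C1": {"customer_name": "Alice"},
    "C2": {"customer_name": "Bob"},
}

# Fact dataset: each row carries a foreign key (customer_id) that references
# a primary key value in the customers dataset.
order_lines = [
    {"order_line_id": "OL1", "customer_id": "C1", "quantity": 2},
    {"order_line_id": "OL2", "customer_id": "C2", "quantity": 5},
]


def find_dangling_references(rows, foreign_key, referenced):
    """Return the rows whose foreign key has no matching primary key."""
    return [row for row in rows if row[foreign_key] not in referenced]


# An empty list means every foreign key points at an existing customer.
dangling = find_dangling_references(order_lines, "customer_id", customers)
```

A check like this mirrors why the connection point must be unique: each foreign key value has to resolve to exactly one record on the other side of the relation.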
GoodData also supports a special data model object for managing time-based data. The Date dataset can be added to your projects to manage attribute information and to enable aggregation at the day, week, month, quarter, and year level. To learn more about date datasets, see Dates in Logical Data Models.
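As a rough illustration of what aggregation levels mean, the sketch below derives day, week, month, quarter, and year labels from a single date using only the Python standard library. It assumes ISO 8601 week numbering and a hypothetical label format; the Date dataset in GoodData may define weeks and labels differently.

```python
from datetime import date


def date_attributes(d):
    """Derive aggregation-level labels for one date (ISO weeks assumed)."""
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": d.year,
    }


# Hypothetical sample date; any fact row linked to it could be grouped
# by any of the derived labels.
attrs = date_attributes(date(2024, 5, 17))
```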
In the modeler, colors help you focus on particular component types:
- Fact dataset (green): a dataset that contains at least one fact, which is numerical data stored for creating metrics in the project.
- Attribute dataset (orange): a dataset that contains attributes, which are numerical or text-based fields used to slice data in the project.
- Date dataset (blue): a special data model object for managing time-based data.
Example: Datasets in GoodData's LDM modeler
The following logical data model example, created in the LDM Modeler, displays six datasets with data from the GoodData Demo Workspace.
Fact datasets (green)
- Order Lines
primary key - Order Line ID
primary key - Campaign Channel ID
Attribute datasets (orange)
primary key - Customer ID
primary key - Product ID
primary key - Campaign ID
Date dataset (blue)
The direction of the arrow determines which dataset's data can be analyzed (sliced) by the data from the other dataset. For example, in the sample model, the relation between the Customer and Order Lines datasets allows you to slice Quantity by Customer Name.
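Slicing Quantity by Customer Name amounts to following each order line's foreign key to its customer and summing per name. The following is a minimal sketch in plain Python, not MAQL and not a GoodData API; the row values and the function name `slice_quantity_by_customer_name` are hypothetical.

```python
from collections import defaultdict

# Made-up sample rows; field names follow the sample model.
customers = {
    "C1": {"customer_name": "Alice"},
    "C2": {"customer_name": "Bob"},
}
order_lines = [
    {"order_line_id": "OL1", "customer_id": "C1", "quantity": 2},
    {"order_line_id": "OL2", "customer_id": "C2", "quantity": 5},
    {"order_line_id": "OL3", "customer_id": "C1", "quantity": 1},
]


def slice_quantity_by_customer_name(order_lines, customers):
    """Sum the Quantity fact, grouped by the Customer Name attribute."""
    totals = defaultdict(int)
    for line in order_lines:
        # Follow the foreign key to the related Customer record.
        name = customers[line["customer_id"]]["customer_name"]
        totals[name] += line["quantity"]
    return dict(totals)


quantity_by_customer = slice_quantity_by_customer_name(order_lines, customers)
```

The lookup only works in one direction: each order line resolves to a single customer, which is why the fact can be grouped by the attribute.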