Datasets in Logical Data Models
In a logical data model, a dataset is the basic organizational unit: a set of related facts, attributes, or both, stored together in the workspace.
Datasets are associated with each other through relationships. A relationship is a data model object that describes how one dataset is related to another. Relationships matter because they determine which data you can "slice" by which other data when building your own metrics using MAQL - Analytical Query Language.
A relationship joins the two datasets through a single connection point. A connection point functions like a database primary key: it is the field in the originating dataset whose values uniquely identify each record in that dataset.
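The uniqueness requirement for a connection point can be illustrated with a minimal Python sketch. The rows, field names, and the `is_valid_connection_point` helper are hypothetical and are not part of any GoodData API; the sketch only shows why an ID field qualifies as a connection point while a name field may not.

```python
from collections import Counter

# Hypothetical rows for a Customers dataset; field names are illustrative only.
customers = [
    {"customer_id": "C1", "customer_name": "Alice"},
    {"customer_id": "C2", "customer_name": "Bob"},
    {"customer_id": "C3", "customer_name": "Alice"},
]


def is_valid_connection_point(rows, field):
    """Return True if every row has a distinct, non-empty value in `field`."""
    values = [row.get(field) for row in rows]
    if any(value is None or value == "" for value in values):
        return False
    return all(count == 1 for count in Counter(values).values())


# The ID field identifies each record uniquely; the name field does not,
# because two customers share the name "Alice".
id_ok = is_valid_connection_point(customers, "customer_id")
name_ok = is_valid_connection_point(customers, "customer_name")
```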
When a relationship is made between an attribute and a fact dataset, a foreign key field is inserted into the target dataset. This foreign key is populated by references to the primary key values in the dataset at the other end of the relationship.
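As a rough illustration of how a foreign key points back at primary key values, the following Python sketch models two datasets as plain dictionaries. The data, the field names, and the `dangling_references` helper are assumptions made for this example, not GoodData structures.

```python
# Hypothetical attribute dataset, keyed by its primary key (Customer ID).
customers = {
    "C1": {"customer_name": "Alice"},
    "C2": {"customer_name": "Bob"},
}

# Hypothetical fact dataset. The "customer_id" field plays the role of the
# foreign key that the relationship adds to the target dataset.
order_lines = [
    {"order_line_id": "OL1", "customer_id": "C1", "quantity": 2},
    {"order_line_id": "OL2", "customer_id": "C2", "quantity": 5},
    {"order_line_id": "OL3", "customer_id": "C9", "quantity": 1},
]


def dangling_references(fact_rows, fk_field, primary_keys):
    """Return the fact rows whose foreign key matches no primary key value."""
    return [row for row in fact_rows if row[fk_field] not in primary_keys]


# OL3 refers to "C9", which does not exist in the Customers dataset.
broken = dangling_references(order_lines, "customer_id", customers)
```

Rows returned by a check like this could not be sliced by any customer attribute, because they do not resolve to a record at the other end of the relationship.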
GoodData also supports a special data model object for managing time-based data. The Date dataset can be added to your projects to manage attribute information and to enable aggregation at the day, week, month, quarter, and year level. To learn more about date datasets, see Dates in Logical Data Models.
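To make the aggregation levels concrete, here is a small Python sketch that derives day, week, month, quarter, and year labels from a single date. The `date_attributes` function and the label formats are hypothetical, and the sketch assumes ISO 8601 week numbering, which may differ from the week definition used in a given workspace.

```python
from datetime import date


def date_attributes(d):
    """Derive one label per aggregation level from a single calendar date."""
    # Assumption: ISO 8601 weeks (weeks start on Monday).
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": str(d.year),
    }


attrs = date_attributes(date(2023, 5, 17))
```

Grouping fact values by any one of these labels corresponds to aggregating at that level.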
In the modeler, the colors help you focus on particular component types.
- green
  A dataset that contains at least one fact, which is numerical data stored for purposes of creating metrics in the project.
- orange
  A dataset that contains attributes, which are numerical or text-based fields used to slice data in the project.
- blue
  A Date dataset, which manages time-based attribute information and enables aggregation at the day, week, month, quarter, and year level.
Example: Datasets in GoodData's LDM modeler
The following logical data model example, created in the LDM Modeler, displays six datasets with data from the GoodData Demo Workspace.
Fact datasets (green)
- Order Lines
  primary key - Order Line ID
  references
- Campaign/Channels
  primary key - Campaign Channel ID
Attribute datasets (orange)
- Customers
  primary key - Customer ID
- Products
  primary key - Product ID
- Campaigns
  primary key - Campaign ID
Date dataset (blue)
- Date
The direction of the arrow determines which dataset's data can be analyzed (sliced) by the data from the other dataset. For example, in the sample model, the relationship between the Customers and Order Lines datasets allows you to slice Quantity by Customer Name.
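Conceptually, slicing Quantity by Customer Name means following the reference from each order line to its customer and summing the fact per attribute value. The Python sketch below mimics that with made-up data; the values, field names, and the `slice_quantity_by_customer_name` function are illustrative assumptions, not how GoodData evaluates metrics internally.

```python
from collections import defaultdict

# Hypothetical Customers dataset: primary key (Customer ID) -> Customer Name.
customers = {"C1": "Alice", "C2": "Bob"}

# Hypothetical Order Lines dataset with a reference to Customers.
order_lines = [
    {"order_line_id": "OL1", "customer_id": "C1", "quantity": 2},
    {"order_line_id": "OL2", "customer_id": "C2", "quantity": 5},
    {"order_line_id": "OL3", "customer_id": "C1", "quantity": 1},
]


def slice_quantity_by_customer_name(lines, customer_names):
    """Sum the Quantity fact for each Customer Name attribute value."""
    totals = defaultdict(int)
    for line in lines:
        # Follow the reference from the order line to its customer.
        name = customer_names[line["customer_id"]]
        totals[name] += line["quantity"]
    return dict(totals)


quantity_by_customer = slice_quantity_by_customer_name(order_lines, customers)
```

The lookup works only from Order Lines toward Customers, which is the same constraint the arrow direction expresses in the model.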