Page tree
Skip to end of metadata
Go to start of metadata

For information about what Fact Table Grain is, see Set the Grain of a Fact Table to Avoid Duplicate Records.

For information about working with Fact Table Grain via the CloudConnect LDM Modeler, see this article.

Setting the Fact Table Grain in an Existing Project

Setting grain instead of connection points in an existing project will consist of the following steps:

  1. Get the identifier of the attribute with the connection point you want to replace by grain. It usually looks like {attr.datasetname.attributename}. Then get the identifier of the connection point itself which is of form {label.datasetname.labelname}. Both identifiers can be obtained from the LDM modeler or from the REST API resource for the appropriate attribute. Learn more about Acquiring identifiers.
  2. Get identifiers of the attributes you want to add to the grain. They must be directly connected to the first one, i.e. there is an edge between them or they are part of the same dataset.
  3. Construct MAQL DDL script:
    1. Drop the connection point.
    2. Add grain definition to the attribute from the first step.
    3. Synchronize the dataset.
  4. Adjust your ETL. You do not have to generate the connection point data, so you may remove that code. This would add some additional performance benefit, however it is not mandatory. The ETL would work even if you did not remove that.

Fact Table Grain Implementation Examples

MAQL DDL syntax

Create or set grain for an attribute

CREATE|ALTER ATTRIBUTE {attr.myfacttable.factsof} ... [SET GRAIN {attr.mydimension.myattr1}, {attr.myfacttable.myattr2}, ...];

SYNCHRONIZE {dataset.myfacttable} [PRESERVE DATA];


Remove grain

ALTER ATTRIBUTE {attr.myfacttable.factsof} UNSET GRAIN;

SYNCHRONIZE {dataset.myfacttable} [PRESERVE DATA];

 

{attr.myfacttable.factsof}

The attribute identifier that represents the fact table. This field is created automatically when you create a fact table, and acts as its primary key. Note that your fact table must not be referenced from any other datasets. There must also be no connection point: when you navigate to manage section and look for this “Records of” attribute, the list of labels must be empty.

 

SET GRAIN {attr.mydimension.myattr1}, {attr.myfacttable.attr2}, …

Specify attributes to define the grain of {attr.myfacttable.factsof}. These attributes become the new compound key for the fact table. The attributes must always occur in a unique combination for the entire fact table, and must be directly related to {attr.myfacttable.factsof} in the LDM.

It is possible to use attributes from the dataset of the fact table (i.e. {dataset.myfacttable}) or connection points of directly connected dataset or dimension, sometimes called references, including references to date dimensions. See 'Example 2. Existing LDM development: References in fact table grain' below.

Alter existing grain definition

If you must change the grain definition (for example, due to changes in your internal processes), check the current settings on the API and construct a MAQL DDL ALTER command.

Read existing GRAIN settings

To uncover the list of attributes forming the grain of a fact table, contact GoodData REST API at URI /gdc/md/{project_id}/obj/{attribute_id}/?grain=1 where {attr-id} is the numeric identifier of the anchor attribute of the fact table to which the grain was set up. The API will return the structure containing a list of attributes in the grain:

GET /gdc/md/{project-id}/obj/{attr-id}/?grain=1
 
{
   "attribute" : {
      "content" : {
         "grain" : [
            "/gdc/md/{project-id}/obj/101",
            "/gdc/md/{project-id}/obj/102"
         ],
         "pk" : [
            {
               "data" : "/gdc/md/{project-id}/obj/201",
               "type" : "col"
            }
         ],
         "displayForms" : []
      },
      "meta" : {
         "uri" : "/gdc/md/{project-id}/obj/{attr-id}",
         "identifier" : "attr.myfacttable.factsof",
         "title" : "My Fact Table"
      }
   }
}

Construction of ALTER command

After completing ‘Read Existing Grain Settings’ you now have a list of attributes that are in the grain. We may GET their textual identifiers directly from the URI’s we got in the response, or use the more efficient resource /gdc/md/{project-id}/identifiers. In the example above there are two attributes in GRAIN of My Fact Table:

  • attr.mydimension.myattr1 for /gdc/md/{project-id}/obj/101,
  • attr.myfacttable.myattr2 for /gdc/md/{project-id}/obj/101.

Now we are going to add a third attribute to the grain:

ALTER ATTRIBUTE {attr.myfacttable.factsof} SET GRAIN {attr.mydimension.myattr1}, {attr.myfacttable.myattr2}, {attr.myfacttable.myattr3};

We just repeated the attributes that are already in the grain, and added one more to the list. Similarly we can also drop an attribute from grain if it is not needed.

A primary key check is performed. Ensure the new grain is unique if an attribute was removed.

 

You can now check the recorded changes by repeating GET request to the attribute metadata.

MAQL DDL examples

This example defines a new fact table Account and sets its grain:

CREATE ATTRIBUTE {attr.account.factsof} VISUAL (TITLE “Account”) AS {f_account.id} PRIMARY SET GRAIN {attr.account.personid}, {date.created}, {attr.project.projectid};

 

This example changes the grain for existing attribute Account:

ALTER ATTRIBUTE {attr.account.factsof} SET GRAIN {attr.account.personid}, {date.created};

 

This example removes grain settings from attribute Account:

ALTER ATTRIBUTE {attr.account.factsof} UNSET GRAIN;

 

Example 1. New LDM development: adding grain to an attribute without a connection point

The following data model example includes the attributes Invoice number, Line number, and Item, and the facts Quantity and Price.

Each Invoice line must have a unique combination of Invoice number and Line number, so these attributes always occur in a unique combination for the fact table.The dataset Invoice Line might look like this:

Invoice numberLine numberItemQuantityPrice
3001plastic pumpkin772
3002pirate costume150
3003lollipops, 1 kg16
3011glow sticks50100
3012pirate costume150

In this example, the price of the first 72 first records is wrong. If you upload correct data (300, 1, plastic pumpkin, 7, 70) without setting the grain, this record will be added as new row, making your data incorrect. To avoid this, set the grain for Invoice Line with this MAQL DDL:

ALTER ATTRIBUTE {attr.invoiceline.factsof} SET GRAIN {attr.invoiceline.invoicenumber}, {attr.invoiceline.linenumber};

SYNCHRONIZE {dataset.invoiceline} PRESERVE DATA;

 

Now if you upload data (300, 1, plastic pumpkin, 7, 70), GoodData replaces the data in the first row instead of adding a new row.

Example 2. Existing LDM development: References in fact table grain

Using the model from 'Example 1. New LDM development: adding grain to an attribute without a connection point' (see above) we now need to adjust the grain to contain the reference to the Item dataset in addition. Because its connection point is Item number we will need to find the identifier of this attribute. It can be found in the LDM modeler in the right pane; for example {attr.item.itemnumber}. To add it into the fact table’s grain, we must construct the following MAQL DDL script:

ALTER ATTRIBUTE {attr.invoiceline.factsof} SET GRAIN {attr.invoiceline.invoicenumber}, {attr.invoiceline.linenumber}, {attr.item.itemnumber};

SYNCHRONIZE {dataset.invoiceline} PRESERVE DATA;

 

In this example the syntax is almost the same as in the previous case. We have added one more identifier. If we wanted remove it again, we would use exactly the same command as before.

 

Example 3. Existing LDM development: Grain instead of a connection point

Snapshotting is a very common technique in data modeling. To be able to upload data incrementally with update ability it looks often like in the picture. The fact table Opportunity Snapshot contains connection point Opportunity Snapshot ID fulfilled by some MD5-concatenation of Opp. ID and Snapshot Date. When you need to update a particular opportunity record in the fact table for some specific date in history, you can identify it by these two values.

You can now move this logic to the GoodData platform: instead of Opportunity Snapshot ID this example uses two references: Snapshot Dateand Opp. ID. First you must drop the connection point. se MAQL DLL here as it is critical for the rest of the job:

DROP {label.opportunitysnapshot.opportunitysnapshotid};

SYNCHRONIZE {dataset.opportunitysnapshot} PRESERVE DATA;

 

This allows you to add grain definition to the Opportunity Snapshot fact table. Because other object identifiers are not affected, do not be confused that the attribute is not now called “Records of”, but Opportunity Snapshot ID with the attr.opportunitysnapshot.opportunitysnapshotid identifier. The remaining commands share the same pattern as the previous examples:

ALTER ATTRIBUTE {attr.opportunitysnapshot.opportunitysnapshotid} SET GRAIN {snapshot.date}, {attr.opportunity.oppid};

SYNCHRONIZE {dataset.opportunitysnapshot} PRESERVE DATA;

 

When the script is processed the fact table will be scanned for duplicities in the new compound primary key. As the connection point exists this is just a formal check. Once it finishes the process is complete.

  • No labels