
Naming Convention for Source Files in Automated Data Distribution v2 for Object Storage Services

When you need to load data to a workspace with a specific logical data model (LDM), you have to map all datasets in the workspace LDM to the source files in your object storage service.

The source files must follow a specific naming convention at two levels: the file names at the dataset level, and the column names within each file at the LDM field level.

Dataset Level Mapping

Review the requirements for the names of the source files in GoodData-S3 Integration Details or GoodData-Azure Blob Storage Integration Details.

LDM Field Level Mapping

The following table shows the mapping of element_type to prefix_type:

prefix_type	element_type	Description
a	attr	attribute
cp	attr	connection point (anchor)
f	fact	fact
d	date	date dimension
r		reference
l		label

Attributes, Connection Points, and Facts

If the identifier of an LDM field is the following:

<element_type>.<dataset_name>.<element_name>

then Automated Data Distribution (ADD) v2 for object storage services expects the following column name in the mapped source file:

<prefix_type>__<element_name>
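
The rule above can be sketched in a few lines of Python. This is only an illustration, not part of ADD v2: the function name `field_column_name` and the `is_connection_point` flag are hypothetical. The flag is needed because attributes and connection points share the `attr` element type, so the identifier alone does not say whether the prefix is `a` or `cp`.

```python
# Prefixes taken from the mapping table above.
PREFIX_BY_ELEMENT_TYPE = {"attr": "a", "fact": "f"}


def field_column_name(identifier: str, is_connection_point: bool = False) -> str:
    """Return the expected source column for an attribute, connection point, or fact.

    Hypothetical helper: expects <element_type>.<dataset_name>.<element_name>.
    """
    parts = identifier.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated sections: {identifier!r}")
    element_type, _dataset_name, element_name = parts
    if element_type not in PREFIX_BY_ELEMENT_TYPE:
        raise ValueError(f"unsupported element type: {element_type!r}")
    if is_connection_point and element_type != "attr":
        raise ValueError("only 'attr' identifiers can be connection points")
    # A connection point is an attribute that uses the 'cp' prefix instead of 'a'.
    prefix = "cp" if is_connection_point else PREFIX_BY_ELEMENT_TYPE[element_type]
    return f"{prefix}__{element_name}"


print(field_column_name("attr.state.region"))
print(field_column_name("attr.state.stateid", is_connection_point=True))
print(field_column_name("fact.invoiceitem.price"))
```

The three calls use identifiers from the example table in this article and yield the column names listed there.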

Attribute Labels

If the identifier of an attribute label is the following:

label.<dataset_name>.<attribute_name>.<label_name>

then ADD v2 expects the following column name in the mapped source file:

l__<attribute_name>__<label_name>
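
As a minimal sketch of the label rule, the following Python helper (the name `label_column_name` is hypothetical) splits the identifier into its four sections and drops the dataset name:

```python
def label_column_name(identifier: str) -> str:
    """Return the expected source column for an attribute label.

    Hypothetical helper: expects label.<dataset_name>.<attribute_name>.<label_name>.
    """
    parts = identifier.split(".")
    if len(parts) != 4 or parts[0] != "label":
        raise ValueError(f"not a label identifier: {identifier!r}")
    _element_type, _dataset_name, attribute_name, label_name = parts
    # The dataset name is not part of the column name.
    return f"l__{attribute_name}__{label_name}"


print(label_column_name("label.state.stateid.abbrev"))
```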

References

If dataset.<dataset1_name> is a dataset referenced from dataset.<dataset2_name>, ADD v2 expects the following:

The source column for this reference exists in the corresponding source file.
The name of the source column is r__<dataset1_name>.
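
The two expectations above can be checked before a load with a short script. The following Python sketch is an illustration only; both function names are hypothetical, and the header is assumed to be available as a plain list of column names.

```python
def reference_column_name(dataset_identifier: str) -> str:
    """Return the expected reference column for dataset.<dataset_name>."""
    prefix = "dataset."
    if not dataset_identifier.startswith(prefix) or dataset_identifier == prefix:
        raise ValueError(f"not a dataset identifier: {dataset_identifier!r}")
    return "r__" + dataset_identifier[len(prefix):]


def missing_reference_columns(header: list[str], referenced_datasets: list[str]) -> list[str]:
    """List the reference columns that the source file header does not contain."""
    expected = [reference_column_name(d) for d in referenced_datasets]
    return [column for column in expected if column not in header]


# Header of a hypothetical InvoiceItem source file that lacks r__invoice.
header = ["f__quantity", "f__price", "r__product"]
print(missing_reference_columns(header, ["dataset.product", "dataset.invoice"]))
```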

Example

In the following table:

The columns represent particular datasets and the corresponding source files.
In each cell, the identifier of the object (if present) comes first, followed by the corresponding column in the source file.

dataset.state State	dataset.customer Customer	dataset.product Product	dataset.invoice Invoice	dataset.invoiceitem InvoiceItem
attr.state.stateid cp__stateid	attr.customer.customerid cp__customerid	attr.product.productid cp__productid	attr.invoice.invoiceid cp__invoiceid	fact.invoiceitem.quantity f__quantity
label.state.stateid.abbrev l__stateid__abbrev	r__state		r__customer	fact.invoiceitem.price f__price
label.state.stateid.name l__stateid__name			d__invoice	r__product
attr.state.region a__region				r__invoice

Conflict Resolution

Typically, only the last section of an LDM element identifier is used to map the source files in your object storage service to the LDM datasets. This applies when the second section of the identifier matches the name of the dataset (and therefore the source file) that the element maps to. For example, the LDM fact fact.person.age in the dataset dataset.person becomes the column f__age in the corresponding source file.

However, if the second section of the identifier does not match the name of the dataset, the last two sections of the LDM element identifier become part of the column name. For example, the LDM fact fact.spouse.age in the dataset dataset.person becomes the column f__spouse__age.
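
The conflict rule can be sketched in Python for the fact case described above. The helper name `fact_column_name` is hypothetical, and the sketch covers facts only, because that is the only element type the examples here show.

```python
def fact_column_name(identifier: str, dataset_name: str) -> str:
    """Return the expected column for a fact stored in the given dataset.

    Hypothetical helper: `identifier` is fact.<section>.<element_name>, and
    `dataset_name` is the name part of dataset.<dataset_name>.
    """
    parts = identifier.split(".")
    if len(parts) != 3 or parts[0] != "fact":
        raise ValueError(f"not a fact identifier: {identifier!r}")
    _element_type, second_section, element_name = parts
    if second_section == dataset_name:
        # Usual case: only the last section is used.
        return f"f__{element_name}"
    # Conflict case: the last two sections are used.
    return f"f__{second_section}__{element_name}"


print(fact_column_name("fact.person.age", "person"))
print(fact_column_name("fact.spouse.age", "person"))
```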

Special Columns in Source Files

In addition to the standard columns mapped to LDM elements, source files can contain the following optional columns, which influence ADD v2 behavior when present:

The x__client_id column enables data distribution from a single source file into multiple workspaces based on the values in this column. When data is loaded to a particular workspace, only the records with the value in the x__client_id column equal to the workspace client ID are loaded into the corresponding dataset in the workspace. For more information about the client ID, see Automated Data Distribution v2 for Object Storage Services and Set Up Automated Data Distribution v2 for Object Storage Services.
The x__deleted column enables the data deletion functionality on a single file (see Load Modes in Automated Data Distribution v2 for Object Storage Services).
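
To preview which records a given workspace would receive, the x__client_id filtering described above can be approximated locally. The following Python sketch is not how ADD v2 is implemented; the function name `rows_for_client`, the sample data, and the client IDs are made up for illustration. It also assumes that a file without the x__client_id column is not filtered at all.

```python
import csv
import io


def rows_for_client(csv_text: str, client_id: str) -> list[dict]:
    """Return the records whose x__client_id equals the workspace client ID."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if "x__client_id" not in (reader.fieldnames or []):
        # Assumption: without the optional column, every record is kept.
        return list(reader)
    return [row for row in reader if row["x__client_id"] == client_id]


# Made-up Customer source file shared by two workspaces.
sample = (
    "cp__customerid,r__state,x__client_id\n"
    "1,CA,client_a\n"
    "2,NY,client_b\n"
    "3,CA,client_a\n"
)
for row in rows_for_client(sample, "client_a"):
    print(row["cp__customerid"], row["r__state"])
```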
