Naming Convention for Source Files in Automated Data Distribution v2 for Object Storage Services

When you need to load data to a workspace with a specific logical data model (LDM), you have to map all datasets in the workspace LDM to the source files in your object storage service.

The source files must follow a specific naming convention at the dataset level (file names) and at the LDM field level (column names).

Dataset Level Mapping

Review the requirements for the names of the source files in GoodData-S3 Integration Details or GoodData-Azure Blob Storage Integration Details.

LDM Field Level Mapping

The following table shows the mapping of element_type to prefix_type:

prefix_type | element_type | Description
------------+--------------+--------------------------
a           | attr         | attribute
cp          | attr         | connection point (anchor)
f           | fact         | fact
d           | date         | date dimension
r           |              | reference
l           |              | label

Attributes, Connection Points, and Facts

If the identifier of an LDM field is the following:

<element_type>.<dataset_name>.<element_name>

then Automated Data Distribution (ADD) v2 for object storage services expects the following column name in the mapped source file:

<prefix_type>__<element_name>
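
The rule above can be sketched in Python as follows. This is only an illustration, not part of ADD v2: the function name field_column_name and the is_connection_point flag are hypothetical. The flag is needed because, per the table above, a connection point has the same element_type (attr) as a plain attribute, so the identifier alone does not reveal which prefix applies. The sketch covers the typical case only; see Conflict Resolution for identifiers whose dataset section differs from the mapped dataset.

```python
# Prefixes taken from the element_type/prefix_type table above.
PREFIX_BY_ELEMENT_TYPE = {"attr": "a", "fact": "f"}


def field_column_name(identifier: str, is_connection_point: bool = False) -> str:
    """Return the expected column name for an attribute, connection point, or fact.

    `identifier` has the form <element_type>.<dataset_name>.<element_name>.
    `is_connection_point` is a hypothetical flag supplied by the caller.
    """
    # Unpacking raises ValueError if the identifier does not have three sections.
    element_type, _dataset_name, element_name = identifier.split(".")
    if element_type not in PREFIX_BY_ELEMENT_TYPE:
        raise ValueError(f"unsupported element type: {element_type!r}")
    if is_connection_point:
        if element_type != "attr":
            raise ValueError("only attributes can be connection points")
        prefix = "cp"
    else:
        prefix = PREFIX_BY_ELEMENT_TYPE[element_type]
    return f"{prefix}__{element_name}"


print(field_column_name("fact.invoiceitem.price"))
print(field_column_name("attr.state.stateid", is_connection_point=True))
```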

Attribute Labels

If the identifier of an attribute label is the following:

label.<dataset_name>.<attribute_name>.<label_name>

then ADD v2 expects the following column name in the mapped source file:

l__<attribute_name>__<label_name>
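
As a rough illustration of the label rule, the following Python sketch derives the column name from a label identifier. The function name label_column_name is hypothetical and is not something ADD v2 provides.

```python
def label_column_name(identifier: str) -> str:
    """Return the expected column name for an attribute label.

    `identifier` has the form label.<dataset_name>.<attribute_name>.<label_name>.
    """
    parts = identifier.split(".")
    if len(parts) != 4 or parts[0] != "label":
        raise ValueError(f"not a label identifier: {identifier!r}")
    # The dataset name is not part of the column name.
    _, _dataset_name, attribute_name, label_name = parts
    return f"l__{attribute_name}__{label_name}"


print(label_column_name("label.state.stateid.abbrev"))
```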

References

If dataset.<dataset1_name> is a dataset referenced from dataset.<dataset2_name>, ADD v2 expects the following:

  • The source column for this reference exists in the source file mapped to dataset.<dataset2_name>.
  • The name of the source column is r__<dataset1_name>.
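
The two expectations above could be checked before a load with a small helper such as the one below. This is a sketch under assumptions: the function names are hypothetical, and the header is assumed to be already available as a list of column names read from the source file.

```python
def reference_column_name(referenced_dataset_id: str) -> str:
    """Turn dataset.<dataset1_name> into the column name r__<dataset1_name>."""
    prefix = "dataset."
    if not referenced_dataset_id.startswith(prefix):
        raise ValueError(f"not a dataset identifier: {referenced_dataset_id!r}")
    return "r__" + referenced_dataset_id[len(prefix):]


def missing_reference_columns(header: list[str], referenced_dataset_ids: list[str]) -> list[str]:
    """List the reference columns that are expected but absent from the header."""
    expected = [reference_column_name(d) for d in referenced_dataset_ids]
    return [column for column in expected if column not in header]


# Header of a hypothetical source file for a dataset that references
# dataset.product and dataset.invoice but lacks the r__invoice column.
header = ["f__quantity", "f__price", "r__product"]
print(missing_reference_columns(header, ["dataset.product", "dataset.invoice"]))
```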

Example

In the following table:

  • The columns represent particular datasets and the corresponding source files.
  • In a cell, the first line (if present) indicates the identifier of the object, and the second line indicates the corresponding column in the source file.

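dataset.state              | dataset.customer         | dataset.product        | dataset.invoice        | dataset.invoiceitem
State                      | Customer                 | Product                | Invoice                | InvoiceItem
---------------------------+--------------------------+------------------------+------------------------+--------------------------
attr.state.stateid         | attr.customer.customerid | attr.product.productid | attr.invoice.invoiceid | fact.invoiceitem.quantity
cp__stateid                | cp__customerid           | cp__productid          | cp__invoiceid          | f__quantity
---------------------------+--------------------------+------------------------+------------------------+--------------------------
label.state.stateid.abbrev |                          |                        |                        | fact.invoiceitem.price
l__stateid__abbrev         | r__state                 |                        | r__customer            | f__price
---------------------------+--------------------------+------------------------+------------------------+--------------------------
label.state.stateid.name   |                          |                        |                        |
l__stateid__name           |                          |                        | d__invoice             | r__product
---------------------------+--------------------------+------------------------+------------------------+--------------------------
attr.state.region          |                          |                        |                        |
a__region                  |                          |                        |                        | r__invoice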

Conflict Resolution

Typically, only the last section of an LDM element identifier is used to map the source files in your object storage service to the LDM datasets. This is true when the second section of the identifier matches the source file that it maps to. For example, the LDM fact fact.person.age in the dataset dataset.person becomes the column f__age in the corresponding source file.

However, if the second section of the identifier does not match the dataset that the source file maps to, the last two sections of the LDM element identifier become part of the column name. For example, the LDM fact fact.spouse.age in the dataset dataset.person becomes the column f__spouse__age.
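
The conflict resolution rule can be sketched in Python as shown below. The function name resolved_column_name and its parameters are hypothetical, and the sketch only handles three-section identifiers such as the facts in the examples above, because this section does not spell out how the rule applies to four-section label identifiers.

```python
def resolved_column_name(prefix: str, identifier: str, source_dataset: str) -> str:
    """Build a column name, adding the dataset section only on a mismatch.

    prefix:         the prefix_type, for example "f"
    identifier:     <element_type>.<dataset_name>.<element_name>
    source_dataset: name of the dataset the source file maps to, without "dataset."
    """
    _element_type, dataset_name, element_name = identifier.split(".")
    if dataset_name == source_dataset:
        # Typical case: only the last section of the identifier is used.
        return f"{prefix}__{element_name}"
    # Mismatch: the last two sections become part of the column name.
    return f"{prefix}__{dataset_name}__{element_name}"


print(resolved_column_name("f", "fact.person.age", "person"))
print(resolved_column_name("f", "fact.spouse.age", "person"))
```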

Special Columns in Source Files

In addition to the standard columns mapped to the LDM elements in the source files, there are the following optional columns that, when present, influence ADD v2 behavior: