GoodData-Azure Blob Storage Integration Details

When setting up direct distribution of data from source files in your Azure Blob Storage service, pay attention to the considerations and best practices listed in this article.

Authentication Methods

The following authentication methods are supported:

  • Shared access signature (access is limited by a time interval and a set of permissions)
  • Access key

Minimum Requirements for Shared Access Signature

To use a shared access signature as the authentication method, make sure that it meets the following minimum requirements:

  • For the allowed resource types, Container and Object are selected.
  • For the allowed permissions, Read and List are selected.
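
As a rough pre-flight check, the sketch below parses the query string of a SAS token and confirms the fields that correspond to the requirements above. It assumes an account-level SAS, where the `srt` field carries the allowed resource types (`c` for Container, `o` for Object) and the `sp` field carries the permissions (`r` for Read, `l` for List). The function name `sas_meets_minimum` and the sample token are made up for illustration.

```python
from urllib.parse import parse_qs


def sas_meets_minimum(sas_token: str) -> bool:
    """Return True if the token allows Container + Object and Read + List."""
    fields = parse_qs(sas_token.lstrip("?"))
    resource_types = fields.get("srt", [""])[0]
    permissions = fields.get("sp", [""])[0]
    # Each required letter must appear in the corresponding field.
    return {"c", "o"} <= set(resource_types) and {"r", "l"} <= set(permissions)


# Hypothetical token; the signature value is a placeholder, not a real one.
sample = "?sv=2020-08-04&ss=b&srt=co&sp=rl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
print(sas_meets_minimum(sample))
```

The check only inspects what the token claims to allow. It does not validate the signature or the expiry time.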

Source Files

The source files are any of the following file types:

  • Raw CSV files and compressed CSV files (.zip or .gz)
  • Raw Parquet files or compressed Parquet files (.parquet or .parquet.gz)

Prepare one source file per dataset in your logical data model (LDM). When mapping your LDM objects, you cannot map more than one field to the same source column.

File Name

Compose the source file name from the following sections:

  • {dataset_name} is the name of the dataset in your LDM to which the data from this file will be loaded.

  • {timestamp} is a timestamp in the yyyyMMddHHmmss format specifying when the data was exported from the source database. The timestamp lets Automated Data Distribution (ADD) v2 (see Automated Data Distribution v2 for Object Storage Services) detect which file should be uploaded to the dataset. The timestamp must increase with each subsequent file. If a source file with timestamp T1 was loaded to the dataset, a new source file for this dataset must have a timestamp greater than T1.

  • {load_mode} is the mode of loading data from the source file to the dataset.

    • To set full mode for a source file, specify full.
    • To set incremental mode for a source file, specify inc.
  • {part} is a suffix in the partX format where X is a positive integer (1, 2, 3, …) that is used when you split a large source file (see Size) into a few smaller files. Add the {part} section to each of those smaller files to indicate what part of the large file it is: part1, part2, part3, and so on. Multiple source files for the same dataset, with the same timestamp and the {part} section, will all be loaded and concatenated.

  • A compressed file (.zip or .gz) must contain one CSV or Parquet file with the same name. For example, the compressed file must contain invoiceItems_20200529090000_inc.csv.

  • A compressed Parquet file must have the .parquet.gz extension.
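
The sections described above can be assembled programmatically. The following is a minimal sketch, assuming that the sections are joined by underscores and that the optional {part} section comes right after {load_mode}; both are assumptions based on the example file names in this article, and `build_file_name` is a hypothetical helper rather than part of any GoodData tooling.

```python
from datetime import datetime
from typing import Optional


def build_file_name(dataset: str, exported_at: datetime, load_mode: str,
                    extension: str, part: Optional[int] = None) -> str:
    """Build a source file name from its sections."""
    if load_mode not in ("full", "inc"):
        raise ValueError("load_mode must be 'full' or 'inc'")
    # yyyyMMddHHmmss corresponds to %Y%m%d%H%M%S in Python's strftime.
    sections = [dataset, exported_at.strftime("%Y%m%d%H%M%S"), load_mode]
    if part is not None:
        if part < 1:
            raise ValueError("part must be a positive integer")
        sections.append(f"part{part}")  # assumed position of the {part} section
    return "_".join(sections) + "." + extension


exported = datetime(2020, 5, 29, 9, 0, 0)
print(build_file_name("products", exported, "full", "csv.gz"))
print(build_file_name("products", exported, "full", "csv", part=2))
```

Reusing one `exported_at` value for every part of a split file keeps the timestamp identical across the parts, which is what allows them to be loaded together.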

The following are examples of file names:

  • products_20200529090000_full.gz
  • products_20200529090000_full.parquet
  • products_20200529090000_full.parquet.gz
  • products_20200529090000_full.csv.gz
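
To check file names before uploading, a regular expression along the following lines may help. This is a sketch under assumptions: the pattern is inferred from the examples above, the list of accepted extensions is a guess that covers those examples, and the names `parse_file_name` and `is_newer` are hypothetical.

```python
import re
from typing import Optional

FILE_NAME = re.compile(
    r"^(?P<dataset>.+)_(?P<timestamp>\d{14})_(?P<load_mode>full|inc)"
    r"(?:_part(?P<part>[1-9]\d*))?"  # optional partX suffix, X = 1, 2, 3, ...
    r"\.(?P<extension>csv|csv\.gz|parquet|parquet\.gz|gz|zip)$"
)


def parse_file_name(name: str) -> Optional[dict]:
    """Split a source file name into its sections, or return None."""
    match = FILE_NAME.match(name)
    return match.groupdict() if match else None


def is_newer(name: str, last_loaded_timestamp: str) -> bool:
    """Return True if the file's timestamp is greater than the last loaded one."""
    parsed = parse_file_name(name)
    # Fixed-width digit strings sort the same way as the times they encode.
    return parsed is not None and parsed["timestamp"] > last_loaded_timestamp


print(parse_file_name("products_20200529090000_full.parquet.gz"))
print(is_newer("products_20200530090000_inc.csv", "20200529090000"))
```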

File Structure

The structure of the source files must correspond to your LDM. Every time you change the LDM, reflect the changes in the source files.


Size

  • Keep the size of the source files:

    • Under 50 GB for a raw CSV file
    • Under 10 GB for a Parquet file
    • Under 10 GB for compressed files (.zip or .gz)
  • Split larger files into smaller files.

  • Keep the total size of the files in one load:

    • Under 1 TB for raw CSV or Parquet files
    • Under 200 GB for compressed files (.zip or .gz)
  • Divide larger amounts of data into multiple loads. For example, place a portion of the source files with timestamp 1 in the Azure Blob Storage, and run the ADD v2 process to load these files. Then, place the remaining source files with timestamp 2, and run the ADD v2 process again.
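
The limits above can be checked before a load is started. The sketch below is illustrative only: it assumes decimal units (1 GB = 10^9 bytes, 1 TB = 1000 GB), classifies files by extension, and applies the per-load limits separately to raw and compressed files. The article does not specify any of these details, and `check_load` is a hypothetical helper.

```python
from typing import Dict, List

GB = 10**9  # assumption: decimal gigabytes

PER_FILE_LIMIT = {"csv": 50 * GB, "parquet": 10 * GB, "compressed": 10 * GB}
PER_LOAD_LIMIT = {"raw": 1000 * GB, "compressed": 200 * GB}


def file_kind(name: str) -> str:
    """Classify a file as 'compressed', 'parquet', or 'csv' by its extension."""
    lower = name.lower()
    if lower.endswith((".zip", ".gz")):
        return "compressed"
    if lower.endswith(".parquet"):
        return "parquet"
    return "csv"


def check_load(files: Dict[str, int]) -> List[str]:
    """Return a list of problems for one load, given file names and byte sizes."""
    problems = []
    totals = {"raw": 0, "compressed": 0}
    for name, size in files.items():
        kind = file_kind(name)
        if size >= PER_FILE_LIMIT[kind]:
            problems.append(f"{name}: split this file into smaller files")
        totals["compressed" if kind == "compressed" else "raw"] += size
    for group, total in totals.items():
        if total >= PER_LOAD_LIMIT[group]:
            problems.append(f"{group} files: divide the data into multiple loads")
    return problems


print(check_load({"products_20200529090000_full.csv": 60 * GB}))
```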


Depending on the amount of data and the load frequency, you can organize your source files in the following ways:

  • Place the source files directly in the folder specified by the path in the Data Source (see Create a Data Source).

  • Organize the source files into sub-folders per date. The timestamps in the source files in a sub-folder must correspond to the date used in the name of this sub-folder.

    |--- 20200529
    | |--
    | |--
    | |--
    |--- 20200530
    | |--
    | |--
    | |--
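
For the per-date layout, the sub-folder name can be derived from the timestamp in the file name, which keeps the two consistent. The following is a minimal sketch, assuming that sub-folders are named with the yyyyMMdd part of the timestamp, as in the tree above; `blob_path` and the `exports` base folder are hypothetical.

```python
import re


def blob_path(file_name: str, base_folder: str = "") -> str:
    """Return '<base_folder>/<yyyyMMdd>/<file_name>' for a source file."""
    # The first 8 digits of the 14-digit timestamp are the date.
    match = re.search(r"_(\d{8})\d{6}_", file_name)
    if match is None:
        raise ValueError(f"no yyyyMMddHHmmss timestamp found in {file_name!r}")
    parts = [base_folder.strip("/"), match.group(1), file_name]
    return "/".join(part for part in parts if part)


print(blob_path("products_20200529090000_full.csv", "exports"))
print(blob_path("products_20200530090000_inc.csv"))
```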