Update Data in a Dataset from a CSV File

In a typical workflow, when your logical data model (LDM) is stable, you would automate data loading. To do so, set up data load processes (see Deploy a Process for Automated Data Distribution v2 or Deploy a Data Loading Process for a Data Pipeline Brick) and schedule them to load data regularly (see Scheduling a Process).

In some cases, however, you can benefit from updating data in a specific dataset from a CSV file. For example, you are building the LDM and you need to verify the LDM structure by creating insights.

Check the Field Mapping in the Dataset

Each field (fact or attribute) in the dataset must be unambiguously mapped to a column in the CSV file. During data load, the data from a column in the CSV file will be loaded to the corresponding fact or attribute in the dataset. For more information about the mapping, see Mapping between a Logical Data Model and the Data Source.

For example, here is how the Product Orders dataset is mapped to the product_orders.csv file.

order_id,order_status,product_id,price,quantity
10668-9VYN74-2,Cancelled,210,100.33,1.00
10697-7GBN87-1,Delivered,310,39.35,1.00
10907-0TPH53-3,Delivered,220,50.40,1.00
10778-7INQ32-1,Delivered,230,78.40,1.00
10240-3SBQ40-3,Delivered,160,21.49,1.00
10356-7ZBU60-1,Cancelled,140,21.49,1.00
10700-0ACT49-1,Returned,150,21.82,1.00
10175-0YRN35-3,Delivered,150,19.15,3.00
10152-6HOB25-1,Cancelled,140,29.67,1.00

Steps:

  1. On the top navigation bar, select Data. The LDM Modeler opens in view mode.

  2. Click Edit. The LDM Modeler is switched to edit mode.

  3. Locate the dataset where you want to update data, click More… , then click View details. The dataset details dialog opens.

  4. Switch to the Load configuration tab and check Source Column.

  5. Close the dataset details dialog. You are now going to download the CSV template.

Download the CSV Template

The CSV template is a CSV file with a single header row representing the names of the source columns that are expected in a CSV file from which you will be loading data to this dataset. The CSV template also defines the delimiter (a comma, by default).

Steps:

  1. Click Load on the left.
  2. Under Datasets Loads, locate the dataset where you want to update the data, click the menu button for this dataset, and click Download as a CSV Template. A CSV template file is generated.
  3. Follow the instructions in your browser to save the generated file.

Generate a CSV File with the New Data

Use the CSV template to generate a CSV file with the correct column names and fill it with the data to upload to the dataset.

The data from the CSV file will be uploaded to the dataset in full load mode (the existing data will be overwritten), therefore make sure that the CSV file contains all the data that you want to be in the dataset (not only an incremental update).

The order of the columns in the CSV file is not important.

The CSV file can have more columns than the fields in the dataset. Those extra columns in the CSV file would not have any fields in the dataset mapped and will be ignored at data load.

Load the Data from the CSV File to the Dataset

Once you have the CSV file ready, load the data from this file to the dataset.

Steps:

  1. Click Load on the left.
  2. Under Datasets Loads, locate the dataset where you want to update the data, and click Update from file. You are prompted to browse for the CSV file.
  3. Select the CSV file, and click Import. The loading process starts. When the loading completes, you see a message that the data has been uploaded. Close this message.

You can immediately start analyzing the data. To do so, click Analyze on the top navigation bar. You are redirected to Analytical Designer, where you can create insights from your data. For more information, see Create Insights.