Deploy a Data Loading Process for Automated Data Distribution v2

You deploy a process for Automated Data Distribution (ADD) v2 as part of setting up direct data distribution from data warehouses (for example, Snowflake or Redshift) and object storage services (for example, Amazon S3 or Microsoft Azure Blob).

If you have only one workspace to load data to, deploy the ADD v2 process to that workspace.
If you have multiple workspaces to load data to and the workspaces are organized into segments (see Set Up Automated Data Distribution v2 for Data Warehouses and Set Up Automated Data Distribution v2 for Object Storage Services), deploy the ADD v2 process to the service workspace.
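
The rule above can be read as a small decision function. The following Python sketch is purely illustrative: the function name, its arguments, and the use of plain strings as workspace IDs are assumptions made for this example and do not correspond to any GoodData API.

```python
from typing import Optional, Sequence


def choose_deployment_workspace(
    target_workspaces: Sequence[str],
    service_workspace: Optional[str] = None,
) -> str:
    """Return the ID of the workspace where the ADD v2 process should be deployed.

    Hypothetical helper: workspace IDs are plain strings, and the caller
    supplies the service workspace when the targets are organized into segments.
    """
    if not target_workspaces:
        raise ValueError("at least one target workspace is required")
    if len(target_workspaces) == 1:
        # Only one workspace to load data to: deploy directly to it.
        return target_workspaces[0]
    if service_workspace is None:
        raise ValueError("multiple workspaces require a service workspace")
    # Multiple workspaces organized into segments: deploy to the service workspace.
    return service_workspace
```

For example, `choose_deployment_workspace(["ws_a"])` returns `"ws_a"`, while passing two workspace IDs together with a service workspace ID returns the service workspace ID.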

For more information about distributing data from various data warehouses and object storage services, see Direct Data Distribution from Data Warehouses and Object Storage Services.

If you are using Automated Data Distribution (ADD) v1 (see Automated Data Distribution Reference), you do not have to deploy the process. The process is created automatically after you configure the Output Stage parameters for the workspace where you want to load data from Data Warehouse (see Use Automated Data Distribution). The process is created under the name “Automated Data Distribution”. You can create schedules for it in the Data Integration Console (see Schedule a Data Load).

Steps:

1. From the Data Integration Console (see Accessing Data Integration Console), click Workspaces.
2. Click the name of the workspace where you want to deploy an ADD v2 process.
3. Click Deploy Process. The deployment dialog opens.
4. From the Component dropdown, select Automated Data Distribution.
5. From the Data Source dropdown, select the Data Source that describes your data warehouse or object storage service and that you want to use within the ADD v2 process.
   When you later schedule the process, you can use references to the parameters from the added Data Sources instead of entering the explicit values. For more information, see Reuse Parameters in Multiple Data Loading Processes.
6. Specify the workspaces to load the data to:
   - If you want to load the data to the current workspace, select Current Workspace. Optionally, if you want to load only the data that is related to a specific customer, enter that customer's client ID.
     If you have only one workspace to load data to, Current Workspace is the only option available and it is pre-selected.
   - If you have multiple workspaces to load data to and the workspaces are organized into segments (see Set Up Automated Data Distribution v2 for Data Warehouses and Set Up Automated Data Distribution v2 for Object Storage Services), select Segment (LCM) and then select the segment containing the workspaces to distribute the data to.
7. Enter the name of the process. The alias is generated automatically from the name. You can update it if needed.
   The alias is a reference to the process, unique within the workspace. The alias is used when exporting and importing the data pipeline (see Export and Import the Data Pipeline).
8. Click Deploy. The process is deployed.

You can now schedule the deployed data loading process (see Schedule a Data Load).
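
To keep track of what was chosen in the deployment dialog (for example, in a runbook or a review checklist), the selections can be captured in a small record. The Python sketch below is an illustration only: the class `AddV2Deployment`, its field names, and its validation rules are hypothetical, mirror the dialog fields described in the steps, and are not part of any GoodData API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddV2Deployment:
    """Hypothetical record of the choices made in the Deploy Process dialog."""

    data_source: str                 # Data Source selected in the dialog
    process_name: str                # name entered for the process
    alias: str                       # reference to the process, unique within the workspace
    segment: Optional[str] = None    # set only when Segment (LCM) is selected
    client_id: Optional[str] = None  # optional, only with Current Workspace
    component: str = "Automated Data Distribution"

    @property
    def target(self) -> str:
        """Name of the dialog option that this record corresponds to."""
        return "Segment (LCM)" if self.segment is not None else "Current Workspace"

    def validate(self) -> None:
        """Raise ValueError if the recorded choices are inconsistent."""
        if not self.data_source.strip():
            raise ValueError("a Data Source must be selected")
        if not self.process_name.strip():
            raise ValueError("process name is required")
        if not self.alias.strip():
            raise ValueError("alias is required")
        if self.segment is not None and self.client_id is not None:
            # Assumption of this sketch: the client ID is entered only
            # under the Current Workspace option, never with a segment.
            raise ValueError("client ID applies only to the Current Workspace option")
```

A record such as `AddV2Deployment(data_source="my_source", process_name="Daily load", alias="daily_load")` passes `validate()` and reports `Current Workspace` as its target; the values shown are placeholders.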
