Load Your Own Redshift Data into GoodData Workspace

In this tutorial, you will learn how to load your own data from a Redshift cluster into your GoodData workspace. This tutorial expands on the Loading Sample Data from Redshift into GoodData guide.

To load data into your workspace, you will perform the following tasks:

  1. Open a new workspace
  2. Create a data load
  3. Set up your data source
  4. Create an output stage
  5. Create a logical data model (LDM)
  6. Start the load

To complete the process, you will execute several API calls using our gray pages, our UI for APIs. You will be asked to provide details such as your IDs for both Redshift and GoodData.

Log in to your Redshift cluster with the account that you plan to use with GoodData. Ensure that your Redshift cluster can be accessed by GoodData. For more information about the required privileges, see GoodData-Redshift Integration Details.


Open a New GoodData Workspace

Ensure that you are logged into your GoodData account.

This tutorial presumes that your GoodData domain is secure.gooddata.com.

  • GoodData Free users
    use the link you received in your GoodData Free confirmation email, for example: https://free123ab.na.gooddata.com.
  • GoodData Growth users
    use the link you received in your GoodData confirmation email, for example: https://yourcompanyname.na.gooddata.com.
  • White-label customers
    use your own white-label domain.

For the purpose of this tutorial, you will work with a new workspace (also known as a project).

To create a workspace:

 GoodData Free users:

Your GoodData account comes with five data-ready workspaces. For this tutorial, select any empty workspace.

Once you select a workspace to work with, you can easily rename it in the Manage section.

 GoodData Growth users:

Your GoodData Growth account allows you to create ten or more workspaces (projects). To create a workspace, you must have the authorization token. For more details, see Find the Project Authorization Token.


  1. Click the Add workspace button on your welcome screen.

  2. Enter the name of your workspace and your authorization token.

  3. Click Create.

    Your workspace opens.

To return to your welcome screen and create another workspace, click your name in the upper right corner, click Account, then click Active Project.

You can easily rename the workspace (project) in the Manage section.

 GoodData Enterprise users:

To create a new project, also called workspace, you must have the authorization token. For more details, see Find the Project Authorization Token.

If you do not have an authorization token, contact GoodData Support.

This procedure assumes that your domain is https://secure.gooddata.com.

If you are a white-labeled customer, replace secure.gooddata.com with your white-labeled domain in the procedure steps when needed. GoodData Free and Growth users, use the domain that you received in your introduction email, such as https://free123ab.na.gooddata.com.


  1. Go to https://secure.gooddata.com/gdc/projects.
    The page for creating a project opens.
  2. In the Title field, enter the name for the new project.

  3. In the Authorization Token field, enter your authorization token.

    Do not enter any information for the summary.

    Leave the other project settings at their defaults.

  4. Click Submit.
    The project/workspace is created and the page with the project's URL opens.
    The project/workspace is immediately available on the GoodData Portal.

Create a Data Loading Process

In this step, you will create a data loading process that moves data from your Redshift cluster into your GoodData workspace. This process is called Automated Data Distribution (ADD), and it can be deployed to multiple GoodData workspaces.

Ensure that you have the following information ready:

  • GoodData workspace's Project ID (Find the Project ID)
  • Redshift username and password
  • Redshift database and schema


  1. Go to the Data Integration Console and click the Projects tab.
  2. Click the workspace that you want to use.
    The workspace's page opens.

  3. Click Deploy Process.
    The Deploy process to a project screen appears.
  4. Click Create Data Source.
    The New Data Source screen appears.
  5. Fill in the following fields:
    • Provider: select Redshift
    • Connection URL: enter your Redshift connection string
    • Username and Password: enter your Redshift credentials
    • Database and Schema: enter your Redshift database and schema
  6. When your connection is successfully validated, fill in the remaining two fields:
    • Output Stage Prefix: enter any prefix you like
    • Data Source Name: enter any name you like
  7. Click Create.
    The Deploy process to a project screen reappears.
  8. Enter your process name.
  9. Click Deploy.
    When the deployment finishes, a confirmation message briefly flashes, then the process detail screen appears.
  10. Click the Metadata tab and make a note of the Data Source ID. You will need it in the next steps.

Ensure that the user you configured in the data source has all necessary privileges. To learn more, see GoodData-Redshift Integration Details.
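The Connection URL field takes a JDBC-style Redshift connection string. As a minimal sketch (the host and database names below are placeholders, and you should confirm the exact format GoodData expects against its documentation), it can be assembled like this:

```python
# A sketch of assembling the value for the Connection URL field. The
# jdbc:redshift://host:port/database shape is the standard Redshift JDBC
# format (5439 is Redshift's default port); the host and database here are
# placeholders, not values from this tutorial.

def redshift_connection_url(host: str, database: str, port: int = 5439) -> str:
    """Build a JDBC-style Redshift connection URL."""
    return f"jdbc:redshift://{host}:{port}/{database}"

print(redshift_connection_url(
    "examplecluster.abc123xyz789.us-east-1.redshift.amazonaws.com", "dev"))
# → jdbc:redshift://examplecluster.abc123xyz789.us-east-1.redshift.amazonaws.com:5439/dev
```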

Create an Output Stage

The Output Stage (OS) is a set of views created in your Redshift schema that serve as a source for loading data into your GoodData workspace.

To help you create an OS, we provide an API that inspects all tables within the schema defined in your data source and suggests an SQL DDL script that creates the views. You then execute this script against your schema.

Ensure that you have the following information ready:

  • Data Source ID

     If you did not note down the Data Source ID in the previous section...
    Follow these steps:
    1. Click your name in the top right corner and select Data Integration Console.
    2. Click the Projects tab.
    3. Click the relevant Project Name.
      The list of your processes appears.
    4. Click the process you are using and select the Metadata tab.

      Find the Data Source ID at the bottom of the window.

To create the output stage:

  1. Go to the page for generating the Output Stage:


    where {Data_Source_id} is the identifier of the Data Source that you created in the previous step (Create a Data Loading Process).

  2. Click Generate output stage.

    A link for polling the status of the Output Stage generation appears.
  3. Poll (click the link repeatedly) for the status until the page returns SQL DDLs to create GoodData-compatible views.
    Here is an example of the generated SQL DDL:

    -- out_orders --
    CREATE VIEW "out_orders" AS SELECT
            "id" AS "cp__id",
            "customerid" AS "r__customers",
            "description" AS "a__description",
            "price" AS "f__price",
            "modified_at" AS "d__modified_at"
            , x__timestamp AS x__timestamp /* Remove this line if it's not incremental load */
            , x__client_id AS x__client_id /* Remove this line if loading into all client workspaces */
            , x__deleted AS x__deleted /* Remove this line if inserting only */
    FROM "orders";
  4. Copy the returned SQL DDLs and paste them into your Redshift SQL client.
  5. Review the suggested SQL DDLs and modify them, if needed.
    Note: For the purpose of this tutorial, you can delete all three of the , x__ lines.

    Review the table at the end of this section to ensure the columns in your output stage comply with GoodData conventions.

  6. Execute the SQL DDLs.
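The "poll (click the link repeatedly)" step above, which also recurs when generating the logical data model later, follows one pattern: fetch the status until the task is no longer running. Here is a sketch of that pattern only; fetch_status stands in for an authenticated GET of the polling link, and the "RUNNING"/"OK" values are illustrative, not the actual GoodData response shape.

```python
# Sketch of the poll-until-done pattern used by the gray pages. The real
# polling link and its response format come from the GoodData API; here
# they are replaced by a stubbed callable so the logic is self-contained.
import time

def poll_until_done(fetch_status, interval=2.0, max_attempts=30):
    """Call fetch_status until it returns something other than "RUNNING"."""
    for _ in range(max_attempts):
        result = fetch_status()
        if result != "RUNNING":
            return result
        time.sleep(interval)
    raise TimeoutError("task did not finish within the polling budget")

# Stubbed example: two in-progress responses, then the finished status.
responses = iter(["RUNNING", "RUNNING", "OK"])
print(poll_until_done(lambda: next(responses), interval=0.0))  # → OK
```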

The views must comply with the following naming conventions of the GoodData Output Stage:

Element type                    Column prefix   Column name format            Example
Primary key (connection point)  cp__            cp__{attribute_name}          cp__id
Foreign key (reference)         r__             r__{referenced_table_name}    r__customers
Fact                            f__             f__{fact_name}                f__price
Attribute                       a__             a__{attribute_name}           a__description
Date                            d__             d__{date_dimension_name}      d__modified_at
If you want to use a shared date dimension for more tables/views, use the same {date_dimension_name}.
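The naming conventions above can be captured in a small helper that builds the output-stage alias from a source column and its role in the model. The prefixes come from the table; the role names used as dictionary keys are this sketch's own labels, not GoodData terminology.

```python
# Map each element type to its output-stage column prefix (from the table
# above). The role keys ("primary_key", "reference", ...) are labels chosen
# for this sketch only.
PREFIXES = {
    "primary_key": "cp__",   # connection point
    "reference": "r__",      # name is the referenced table, not the column
    "fact": "f__",
    "attribute": "a__",
    "date": "d__",
}

def output_stage_column(role: str, name: str) -> str:
    """Return the aliased output-stage column name, e.g. f__price."""
    return PREFIXES[role] + name

print(output_stage_column("fact", "price"))           # → f__price
print(output_stage_column("reference", "customers"))  # → r__customers
```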

Create a Logical Data Model from the Output Stage

Before you load data into your workspace, you need a logical data model (LDM) to determine how the data are handled and displayed.

The LDM enables a layer of abstraction between the information that a GoodData user accesses and the method that is used to store data. We make the process of creating a workspace LDM easy by reading the output stage (we recognize the type of LDM element based on Output Stage view and column names) and generating the model in a JSON format, so you can keep track of changes that are being made using your favorite version control system.

Then, you load the model and apply it to your workspace(s). In this guide, the process of creating an LDM is split into several steps, which brings better flexibility and greater opportunities to customize the model at any step.

Ensure that you have the following information ready:

  • Project ID
  • Data Source ID


  1. Go to the page for generating the logical data model:


    where {Data_Source_id} is the identifier of the Data Source.
    A link for polling the status of the LDM generation appears.

  2. Poll (click the link repeatedly) for the status until the gray page returns the JSON structure describing the logical data model.
    If warnings appear, review them and modify your output stage SQL accordingly.
    Here is an example of the JSON structure:

      "projectModelView": {
            "projectModel": {
              "datasets": [ {
                "dataset": {
                  "identifier": "dataset.orders",
                  "title": "orders",
                  "anchor": {
                    "attribute": {
                      "identifier": "attr.orders.id",
                      "title": "id",
                      "folder": "orders",
                      "labels": [ {
                        "label": {
                          "identifier": "label.orders.id",
                          "title": "id",
                          "type": "GDC.text",
                          "dataType": "VARCHAR (128)"
                      } ]
                  "attributes": [ {
                    "attribute": {
                      "identifier": "attr.orders.description",
                      "title": "description",
                      "folder": "orders",
                      "labels": [ {
                        "label": {
                          "identifier": "label.orders.description",
                          "title": "description",
                          "type": "GDC.text",
                          "dataType": "VARCHAR (128)"
                      } ]
                  } ],
                  "facts": [ {
                    "fact": {
                      "identifier" : "fact.orders.price",
                      "title": "price",
                      "folder": "orders",
                      "dataType": "DECIMAL (12, 2)"
                  } ],
                  "production": true
  3. Scroll down to the bottom of the page with the JSON until the publish form appears.

  4. Enter the project ID of your workspace and select the Preserve data option.

  5. Click Publish.
    The polling screen appears.

  6. When the process finishes successfully, the JSON structure appears. The logical data model is created within your workspace.
    If an error screen appears, review the errors and modify the output stage SQL. Then run the process again.
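Because the model is plain JSON, you can also inspect or diff it programmatically, which is what makes version control practical. This sketch parses a trimmed-down version of the example response above and lists the dataset and fact identifiers; the field names mirror the sample, so treat the structure as illustrative.

```python
# Walk a projectModelView document (shape trimmed from the example above)
# and print each dataset identifier with its facts.
import json

model_json = """
{
  "projectModelView": {
    "projectModel": {
      "datasets": [
        { "dataset": {
            "identifier": "dataset.orders",
            "facts": [ { "fact": { "identifier": "fact.orders.price" } } ],
            "attributes": [ { "attribute": { "identifier": "attr.orders.description" } } ]
        } }
      ]
    }
  }
}
"""

model = json.loads(model_json)
for wrapper in model["projectModelView"]["projectModel"]["datasets"]:
    ds = wrapper["dataset"]
    print(ds["identifier"])                      # → dataset.orders
    for f in ds.get("facts", []):
        print("  fact:", f["fact"]["identifier"])  # → fact: fact.orders.price
```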

You can review the logical data model in the Manage section.


  1. Go to your workspace.
  2. Click Manage in the top navigation tab.
  3. On the Data tab, click Model.
    The LDM outline appears.

    Note: Depending on the complexity and make up of your data, the actual LDM diagram will be different.

Create and Run a New Schedule

To ensure your GoodData analytics always uses the most up-to-date data, you can create a schedule to automate data loads between your Redshift cluster and your GoodData workspace. For the purpose of this Getting Started tutorial, you will create a manual schedule.

  1. Go to the Data Integration Console and click the Projects tab.
  2. Select the workspace that you used in the previous step.
  3. Click New schedule.
    The new schedule screen appears.
  4. Select the process name.
  5. In the Runs dropdown, set the frequency of execution to manually.
  6. Leave everything else intact.
  7. Click Schedule.
    The schedule is saved and opens for preview.
    You are now going to manually run the scheduled process.
  8. Click Run.
  9. Confirm Run.
    The schedule is queued for execution and is run as platform resources are available.
    The process may take some time to complete.
  10. If the schedule fails with errors, fix the errors, and run the schedule again. Repeat until the process finishes with a status of OK, which means that the ADD process has loaded the data to your workspace.
  11. (Optional) In the Runs dropdown, set the frequency of execution to whatever schedule fits your business needs. Click Save.
    The schedule is saved.

Next Steps

In this tutorial, you successfully:

  • Set up the connection between your Redshift cluster and your GoodData workspace
  • Created and scheduled a data load (albeit in manual mode)
  • Created a logical data model

Now that your data is in your GoodData workspace, open Analytical Designer and start analyzing.

For a comprehensive advanced guide on data warehouse integration options, review the Integrate Data Warehouses Directly to GoodData based on Your DW Schema section.