Integrate a Data Source for a Data Warehouse

This article explains how to connect a Data Source to your GoodData workspace when your source data is stored in a cloud data warehouse.

You can directly connect Data Sources for the following data warehouses:

  • Amazon Redshift
  • Google BigQuery
  • Microsoft Azure SQL Database
  • Microsoft Azure Synapse Analytics
  • Microsoft SQL Server
  • MongoDB Connector for BI
  • MySQL
  • PostgreSQL
  • Snowflake

Each data warehouse has different integration requirements. Before you connect a data warehouse to your workspace, ensure that GoodData can communicate with the Data Source created for your data warehouse.

Prerequisites

Before you start, make sure that you have the following in place:

  • An active GoodData account that you are logged in to

  • A GoodData workspace. If you do not know how to create one, see Create Workspaces.

  • Access to a supported data warehouse with source data

Tasks You Will Perform

In this article, you will complete the following tasks:

  • Create a Data Source.
  • Connect the Data Source with your GoodData workspace.
  • Create a logical data model (LDM).

Create a Data Source

A Data Source is a place in your GoodData workspace that stores the connection details for your data warehouse.

Select your source data warehouse and learn what details you need to establish a connection between your GoodData workspace and your data warehouse.

Azure SQL Database

Log in to your Azure SQL Database with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your Azure SQL Database can be accessed by GoodData. For more information about the required privileges, see GoodData-Azure SQL Database Integration Details.

Ensure that you have the following information ready:

  • Azure SQL Database username and password
  • Azure SQL Database database and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
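
If you want to rule out credential or network issues before creating the Data Source, you can run a quick connectivity check from any machine that can reach the database. The following Python sketch uses the pyodbc driver as one possible option; the server name, database, and credentials are placeholders, not values taken from this article.

    import pyodbc

    # Placeholder Azure SQL Database connection details -- replace with your own.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=your-server.database.windows.net,1433;"
        "DATABASE=your_database;"
        "UID=your_user;PWD=your_password"
    )
    cur = conn.cursor()
    cur.execute("SELECT 1")  # trivial query; success only proves the connection works
    print(cur.fetchone())
    conn.close()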

Azure Synapse Analytics

Log in to your Azure Synapse Analytics with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your Azure Synapse Analytics database can be accessed by GoodData. For more information about the required privileges, see GoodData-Azure Synapse Analytics Integration Details.

Ensure that you have the following information ready:

  • Azure Synapse Analytics username and password
  • Azure Synapse Analytics database and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
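
As with Azure SQL Database, you can verify the connection details up front. The sketch below is illustrative only and also relies on pyodbc; the Synapse SQL endpoint, database, and credentials are placeholders.

    import pyodbc

    # Placeholder Azure Synapse SQL endpoint and credentials -- replace with your own.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=your-workspace.sql.azuresynapse.net,1433;"
        "DATABASE=your_database;"
        "UID=your_user;PWD=your_password"
    )
    cur = conn.cursor()
    cur.execute("SELECT 1")  # confirms that the endpoint accepts the credentials
    print(cur.fetchone())
    conn.close()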

BigQuery

The easiest way to integrate your BigQuery data warehouse with your GoodData workspace is to upload the Google service account key file while establishing the connection. You can also fill in all the details manually.

Google’s documentation about creating a service account key file will guide you through the process and provide essential background on service account key files.

In short, when you create a service account key file, you will have to:

  • Define the account name
  • Add user roles (bigquery.dataViewer and bigquery.jobUser)
  • Select JSON as the key type

Ensure that you log in to your BigQuery project with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your BigQuery project can be accessed by GoodData. For more information about the required privileges, see GoodData-BigQuery Integration Details.

Before you proceed with establishing the connection between BigQuery and GoodData, ensure that you have the following ready:

  • Google service account key file in the JSON format. GoodData will extract the following information:
    • Client email
    • Private key
    • Google project ID
  • BigQuery dataset name
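
Before you upload the key file, you may want to confirm that the service account can actually read the dataset. The following sketch uses the google-cloud-bigquery Python client as one way to do that; the key file path and dataset name are placeholders.

    from google.cloud import bigquery

    # Placeholder key file path and dataset name -- replace with your own.
    client = bigquery.Client.from_service_account_json("service-account-key.json")
    dataset = client.get_dataset("your_dataset")  # raises an error if the account cannot access the dataset
    print([table.table_id for table in client.list_tables(dataset)])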

Microsoft SQL Server

Log in to your Microsoft SQL Server with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your Microsoft SQL Server can be accessed by GoodData. For more information about the required privileges, see GoodData-Microsoft SQL Server Integration Details.

Ensure that you have the following information ready:

  • Microsoft SQL Server username and password
  • Microsoft SQL Server database and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
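
To confirm that the server is reachable and the credentials work before you create the Data Source, you can run a short check such as the following. It is only a sketch using the pyodbc driver; the host, database, and credentials are placeholders.

    import pyodbc

    # Placeholder Microsoft SQL Server connection details -- replace with your own.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=your-sql-server-host,1433;"
        "DATABASE=your_database;"
        "UID=your_user;PWD=your_password"
    )
    cur = conn.cursor()
    cur.execute("SELECT 1")  # trivial query; success only proves the connection works
    print(cur.fetchone())
    conn.close()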

MongoDB Connector for BI

Log in to your MongoDB Connector for BI with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your MongoDB can be accessed by GoodData.

For more information about the required privileges and supported data types, see GoodData-MongoDB BI Connector Integration Details.

Ensure that you have the following information ready:

  • MySQL username and password for the MongoDB Connector for BI (the connector exposes a MySQL-compatible interface)
  • MySQL database and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
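
Because the MongoDB Connector for BI (mongosqld) exposes a MySQL-compatible endpoint, you can verify the connection with any MySQL client. The sketch below uses the PyMySQL package; the host and credentials are placeholders, and 3307 is the connector's default port.

    import pymysql

    # Placeholder host and credentials for the MySQL-compatible endpoint of mongosqld.
    conn = pymysql.connect(host="your-bi-connector-host", port=3307,
                           user="your_user", password="your_password",
                           database="your_database")
    with conn.cursor() as cur:
        cur.execute("SHOW TABLES")  # lists the tables mapped from your MongoDB collections
        print(cur.fetchall())
    conn.close()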

MySQL

Log in to your MySQL database with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your MySQL database can be accessed by GoodData. For more information about the required privileges, see GoodData-MySQL Integration Details.

Ensure that you have the following information ready:

  • MySQL username and password
  • MySQL database and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
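
A quick way to confirm the credentials and network access is to connect with a MySQL client from any machine that can reach the database. The following sketch uses PyMySQL; all connection values are placeholders.

    import pymysql

    # Placeholder MySQL connection details -- replace with your own.
    conn = pymysql.connect(host="your-mysql-host", user="your_user",
                           password="your_password", database="your_database")
    with conn.cursor() as cur:
        cur.execute("SELECT 1")  # trivial query; success only proves the connection works
        print(cur.fetchone())
    conn.close()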

PostgreSQL

Log in to your PostgreSQL database with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your PostgreSQL database can be accessed by GoodData. For more information about the required privileges, see GoodData-PostgreSQL Integration Details.

Ensure that you have the following information ready:

  • PostgreSQL username and password
  • PostgreSQL database and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
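
If you want to verify the connection details before creating the Data Source, a minimal check with the psycopg2 driver can help. The host, database, and credentials below are placeholders.

    import psycopg2

    # Placeholder PostgreSQL connection details -- replace with your own.
    conn = psycopg2.connect(host="your-postgres-host", dbname="your_database",
                            user="your_user", password="your_password")
    with conn.cursor() as cur:
        cur.execute("SELECT 1")  # trivial query; success only proves the connection works
        print(cur.fetchone())
    conn.close()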

Redshift

Log in to your Redshift cluster with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your Redshift cluster can be accessed by GoodData. For more information about the required privileges, see GoodData-Redshift Integration Details.

Ensure that you have the following information ready:

  • Redshift username and password
  • Redshift database and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
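
Redshift speaks the PostgreSQL wire protocol, so you can verify the cluster endpoint and credentials with a PostgreSQL client. The sketch below uses psycopg2; the endpoint and credentials are placeholders, and 5439 is Redshift's default port.

    import psycopg2

    # Placeholder Redshift cluster endpoint and credentials -- replace with your own.
    conn = psycopg2.connect(host="your-cluster.example.us-east-1.redshift.amazonaws.com",
                            port=5439, dbname="your_database",
                            user="your_user", password="your_password")
    with conn.cursor() as cur:
        cur.execute("SELECT 1")  # trivial query; success only proves the connection works
        print(cur.fetchone())
    conn.close()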

Snowflake

Log in to your Snowflake instance with the account that you plan to use with GoodData. Ensure the user you want to use in the Data Source has all necessary privileges and that your Snowflake instance can be accessed by GoodData. For more information about the required privileges, see GoodData-Snowflake Integration Details.

Ensure that you have the following information ready:

  • Snowflake username and password
  • Snowflake database, warehouse, and schema
  • GoodData workspace’s ID (see Find the Workspace ID)
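
To confirm that the account, warehouse, database, and schema names are correct before creating the Data Source, you can run a short check with the Snowflake Connector for Python. All values below are placeholders.

    import snowflake.connector

    # Placeholder Snowflake connection details -- replace with your own.
    conn = snowflake.connector.connect(
        account="your_account_identifier",
        user="your_user",
        password="your_password",
        warehouse="your_warehouse",
        database="your_database",
        schema="your_schema",
    )
    cur = conn.cursor()
    cur.execute("SELECT CURRENT_VERSION()")  # proves the session was established
    print(cur.fetchone())
    conn.close()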

Steps:

  1. On the top navigation bar, select Data.

     The LDM Modeler opens.

  2. Click Sources.

     The Data Source page opens.

  3. Select the Data Source to connect to your workspace.

  4. Provide the required information.

     For BigQuery, you can upload the Google service account key file to fill in the connection details automatically.

  5. Click Test connection.

     The GoodData platform verifies whether it can connect to the data warehouse using the provided information. If the connection succeeds, a confirmation message appears. No data is loaded from the data warehouse at this point.

  6. (Optional) Provide the prefix for the tables in the Output Stage.

  7. Click Save.

     The Data Source is created. The screen with the connection details opens.

Connect the Data Source with Your GoodData Workspace

Once you have verified that the Data Source can connect to your data warehouse, connect it to the LDM Modeler in your workspace.

Steps:

  1. On the Data Source details page, click Connect.

     The LDM Modeler opens. The Data Source is connected to the LDM Modeler, and the tables and views from the data warehouse are listed in the left panel.

  2. To add a dataset, drag a table/view from the left panel and drop it in the blank canvas area. 

     A preview of the data opens, showing how each column was detected.

  3. Notice what the columns are set to: 

    • The order_line_id, order_id, and order_status columns are correctly detected as attributes.
    • The date column is correctly detected as a date in the yyyy-MM-dd format and will be converted to a separate Date dataset.
    • The price and quantity columns are correctly detected as facts.
  4. If the preview shows a column with numerical values that cannot be used as numerical constants in an equation (for example, product identifiers) but is detected as a fact, set this column to be an attribute. In the example above, Product ID is autodetected as a fact but should be changed to an attribute.

  5. Verify that the information in the preview is correct, and click Import. A dataset is added. The LDM Modeler displays the structure of the dataset.

  6. To create each additional dataset, repeat Steps 2 through 5.

  7. (Optional) Create a relationship between the datasets (see Create a Relationship between Datasets).

  8. Publish the LDM (see Publish your Logical Data Model).

When the publishing is done, you can continue by scheduling a task to load the data into your workspace (see Load Data from a Data Source to a GoodData Workspace).