Integrate a Data Source for an Object File Storage

This article explains how to connect a Data Source to your GoodData workspace when your source data is stored in an object storage service.

You can directly connect Data Sources for the following object storage services:

  • Amazon S3
  • Microsoft Azure Blob Storage

Each object storage service has different integration requirements. Before you connect an object storage service to your workspace, ensure that GoodData can communicate with the Data Source created for your object storage service.

Prerequisites

Before you start, make sure that you have the following in place:

  • An active GoodData account that you are logged in to with at least one active workspace

  • A GoodData workspace. If you do not know how to create one, see Create Workspaces.

  • Access to a supported object storage service with source CSV files

Tasks You Will Perform

In this article, you will complete the following tasks:

  • Create a Data Source.
  • Download the files from the object storage service.
  • Create a logical data model (LDM).

Create a Data Source

A Data Source is a place in your GoodData workspace that stores the information about the connection with your object storage service.

Select your source object storage service and learn what details you need to establish a connection between your GoodData workspace and your object storage service.

Azure Blob Storage

Log in to the Azure Blob Storage account that you plan to use with GoodData. Ensure that this account has all necessary privileges and that your Azure Blob Storage can be accessed by GoodData. For more information about the required privileges, see GoodData-Azure Blob Storage Integration Details.

Ensure that you have the following information ready:

  • Azure Blob Storage connection string
  • Path to the source data
  • GoodData workspace’s ID (see Find the Workspace ID)
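
Before pasting the connection string into the Data Source form, it can help to check that it is well formed. The following is a minimal sketch, assuming an account-key style connection string made of semicolon-separated Key=Value pairs (for example, AccountName and AccountKey). The function names and the REQUIRED_KEYS set are hypothetical and are not part of GoodData or the Azure SDK. Connection strings based on a shared access signature use different keys, so adjust the set to the type you use.

```python
# Keys expected in an account-key style Azure Storage connection string.
# Assumption: adjust this set if you use a shared access signature instead.
REQUIRED_KEYS = {"AccountName", "AccountKey"}


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs into a dictionary."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: values such as base64 keys may end in '='.
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"Segment without '=': {segment!r}")
        parts[key.strip()] = value
    return parts


def missing_keys(conn_str: str) -> set:
    """Return the required keys that the connection string does not contain."""
    return REQUIRED_KEYS - set(parse_connection_string(conn_str))
```

An empty set returned by missing_keys means that the string contains every key listed in REQUIRED_KEYS. It does not prove that the credentials are valid; the Test connection button covers that.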

S3

Log in to your S3 bucket with the credentials that you plan to use with GoodData. Ensure that these credentials are sufficient and that your S3 bucket can be accessed by GoodData. For more information about the required privileges, see GoodData-S3 Integration Details.

Ensure that you have the following information ready:

Steps:

  1. On the top navigation bar, select Data.

     The LDM Modeler opens.

  2. Select Sources.

     The Data Source page opens.

  3. Select the Data Source to connect to your workspace.

  4. Provide the required information. 

  5. Click Test connection. The GoodData platform verifies whether it can connect to the object storage service using the provided information. If the connection succeeds, a confirmation message appears. No data is loaded from the object storage service at this point.

  6. Click Save. The Data Source is created. The screen with the connection details opens.

Download the Files from the Object Storage Service

Download the source CSV files from the object storage service to your computer. You will be using these files to create the LDM.
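
If your files are in Amazon S3, the download can be scripted instead of done by hand. The following is a minimal sketch, assuming the third-party boto3 library is installed and AWS credentials are already configured on your computer. The bucket name, prefix, target folder, and helper names are hypothetical placeholders; this is one possible approach, not a procedure prescribed by GoodData.

```python
import os


def is_csv_key(key: str) -> bool:
    """Return True for object keys that look like CSV files."""
    return key.lower().endswith(".csv")


def local_path_for(key: str, target_dir: str) -> str:
    """Map an object key to a flat file name inside target_dir."""
    return os.path.join(target_dir, os.path.basename(key))


def download_csv_files(bucket: str, prefix: str, target_dir: str) -> list:
    """Download every CSV object under prefix and return the local paths."""
    import boto3  # third-party; imported here so the helpers above work without it

    os.makedirs(target_dir, exist_ok=True)
    s3 = boto3.client("s3")  # credentials come from the standard AWS configuration
    downloaded = []
    # The paginator handles listings that span more than one response page.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if is_csv_key(obj["Key"]):
                path = local_path_for(obj["Key"], target_dir)
                s3.download_file(bucket, obj["Key"], path)
                downloaded.append(path)
    return downloaded
```

A call such as download_csv_files("my-bucket", "source-data/", "csv_files") would place the files in a local csv_files folder. Because the local names are flattened, objects with the same file name under different prefixes would overwrite each other.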

Create the Logical Data Model

Once you have verified that the connection between the Data Source and your workspace works and the source CSV files are downloaded to your computer, create the LDM in your workspace.

Steps:

  1. On the top navigation bar, select Data.

     The LDM Modeler opens. You see a blank canvas area in view mode.

  2. Click Edit. The LDM Modeler is switched to edit mode.

  3. To add a dataset, drag a CSV file and drop it in the blank canvas area. A preview of the data opens.

  4. Notice what the columns are set to:

    • The order_line_id, order_id, and order_status columns are correctly detected as attributes.
    • The date column is correctly detected as a date in the yyyy-MM-dd format and will be converted to a separate Date dataset.
    • The price and quantity columns are correctly detected as facts.

  5. If the preview shows a column with numerical values that cannot be used as numerical constants in an equation (for example, product identifiers) but that is detected as a fact, set this column to be an attribute. In this example, Product ID is autodetected as a fact and should be changed to an attribute.

  6. Verify that the information in the preview is correct, and click Import. A dataset is added. The LDM Modeler displays the structure of the dataset.

  7. To create each additional dataset, repeat Steps 3 through 6.

  8. (Optional) Create a relationship between the datasets (see Create a Relationship between Datasets).

  9. Publish the LDM (see Publish your Logical Data Model).
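
The column checks in Steps 4 and 5 can be approximated locally before you import a file. The following is a rough sketch of a simple heuristic: values that all parse as yyyy-MM-dd count as a date, values that all parse as numbers count as a fact, and everything else counts as an attribute. This is an assumption made for illustration and is not the detection logic that the LDM Modeler uses. The function names are hypothetical.

```python
import csv
import io
from datetime import datetime


def _is_date(value):
    """True if the value parses as a yyyy-MM-dd date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def _is_number(value):
    """True if the value parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def guess_column_kind(values):
    """Guess 'date', 'fact', or 'attribute' from a column's string values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "attribute"
    if all(_is_date(v) for v in non_empty):
        return "date"
    if all(_is_number(v) for v in non_empty):
        return "fact"
    return "attribute"


def guess_kinds(csv_text):
    """Return {column name: guessed kind} for CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    # Missing trailing fields come back as None, so treat them as empty.
    return {
        name: guess_column_kind([(row[name] or "") for row in rows])
        for name in rows[0]
    }
```

With this heuristic, a purely numeric identifier column such as a product ID is reported as a fact. That is the case Step 5 asks you to correct manually by setting the column to an attribute.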

When publishing is done, you can schedule a task to load the data into your workspace (see Load Data from a Data Source to a GoodData Workspace).