Integrate a Data Source for an Object File Storage
This article explains how to connect a Data Source to your GoodData workspace when your source data is stored in an object storage service.
You can directly connect Data Sources for the following object storage services:
- Amazon S3
- Microsoft Azure Blob Storage
Each object storage service has different integration requirements. Before you connect an object storage service to your workspace, ensure that GoodData can communicate with the Data Source created for your object storage service.
Prerequisites
Before you start, make sure that you have the following in place:
- An active GoodData account that you are logged in to with at least one active workspace
- A GoodData workspace. If you do not know how to create one, see Create Workspaces.
- Access to a supported object storage service with source CSV files
You can upload data from files in formats other than CSV, but we recommend that you use CSV files to get familiar with the process. Specifically, CSV files are required to create a logical data model (LDM), so using CSV files helps you avoid the extra step of converting source files from other formats to CSV.
For detailed information about creating an LDM based on an object storage service and the supported formats of the source files, see Create a Logical Data Model from Your Cloud Object Storage Service.
Tasks You Will Perform
In this article, you will complete the following tasks:
- Create a Data Source.
- Download the files from the object storage service.
- Create an LDM.
Create a Data Source
A Data Source is a place in your GoodData workspace that stores the information about the connection with your object storage service.
Select your source object storage service and learn what details you need to establish a connection between your GoodData workspace and your object storage service.
- Azure Blob Storage
- S3
Steps:
The screenshots in the following steps use the S3 Data Source, but the steps are the same for each Data Source.
On the top navigation bar, select Data. The LDM Modeler opens.
Select Sources. The Data Source page opens.
Select the Data Source to connect to your workspace.
Provide the required information.
Click Test connection. The GoodData platform verifies whether it can connect to the object storage service using the provided information. If the connection succeeds, a confirmation message appears. No data is loaded from the object storage service at this point.
Click Save. The Data Source is created. The screen with the connection details opens.
For detailed information about creating Data Sources, see Create a Data Source.
Download the Files from the Object Storage Service
Download the source CSV files from the object storage service to your computer. You will be using these files to create the LDM.
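If you prefer to script the download instead of using the storage service's console, the following is a minimal sketch for Amazon S3 only. It assumes that you pass in an S3 client created with the third-party boto3 package (`boto3.client("s3")`). The function name `download_csv_files` and the bucket, prefix, and folder names in the example call are hypothetical and are not part of GoodData.

```python
import os


def download_csv_files(s3_client, bucket, prefix, dest_dir):
    """Download every .csv object under `prefix` into `dest_dir`.

    `s3_client` is assumed to be a boto3 S3 client. Returns the list of
    local file paths that were written.
    """
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []

    # list_objects_v2 returns results in pages, so use a paginator.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            # Note: files with the same name in different folders
            # would overwrite each other in dest_dir.
            local_path = os.path.join(dest_dir, os.path.basename(key))
            s3_client.download_file(bucket, key, local_path)
            downloaded.append(local_path)
    return downloaded


# Example call (hypothetical names; requires boto3 and valid AWS credentials):
#   import boto3
#   download_csv_files(boto3.client("s3"), "my-example-bucket", "exports/", "./source_csv")
```

The client is passed in as a parameter so that the credentials stay outside the function. For Azure Blob Storage you would need the corresponding Azure SDK calls instead.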
Create the Logical Data Model
Once you have verified that the connection between the Data Source and your workspace works and have downloaded the source CSV files to your computer, create the LDM in your workspace.
Steps:
On the top navigation bar, select Data. The LDM Modeler opens. You see a blank canvas area in view mode.
Click Edit. The LDM Modeler switches to edit mode.
To add a dataset, drag a CSV file and drop it in the blank canvas area. A preview of the data similar to the following opens:
Notice what the columns are set to:
- The order_line_id, order_id, and order_status columns are correctly detected as attributes.
- The date column is correctly detected as dates in the yyyy-MM-dd format and will be converted to a separate Date dataset.
- The price and quantity columns are correctly detected as facts.
If the preview shows a column with numerical values that cannot be used as numerical constants in an equation (for example, product identifiers) but the column is detected as a fact, set this column to be an attribute. In this example, Product ID is autodetected as a fact, but it should be changed to an attribute.
For more information about the data types and how to set them correctly, see Create a Logical Data Model from CSV Files.
Verify that the information in the preview is correct, and click Import. A dataset is added. The LDM Modeler displays the structure of the dataset.
If the CSV file that you have imported contains dates, those dates will be imported as a separate Date dataset. These two datasets will be automatically connected with a relationship.
To create each additional dataset, repeat the steps from dragging a CSV file onto the canvas through clicking Import.
(Optional) Create a relationship between the datasets (see Create a Relationship between Datasets).
Publish the LDM (see Publish your Logical Data Model).
When the publishing is done, you can continue by scheduling a task for loading the data to your workspace (see Load Data from a Data Source to a GoodData Workspace).
For detailed information about creating an LDM based on an object storage service, see Create a Logical Data Model from Your Cloud Object Storage Service.
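Before you drag a CSV file into the LDM Modeler, it can help to check locally which columns you expect to see as attributes, facts, or dates in the preview. The following is a rough sketch that uses only the Python standard library. It is not GoodData's detection logic. The function name `guess_column_types`, the rule that column names ending in `_id` or ` id` are attributes, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime


def _is_date(value):
    """Return True if value parses as a yyyy-MM-dd date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def _is_number(value):
    """Return True if value parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def guess_column_types(csv_text, id_suffixes=("_id", " id")):
    """Guess 'attribute', 'fact', or 'date' for each column of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    guesses = {}
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows if row[name] not in (None, "")]
        if not values or name.lower().endswith(id_suffixes):
            # Identifiers are numeric-looking but are not meant for math.
            guesses[name] = "attribute"
        elif all(_is_date(v) for v in values):
            guesses[name] = "date"
        elif all(_is_number(v) for v in values):
            guesses[name] = "fact"
        else:
            guesses[name] = "attribute"
    return guesses


# Hypothetical sample rows that reuse the column names from this article.
sample = (
    "order_line_id,order_id,order_status,date,price,quantity,Product ID\n"
    "L1,O1,Delivered,2021-01-05,10.50,2,1001\n"
    "L2,O1,Canceled,2021-01-06,3.00,1,1002\n"
)
print(guess_column_types(sample))
```

In this sketch, Product ID is treated as an attribute because of its name, even though its values are numeric. This mirrors the manual correction described in the steps above, where an identifier column that is autodetected as a fact is changed to an attribute.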