In this tutorial, you will learn how to load your own data from a BigQuery workspace into your GoodData project. This tutorial expands on the Load Sample Data from BigQuery into GoodData guide.
To load data into your workspace, you will perform the following tasks:
- Open a new GoodData workspace
- Set up your data source
- Create an output stage
- Create a logical data model (LDM)
- Start the load
Log in to your BigQuery workspace with the account that you plan to use with GoodData. Ensure that the user you configured in the data source has all necessary privileges and that your BigQuery workspace can be accessed by GoodData. For more information about the required privileges, see GoodData-BigQuery Integration Details.
The BigQuery and GoodData integration requires the following roles: bigquery.dataViewer and bigquery.jobUser.
Ensure that you have the following information ready:
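As a quick sanity check before you continue, you can compare the roles granted to your service account against the two required ones. The following is a minimal sketch, not part of GoodData or BigQuery tooling: the helper name `missing_roles` and the sample `granted` list are hypothetical, and the `roles/` prefix is how Google Cloud IAM displays these role IDs.

```python
# The two roles named in this tutorial, written as full IAM role IDs.
REQUIRED_ROLES = {"roles/bigquery.dataViewer", "roles/bigquery.jobUser"}


def missing_roles(granted_roles):
    """Return the required roles that are absent from granted_roles, sorted."""
    return sorted(REQUIRED_ROLES - set(granted_roles))


# Hypothetical list of roles copied from the IAM page for the service account.
granted = ["roles/bigquery.dataViewer"]
print(missing_roles(granted))  # ['roles/bigquery.jobUser']
```

An empty list means that both required roles are present.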
- BigQuery project and dataset
- BigQuery Service Account key in JSON format, so that you can extract the private key
- GoodData workspace's Project ID (Find the Project ID)
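The Service Account key that Google Cloud issues is a JSON document, so the values you need can be read with a few lines of Python. The sketch below assumes the standard service account key layout (fields `client_email`, `private_key`, and `project_id`); the helper name `extract_connection_fields` and the sample key are illustrative placeholders, not real credentials.

```python
import json


def extract_connection_fields(key_json: str) -> dict:
    """Return the service account values typically needed for a connection form."""
    key = json.loads(key_json)
    required = ("client_email", "private_key", "project_id")
    missing = [name for name in required if name not in key]
    if missing:
        raise ValueError(f"Service account key is missing: {', '.join(missing)}")
    return {name: key[name] for name in required}


# Illustrative placeholder key; in practice, read the downloaded JSON key file.
sample_key = json.dumps({
    "type": "service_account",
    "project_id": "my-bigquery-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "gooddata-loader@my-bigquery-project.iam.gserviceaccount.com",
})

fields = extract_connection_fields(sample_key)
print(fields["client_email"])
```

Treat the private key as a secret: avoid printing it or committing the key file to version control.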
Open a New GoodData Workspace
Ensure that you are logged into your GoodData account. This tutorial presumes that your GoodData domain is
- GoodData Free users
use the link you received in your GoodData Free confirmation email, for example:
- GoodData Growth users
use the link you received in your GoodData confirmation email, for example:
- White-label customers
use your own white-label domain
For the purpose of this tutorial, you will work with a new workspace (also known as a project).
To create a workspace:
Your GoodData account comes with five data-ready workspaces. For this tutorial, select any empty workspace.
Once you select a workspace to work with, you can easily rename it in the Manage section:
Your GoodData Growth account allows you to create ten or more workspaces (projects). To create a workspace, you must have the authorization token. For more details, see Find the Project Authorization Token.
- Click the Add workspace button on your welcome screen.
Enter the name of your workspace and your authorization token.
Click Create. Your workspace opens.
To return to your welcome screen and create another workspace, click your name in the upper right corner, click Account, then click Active Project.
You can easily rename the workspace (project) in the Manage section:
If you do not have an authorization token, contact GoodData Support.
This procedure assumes that your domain is
If you are a white-labeled customer, replace secure.gooddata.com with your white-labeled domain in the procedure steps when needed. GoodData Free and Growth users, use the domain that you received in your introduction email, such as
- Go to https://secure.gooddata.com/gdc/projects.
The page for creating a project opens.
In the Title field, enter the name for the new project.
In the Authorization Token field, enter your authorization token.
Do not enter any information for the summary.
Leave the other project settings at their defaults.
- Click Submit.
The project/workspace is created and the page with the project's URL opens.
The project/workspace is immediately available on the GoodData Portal.
Create a Data Source
To connect your BigQuery workspace and your GoodData workspace, follow these steps:
- Click your name in the top right corner, select Data Integration Console, then click the Data sources tab.
- Click BigQuery as your data warehouse. Alternatively, click Create data source in the bottom left corner.
The connection parameter screen appears.
- Fill in the required fields.
- Click Test connection. If the connection succeeds, the green confirmation message appears.
The screen with your connection details appears.
Create an Output Stage
The Output Stage is a set of views created in your BigQuery dataset that serve as a source for loading data into your GoodData workspace.
To create the output stage:
- On your Connection screen, click Create in the Create output stage gray tab.
The Create output stage window appears.
- The GoodData engine analyzes your data structure and suggests queries for you to execute in your data warehouse.
Note the Output Stage naming conventions option.
- Click Copy to clipboard and paste the queries into your BigQuery SQL client.
- In your BigQuery client, review the suggested SQL DDLs and modify them, if needed, to match your needs and comply with GoodData naming conventions.
- Execute the SQL DDLs.
- Close the Create output stage window to return to your Connection screen.
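To give a feel for what the suggested DDLs do, the sketch below generates one BigQuery view that aliases source columns with type prefixes. It is only an illustration under stated assumptions: the prefixes in `PREFIXES`, the `out_` view prefix, the helper name `build_view_ddl`, and the `sales_dataset.orders` table are all hypothetical. The queries that GoodData suggests and the naming conventions shown in the Create output stage window are the authority.

```python
# Column-name prefixes assumed for illustration; confirm them against the
# naming conventions shown in the Create output stage window.
PREFIXES = {
    "attribute": "a__",
    "fact": "f__",
    "date": "d__",
    "connection_point": "cp__",
}


def build_view_ddl(dataset, source_table, columns, view_prefix="out_"):
    """Build a CREATE OR REPLACE VIEW statement that aliases each source column."""
    select_list = ",\n".join(
        f"  {name} AS {PREFIXES[kind]}{name}" for name, kind in columns
    )
    return (
        f"CREATE OR REPLACE VIEW `{dataset}.{view_prefix}{source_table}` AS\n"
        f"SELECT\n{select_list}\n"
        f"FROM `{dataset}.{source_table}`;"
    )


# Hypothetical source table and column roles.
ddl = build_view_ddl(
    "sales_dataset",
    "orders",
    [
        ("order_id", "connection_point"),
        ("status", "attribute"),
        ("amount", "fact"),
        ("order_date", "date"),
    ],
)
print(ddl)
```

Keeping the aliasing in a view leaves the source table untouched, so you can adjust names later without changing the underlying data.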
Create a Logical Data Model from the Output Stage
Before you load data into your workspace, you need a logical data model (LDM) to determine how the data are handled and displayed.
The LDM provides a layer of abstraction between the information that a GoodData user accesses and the method that is used to store data. In this step, you use the Output Stage (the views and their column names) to create a logical data model.
Then, you load the model and apply it to your workspace(s).
- On your Connection screen, click Publish into workspace.
Enter or select the workspace into which you want to publish your logical data model.
- Click Select.
- On the screen that appears, select the Preserve data option.
- Click Publish.
If your logical data model is published successfully, the following message appears:
Note: If publishing the LDM fails, an error message prompts you to make the necessary corrections.
- Click the Visit data load page link.
The Data Load Process screen opens within the Data Integration Console page. Proceed to the next section to load data from the warehouse into your GoodData workspace.
Review and update your LDM
While in the Data Integration Console, click Model data in the top navigation console to open the LDM Modeler interface where you can review and Update a Logical Data Model.
Note: Depending on the complexity and makeup of your data, your actual LDM diagram will differ.
Create a Data Load Process
In this step, you will create a data load that takes care of moving data from your BigQuery workspace into your GoodData workspace. This process is called Automated Data Distribution (ADD) and it can be deployed to multiple GoodData workspaces.
Note: The following steps presume that you have successfully published your logical data model and are continuing on to create a data load process. You can start creating a data load process at any time by clicking Create data load process in the Data process tab.
To continue after creating the logical data model, follow these steps:
- Click Deploy Process.
The Deploy process to a project screen appears. ADD and the data source you created are preselected.
- On the next screen, enter your Process Name of choice.
- Click Deploy.
When the process ends, the following screen appears:
Create and Run a New Schedule
To ensure your GoodData analytics always uses the most up-to-date data, you can create a schedule to automate data loads between your BigQuery workspace and your GoodData workspace. For the purpose of this Getting Started tutorial, you create a manual schedule.
- Go to the Data Integration Console and click the Projects tab.
- Select the project that you used in the previous step.
- Click Create new schedule.
The new schedule screen appears.
- Select the process name.
- In the Runs dropdown, set the frequency of execution to manually.
- Leave everything else intact.
- Click Schedule.
The schedule is saved and opens for your preview.
You are now going to manually run the scheduled process.
- Click Run.
- Confirm Run.
The schedule is queued for execution and is run as platform resources are available.
The process may take some time to complete.
- If the schedule fails with errors, fix the errors, and run the schedule again. Repeat until the process finishes with a status of OK, which means that the ADD process has loaded the data to your workspace.
- (Optional) In the Runs dropdown, set the frequency of execution to whatever schedule fits your business needs. Click Save.
The schedule is saved.
Summary and Next Steps
In this tutorial, you successfully:
- Set up the connection between your BigQuery workspace and your GoodData workspace
- Created and scheduled a data load (albeit in manual mode)
- Created a logical data model
Now that your data is in your GoodData workspace, you can either:
- open Analytical Designer to start analyzing your data
- or you can review and Update a Logical Data Model