
Chapter 12. HR Example: Connecting multiple datasets together

This example shows how to load three hierarchically connected datasets: Department -> Employee -> Salary.

Figure 12.1. HR example LDM

The CloudConnect graph loads all three datasets.

Figure 12.2. HR CloudConnect Graph

Because the datasets are connected, they must be loaded in sequence rather than in parallel: first Department, then Employee, and finally Salary. CloudConnect uses so-called phases to execute different parts of the graph sequentially. To assign a phase to a connected branch of the graph, right-click a component in that branch and select the Set phase popup menu item.

Figure 12.3. Setting phase

The phase is an integer. Components with a lower phase execute before components with a higher phase. Note that a component's phase is shown as a small number in the top left corner of the component's rectangle.
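The ordering rule can be pictured with a short Python sketch. It is only a model of the rule described above, not CloudConnect's implementation; the component names and phase numbers are hypothetical.

```python
from itertools import groupby

# Hypothetical component names and phase numbers, for illustration only.
components = [
    ("Salary writer", 2),
    ("Department writer", 0),
    ("Employee reader", 1),
    ("Department reader", 0),
    ("Employee writer", 1),
    ("Salary reader", 2),
]


def execution_order(components):
    """Group components by phase, lowest phase first."""
    ordered = sorted(components, key=lambda c: c[1])
    return [
        (phase, sorted(name for name, _ in group))
        for phase, group in groupby(ordered, key=lambda c: c[1])
    ]


for phase, names in execution_order(components):
    print(f"phase {phase}: {', '.join(names)}")
```

With these assumed phases, the Department components come first, then Employee, then Salary, which matches the load order the HR example requires.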

The Department dataset contains only one attribute, called Department ID, and one label, called Name. In effect, this attribute has two textual labels: Department ID and Name. To load the data correctly, the GoodData platform needs to know which of these two labels uniquely identifies a Department record. Let's explain this with a simple example. Assume that we want to load the following employee records to the GoodData platform:

Table 12.1. Employee records

Employee ID	Employee Name	Department ID	Department Name	Salary
1	John Simons	SW	Sales	$170k
2	Jeff Nicholson	SE	Sales	$180k
3	Sarah Robinson	MKTG	Marketing	$220k

The platform needs to break these records down into two attributes and one fact:

Department attribute is created from the Department ID and Department Name columns.
Employee attribute is created from the Employee ID and Employee Name columns.
Salary fact is created from the Salary column.
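The breakdown listed above can be sketched in Python using the rows of Table 12.1. This is an illustration only: the `parse_salary` helper and its handling of the `$170k` notation are assumptions made for this sketch, not something the platform is stated to do.

```python
# Rows from Table 12.1:
# (Employee ID, Employee Name, Department ID, Department Name, Salary)
rows = [
    ("1", "John Simons", "SW", "Sales", "$170k"),
    ("2", "Jeff Nicholson", "SE", "Sales", "$180k"),
    ("3", "Sarah Robinson", "MKTG", "Marketing", "$220k"),
]


def parse_salary(text):
    """Convert a string such as '$170k' to a number (hypothetical format rule)."""
    return int(text.strip("$k")) * 1000


# Department attribute: built from the Department ID and Department Name columns.
department = [(r[2], r[3]) for r in rows]
# Employee attribute: built from the Employee ID and Employee Name columns.
employee = [(r[0], r[1]) for r in rows]
# Salary fact: built from the Salary column.
salary = [parse_salary(r[4]) for r in rows]

print(department)
print(employee)
print(salary)
```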

Now let's look more closely at the Department attribute. The GoodData platform needs to designate one of the Department's columns as primary. Each distinct value of the primary column identifies one record of the attribute. If we choose Department ID as the primary column, we end up with three Department records, identified by the values SW, SE, and MKTG. If we choose Department Name instead, we end up with only two Department records, identified by the values Sales and Marketing.
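The effect of the primary column choice can be checked with a minimal Python sketch over the Department columns of Table 12.1. The function name `attribute_records` is hypothetical; the sketch only counts distinct values and does not model how the platform stores attributes.

```python
# Department columns of the rows in Table 12.1.
records = [
    {"Department ID": "SW", "Department Name": "Sales"},
    {"Department ID": "SE", "Department Name": "Sales"},
    {"Department ID": "MKTG", "Department Name": "Marketing"},
]


def attribute_records(rows, primary):
    """Return the distinct values of the primary column, in first-seen order."""
    return list(dict.fromkeys(row[primary] for row in rows))


print(attribute_records(records, "Department ID"))    # ['SW', 'SE', 'MKTG']
print(attribute_records(records, "Department Name"))  # ['Sales', 'Marketing']
```

Choosing Department Name as primary merges the SW and SE departments into a single Sales record, which is why the choice of primary column matters.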

The Field mapping dialog of the GD Dataset Writer component asks you to identify the primary label for every attribute that has more than one label.

Figure 12.4. Primary label identification

The Employee dataset references the Department dataset in the project's data model. As we saw earlier, the Department dataset has one attribute with two labels, Department ID and Name, so there are two ways to reference a Department record from an Employee record. CloudConnect needs to know which label you choose. It first asks for this label during the Employee metadata creation (New Metadata → Extract from GoodData Dataset, then select the Employee dataset) so that it can give the field that references the Department the right name.

Figure 12.5. Primary label identification in the Extract Metadata from GoodData Dataset Dialog

Selecting the correct Department label that is referenced from the Employee records is very important in the Field mapping dialog of the Employee's GD Dataset Writer.

Figure 12.6. Primary label identification in the Field mapping Dialog
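One way to picture why the label choice matters is the Python sketch below. It assumes, purely for illustration, that a reference is resolved by matching the value in the Employee record against the values of the selected Department label; the function and field names are hypothetical and do not describe the platform's actual loading logic.

```python
# Department records with both labels (values from Table 12.1).
departments = [
    {"Department ID": "SW", "Name": "Sales"},
    {"Department ID": "SE", "Name": "Sales"},
    {"Department ID": "MKTG", "Name": "Marketing"},
]

# Employee rows whose reference field carries Department ID values.
employees = [
    {"Employee ID": "1", "Department": "SW"},
    {"Employee ID": "2", "Department": "SE"},
    {"Employee ID": "3", "Department": "MKTG"},
]


def unresolved(employees, departments, label):
    """Return Employee IDs whose reference matches no Department value under `label`."""
    known = {d[label] for d in departments}
    return [e["Employee ID"] for e in employees if e["Department"] not in known]


print(unresolved(employees, departments, "Department ID"))  # []
print(unresolved(employees, departments, "Name"))           # ['1', '2', '3']
```

Under this assumption, mapping the reference to the Name label while the data carries Department ID values leaves every Employee record without a matching Department.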
