Extensible Data Loading Process
CloudConnect is a legacy tool and will be discontinued. We recommend that to prepare your data you use the GoodData data pipeline as described in Data Preparation and Distribution. For data modeling, see Data Modeling in GoodData to learn how to work with Logical Data Modeler.
Creating an extensible data loading process is sometimes little bit tricky. Imagine the following use case. You are extracting data from Salesforce and loading it to the GoodData Platform. After some time, you decide you want to add one field from a Salesforce entity to your GoodData project. Following this article and learning this process will make your life easier.
See the example data loading process:
First of all, we have single metadata definition that describes all fields we are extracting from Salesforce:
Remember that your metadata must be the same. Not even different decimal number formatting.
We are going to add the new field to your Opportunity dataset. Normally, you would have to open each transformation component (in our example it is ExtHashJoin - see the transformation function below) and map each field properly to carry the new field through the process. But, this could be much more easier.
function integer transform() {
$out.0.* = $in.0.*;
$out.0.salesrep = $in.1.name;
if ($in.0.is_closed) {
if ($in.0.is_won) {
$out.0.status = "Won";
}
else {
$out.0.status = "Lost";
}
}
else {
$out.0.status = "Open";
}
return ALL;
}
Do you want to play with this with your data? Download the full example: salesforceExtensible.zip
The trick here is following row from the transformation configuration:
$0.out.* = $0.in.*;
Using this approach, each field from the metadata is propagated to the next step automatically.
Adding another custom field means to:
- Add the field to Metadata (only once!)
- Add the field to SFDC Opportunity reader SOQL and map it on metadata
- Add and map the field to the GoodData Writer
Remember that you have to setup your Salesforce Connection, create new GoodData Project, and check the GDC_PROJECT_ID Parameter in your workspace.prm file.