You are viewing our older product's guide. Click here for the documentation of GoodData Cloud, our latest and most advanced product.

Salesforce Downloader

Salesforce Downloader is a component of the data preparation and distribution pipeline (see Data Preparation and Distribution Pipeline). Salesforce Downloader downloads data from the Salesforce API.

Configuration File

Creating and setting up the brick’s configuration file is the third phase of building your data pipeline. For more information, see Phases of Building the Data Pipeline.

For more information about configuring a brick, see Configure a Brick.

Minimum Layout of the Configuration File

The following JSON sample is the minimum layout of the configuration file that Salesforce Downloader can use. If you are using more bricks, you can reuse some sections of this configuration file for other bricks. For example, the ‘downloaders’ section can list more downloaders, and the ‘integrators’ and ‘ads_storage’ sections can be reused by other downloaders and executors (see Configure a Brick).

This sample describes the use case of one instance of Salesforce Downloader that downloads two Salesforce entities.

Copy the sample and replace the placeholders with your values.

You can later enrich the sample with more parameters depending on your business and technical requirements (see Customize the Configuration File).

The salesforce Bulk API can’t access or query compound field types such as compound address or compound geolocation fields. If you specify a compound field type in your JSON, the request will fail.

{  
  "entities": {
    "table_1_name": {
      "global": {
        "custom": {
          "hub": ["column_1_name","column_2_name","column_3_name"],
          "timestamp": "field_name"
        }
      }
    },
    "table_2_name": {
      "global": {
        "custom": {
          "hub": ["column_1_name","column_2_name","column_3_name"],
          "timestamp": "field_name"
        }
      }
    }
  },
  "downloaders": {
    "sf_downloader_id": {
      "type": "sfdc",
      "entities": ["table_1_name","table_2_name"]
    }
  },
  "sfdc": {
    "username": "salesforce_user_name",
    "token": "salesforce_token",
    "client_id": "salesforce_client_id"
  },
  "integrators": {
    "ads_integrator_id": {
      "type": "ads_storage"
    }
  },
  "ads_storage": {
    "instance_id": "data_warehouse_instance_id",
    "username": "dw_email@address.com",
    "options": {}
  }
}

The placeholders to replace with your values:

Name	Type	Mandatory?	Default	Description
entities -> *table_x_name*	string	yes	n/a	The database tables into which Salesforce Downloader downloads the data
entities -> table_x_name -> global -> custom -> hub	array	yes	n/a	The table columns to generate a primary key
entities -> table_x_name -> global -> custom -> timestamp	string	yes	n/a	The name of the field that will be used as a timestamp for incremental downloads from the database NOTE: The referenced field must contain a date.
downloaders -> *sf_downloader_id*	string	yes	n/a	The ID of the Salesforce Downloader instance Example: sf_downloader_1
downloaders -> sf_downloader_id -> entities	array	yes	n/a	The database tables into which Salesforce Downloader downloads the data Must be the same as the database table names in 'entities -> table_x_name'
sfdc -> username	string	yes	n/a	The username for the Salesforce account
sfdc -> token	string	yes	n/a	The token for the Salesforce account
sfdc -> client_id	string	yes	n/a	The client ID for the Salesforce connected app. For more information about creating a connected app, see https://help.salesforce.com/articleView?id=connected_app_create.htm&type=5. NOTE: The app must have access to the client's data.
integrators -> *ads_integrator_id*	string	yes	n/a	The ID of the ADS Integrator instance Example: ads_integrator_1
ads_storage -> instance_id	string	yes	n/a	The ID of the Agile Data Warehousing Service (ADS) instance into which the data is uploaded
ads_storage -> username	string	yes	n/a	The access username to the Agile Data Warehousing Service (ADS) instance into which the data is uploaded

Customize the Configuration File

You can change default values of the parameters and set up additional parameters.

Specify additional load options for a Salesforce Downloader instance

You can set up additional load options for a Salesforce Downloader instance that would apply to all the entities that this instance processes.

"sfdc": {
  "username": "salesforce_user_name",
  "token": "salesforce_token",
  "client_id": "salesforce_client_id",
  "client_logger": true|false, // optional
  "downcase_fields": true|false, // optional
  "boolean_as_string": true|false, // optional
  "default_start_date": "YYYY-MM-DD'T'HH:MM:SSS.SSS'Z", // optional
  "full": true|false, // optional
  "step_size": number_of_days // optional
}

Name	Type	Mandatory?	Default	Description
client_logger	Boolean	no	false	Specifies whether additional logging is enabled If not set or set to 'false', additional logging is not enabled. If set to 'true', additional logging is enabled.
downcase_fields	Boolean	no	false	Specifies whether to change the case of the field names If not set or set to 'false', the case of the field names remains intact. If set to 'true', the field names in the database are switched to lowercase. NOTE: In this case, make sure that the field names listed in the 'hub' parameter are all lowercase as well.
boolean_as_string	Boolean	no	false	Specifies whether to convert Boolean values into strings If not set or set to 'false', Boolean values are not converted to strings. If set to 'true', Boolean values are converted to strings.
default_start_date	string	no	2015-01-01T00:00:00.000Z	The date that is used as follows depending on load mode: In incremental load mode – as the start date at the first download In full load mode – as the date from which the data is downloaded
full	Boolean	no	false	Specifies load mode used for loading data to the database If not set or set to 'false', incremental load is used. If set to 'true', full load is used. The data is always downloaded starting from the date specified in the 'default_start_date' parameter. The 'timestamp' parameter in the 'entities' section is used for partitioning data downloaded via the Salesforce bulk API.
step_size	integer	no	n/a	The maximum number of days for which the data can be downloaded The next load will continue from where the last one ended.

Specify additional load options for the entities

For each entity, you can set up additional load options.

"entities": {
  "table_1_name": {
    "global": {
      "custom": {
        "hub": ["column_1_name","column_2_name","column_3_name"],
        "timestamp": "field_name",
        "ignored_fields": ["field_1_name","field_2_name"], // optional
        "deleted_records": true|false, // optional
        "full": true|false, // optional
        "fields_to_override": [     // optional
          {
            "name": "field_1_name",
            "type": "field_1_type"
          },
          {
            "name": "field_2_name",
            "type": "field_2_type"
          }
        ],
        "fields_to_clean": ["field_1_name","field_2_name"], // optional
        "functions": {     // optional
          "function_name": [
            "field_1_name","field_2_name"
          ]
        }
      }
    }
  }
}

Name	Type	Mandatory?	Default	Description
ignored_fields	array	no	[ ] (empty array, as in: `"ignored_fields": []` )	The fields that should be ignored when downloading data from the API
deleted_records	Boolean	no	false	Specifies how to handle the deleted records If not set or set to 'false', the deleted records are not downloaded. If set to 'yes', the deleted records are downloaded.
full	Boolean	no	false	Specifies load mode used for loading data to the database If not set or set to 'false', incremental load is used. If set to 'true', full load is used.
fields_to_override	array	no	[ ] (empty array, as in: `"fields_to_override": []` )	Changes the type of the listed fields to the type that you defined Example: `"fields_to_override": [ { "name" : "revenue__c", "type" : "decimal-30-10" }, { "name" : "user_added_revenue__c", "type" : "decimal-30-10" } ]`
fields_to_clean	array	no	[ ] (empty array, as in: `"fields_to_clean": []` )	The fields containing scientific notation that have to be converted to decimal notation to avoid potential processing issues Example: `"fields_to_clean": ["fieldA","fieldB"]`
functions	JSON	no	{ } (empty hash, as in: `"functions": {}` )	Custom functions to be applied to the listed fields after the download completes Example: `"functions": { "convertCurrency": [ "balance__c","overdue_balance__c","annualrevenue" ] }`

Customize Salesforce-related settings

Depending on your business and technical requirements, you can set one or more optional parameters affecting Salesforce settings.

"sfdc": {
  "username": "salesforce_user_name",
  "token": "salesforce_token",
  "client_id": "salesforce_client_id",
  "host": "salesforce_endpoint", // optional
  "api_version": "version_number", // optional
  "bulk_query_options": {     // optional
    "option_1_name": "option_1_value",
    "option_2_name": "option_2_value"
  }
}

Name	Type	Mandatory?	Default	Description
host	string	no	n/a	The Salesforce endpoint that Salesforce Downloader should access Example: `test.salesforce.com`
api_version	string	no	53.0	The version of the Salesforce API
bulk_query_options	JSON	no	{ } (empty hash, as in: `"bulk_query_options" : {}` )	The options to be propagated to the query via the Salesforce bulk Query gem that Salesforce Downloader is using. For more information about the Salesforce bulk Query gem, see https://github.com/gooddata/salesforce_bulk_query. Example: `"bulk_query_options": { "single_batch": true, "time_limit": 3600, "check_interval": 300 }`

Add multiple instances of Salesforce Downloader

When you use only one instance of Salesforce Downloader, it reads its remote location parameters from the ‘sfdc’ section.

If you use multiple instances of Salesforce Downloader and each instance has a separate location to download data from, add several ‘sfdc’ sections and give each section a unique name (for example, ‘sfdc_users’ or ‘sfdc_sales’).

"sfdc_users": {
  "username": "salesforce_user_1_name",
  "token": "salesforce_token_1",
  "client_id": "salesforce_client_1_id"
},
"sfdc_sales": {
  "username": "salesforce_user_2_name",
  "token": "salesforce_token_2",
  "client_id": "salesforce_client_2_id"
}

For each instance of Salesforce Downloader in the ‘downloaders’ section, add the ‘settings’ parameter and set it to the name of the section that contains settings for this Salesforce Downloader instance.

"downloaders": {
  "sf_downloader_1_id": {
    "type": "sfdc",
    "settings": "sfdc_users",
    "entities": ["table_1_name","table_2_name"]
  },
  "sf_downloader_2_id": {
    "type": "sfdc",
    "settings": "sfdc_sales",
    "entities": ["table_1_name","table_2_name"]
  }
}

Customize integrators

The ‘integrators’ and ‘ads_storage’ sections define the behavior of integrators.

"integrators": {
  "ads_integrator_id": {
    "type": "ads_storage"
  }
},
"ads_storage": {
  "instance_id": "data_warehouse_instance_id",
  "username": "dw_email@address.com",
  "options": {}
}

If you want to add more parameters or change the defaults, see ADS Integrator.

Schedule Parameters

When scheduling Salesforce Downloader (see Phases of Building the Data Pipeline -> Production Implementation), provide the parameters from this section in the schedule.

General Parameters

Some parameters must be entered as secure parameters (see Configure Schedule Parameters).

Some parameters contain the names of the nesting parameters, which specifies where they belong in the configuration hierarchy. For example, a parameter entered as sfdc|options|password is resolved as the following:

  "sfdc": {
    "options": {
      "password": ""
    }
  }

Enter such parameters as is (see Schedule Example).

Name	Typ	Mandatory?	Secure?	Default	Description
ID	string	yes	no	n/a	The ID of the Salesforce Downloader instance being scheduled Must be the same as the 'downloaders -> sf_downloader_id' parameter in the configuration file (see Minimum Layout of the Configuration File) Example: sf_downloader_1
sfdc\|password	string	yes	yes	n/a	The password for the Salesforce account
sfdc\|client_secret	string	yes	yes	n/a	The client secret for the Salesforce connected app
ads_storage\|password	string	yes	yes	n/a	The password to access the ADS instance to upload the data to

Schedule Example

The following is an example of how you can specify schedule parameters:

Google Sheets Downloader

SQL Downloader