Configure a Brick

Configuring the bricks is part of the Production Implementation phase of the data pipeline building process (see Phases of Building the Data Pipeline).

Each brick has its own set of parameters. Some parameters are configured in the configuration file, and the rest are configured in a brick's schedule. For the parameters for each brick type, see Downloaders, Integrators, and Executors. For each brick, the appropriate section provides parameters for the configuration file and for the brick's schedule separately.

Configuration File

The configuration file is a JSON file that specifies the input parameters for a brick. The parameters vary depending on the brick type (remote location, access credentials, source data structure, load mode, and so on). For example, if you use CSV Downloader, the configuration file describes the file location, the file structure, the properties of a manifest file, and so on.

Do not put any sensitive information (passwords, secret keys, and so on) in the configuration file.

Provide any sensitive information only as secure parameters in a brick's schedule. Secure parameter values are encrypted and do not appear in clear-text form in any GUI or log entries.
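As a rough illustration of the rule above, the following Python sketch scans a configuration for key names that look like secrets before the file is saved. It is not part of any brick or tool: the function name, the list of suspicious name fragments, and the `secret_key` entry in the sample are all hypothetical and only serve the example.

```python
import json

# Hypothetical list of key-name fragments that suggest a secret; adjust it to your bricks.
SUSPICIOUS_FRAGMENTS = ("password", "secret", "private_key", "passphrase")


def find_suspicious_keys(node, path=""):
    """Return the paths of all keys whose names contain a suspicious fragment."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            if any(fragment in key.lower() for fragment in SUSPICIOUS_FRAGMENTS):
                hits.append(child_path)
            # Keep descending: secrets may be nested under "options" or similar.
            hits.extend(find_suspicious_keys(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_suspicious_keys(item, f"{path}[{index}]"))
    return hits


# "secret_key" is a made-up entry that should NOT be in a real configuration file.
sample = json.loads("""
{
  "csv": {
    "type": "s3",
    "options": {"bucket": "acme_s3_bucket", "secret_key": "do-not-store-me"}
  }
}
""")
suspicious = find_suspicious_keys(sample)
print(suspicious)
```

The check looks only at key names, so a value such as "auth_mode": "password" is not reported. Any key it does report is a candidate for moving to a secure parameter in the brick's schedule.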

A single configuration file covers the configurations of all your downloaders and ADS Integrator instances. You do not have to create a separate configuration file for each downloader or ADS Integrator. Once you have created the configurations for all your bricks, combine them in one file and name it configuration.json.
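If you draft the configuration of each brick separately, combining them can be scripted. The sketch below is one possible approach, not a provided utility: `combine_configurations` is a hypothetical helper, and the two fragments are shortened versions of the sections used in the examples on this page. It merges shared sections such as "entities" key by key and refuses to overwrite anything silently.

```python
import json


def combine_configurations(fragments):
    """Merge per-brick fragments into one dict, merging shared sections key by key."""
    combined = {}
    for fragment in fragments:
        for section, value in fragment.items():
            existing = combined.get(section)
            if isinstance(existing, dict) and isinstance(value, dict):
                # Sections such as "entities" or "downloaders" collect entries from every fragment.
                overlap = existing.keys() & value.keys()
                if overlap:
                    raise ValueError(f"duplicate keys in '{section}': {sorted(overlap)}")
                existing.update(value)
            elif section in combined:
                raise ValueError(f"section '{section}' is defined twice")
            else:
                # Copy dict sections so the input fragments are not modified later.
                combined[section] = dict(value) if isinstance(value, dict) else value
    return combined


csv_fragment = {
    "entities": {"Orders": {"global": {"custom": {"hub": ["id"]}}}},
    "downloaders": {"csv_downloader_prod": {"type": "csv", "entities": ["Orders"]}},
    "csv": {"type": "s3", "options": {"bucket": "acme_s3_bucket"}},
}
ads_fragment = {
    "integrators": {"ads_integrator": {"type": "ads_storage", "batches": ["csv_downloader_prod"]}},
    "ads_storage": {"instance_id": "DW1234567890", "username": "dwh-john.doe@acme.com", "options": {}},
}

combined = combine_configurations([csv_fragment, ads_fragment])
# Save this text as configuration.json.
text = json.dumps(combined, indent=2)
print(text)
```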

Example: One instance of one downloader

This example uses CSV Downloader. One instance of CSV Downloader is processed by one instance of ADS Integrator.

{
  "entities": {
    "Orders": {"global":{"custom":{"hub":["id","id_customers","id_sales"]}}},
    "Customers": {"global":{"custom":{"hub":["id"]}}},
    "Sales": {"global":{"custom":{"hub":["id","id_merchant"]}}},
    "Merchants": {"global":{"custom":{"hub":["id"]}}}
  },
  "downloaders": {
    "csv_downloader_prod": {
      "type": "csv",
      "entities": ["Orders","Customers","Sales","Merchants"]
    }
  },
  "csv": {
    "type": "s3",
    "options": {
      "bucket": "acme_s3_bucket",
      "access_key": "ABCD1234",
      "folder": "acme/manifest",
      "generate_manifests": true,
      "data_structure_info": "acme/feed/feed.txt",
      "data_location": "acme/input_data",
      "move_data_after_processing_to_path": "acme/input_data/processed",
      "files_structure": {
      }
    }
  },
  "integrators": {
    "ads_integrator": {
      "type": "ads_storage",
      "batches": ["csv_downloader_prod"]
    }
  },
  "ads_storage": {
    "instance_id": "DW1234567890",
    "username": "dwh-john.doe@acme.com",
    "options": {}
  }
}

Example: Two instances of the same downloader

This example uses CSV Downloader. Both instances of CSV Downloader are processed by one instance of ADS Integrator.

{
  "entities": {
    "Orders": {"global":{"custom":{"hub":["id","id_customers","id_sales"]}}},
    "Customers": {"global":{"custom":{"hub":["id"]}}},
    "Sales": {"global":{"custom":{"hub":["id","id_merchant"]}}},
    "Merchants": {"global":{"custom":{"hub":["id"]}}}
  },
  "downloaders": {
    "csv_downloader_customers": {
      "type": "csv",
      "settings": "csv_customers",
      "entities": ["Orders","Customers"]
    },
    "csv_downloader_merchants": {
      "type": "csv",
      "settings": "csv_merchants",
      "entities": ["Sales","Merchants"]
    }
  },
  "csv_customers": {
    "type": "s3",
    "options": {
      "bucket": "acme_s3_bucket",
      "access_key": "ABCD1234",
      "folder": "acme/manifest",
      "generate_manifests": true,
      "data_structure_info": "acme/feed/feed.txt",
      "data_location": "acme/input_data",
      "move_data_after_processing_to_path": "acme/input_data/processed",
      "files_structure": {
      }
    }
  },
  "csv_merchants": {
    "type": "sftp",
    "options": {
      "username": "john.doe@acme.com",
      "host": "ftp.acme.com",
      "auth_mode": "password",
      "files_structure": {
        "skip_rows": 2,
        "column_separator": ";"
      }
    }
  },
  "integrators": {
    "ads_integrator": {
      "type": "ads_storage",
      "batches": ["csv_downloader_customers","csv_downloader_merchants"]
    }
  },
  "ads_storage": {
    "instance_id": "DW1234567890",
    "username": "dwh-john.doe@acme.com",
    "options": {}
  }
}
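In the example above, each downloader instance names its own settings section through "settings", while the single-instance example has no "settings" key and uses a section named after the downloader type. The Python sketch below illustrates that lookup. The fallback rule is inferred from the examples on this page rather than taken from a specification, and `resolve_settings` is a hypothetical helper.

```python
def settings_section_name(downloader):
    # Assumption inferred from the examples: "settings" names the section explicitly;
    # without it, the section shares its name with the downloader "type".
    return downloader.get("settings", downloader["type"])


def resolve_settings(config):
    """Map each downloader name to the top-level section that holds its settings."""
    return {
        name: config[settings_section_name(downloader)]
        for name, downloader in config["downloaders"].items()
    }


# Shortened version of the configuration with two CSV Downloader instances.
config = {
    "downloaders": {
        "csv_downloader_customers": {"type": "csv", "settings": "csv_customers"},
        "csv_downloader_merchants": {"type": "csv", "settings": "csv_merchants"},
    },
    "csv_customers": {"type": "s3", "options": {"bucket": "acme_s3_bucket"}},
    "csv_merchants": {"type": "sftp", "options": {"host": "ftp.acme.com"}},
}

resolved = resolve_settings(config)
for name, section in resolved.items():
    print(name, "->", section["type"])
```

A missing section raises a KeyError here, which is a quick way to catch a "settings" value that does not match any top-level key.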

Example: Two instances of the same downloader and one instance of a different downloader

This example uses CSV Downloader and Salesforce Downloader. Both instances of CSV Downloader are processed by one instance of ADS Integrator. One instance of Salesforce Downloader is processed by another instance of ADS Integrator.

{
  "entities": {
    "Orders": {"global":{"custom":{"hub":["id","id_customers","id_sales"]}}},
    "Customers": {"global":{"custom":{"hub":["id"]}}},
    "Sales": {"global":{"custom":{"hub":["id","id_merchant"]}}},
    "Merchants": {"global":{"custom":{"hub":["id"]}}},
    "Opportunity": {"global":{"custom":{"hub":["id"],"timestamp":"SysDate"}}},
    "OpportunityHistory": {"global":{"custom":{"hub":["id"],"timestamp":"CreatedDate"}}} 
  },
  "downloaders": {
    "csv_downloader_customers": {
      "type": "csv",
      "settings": "csv_customers",
      "entities": ["Orders","Customers"]
    },
    "csv_downloader_merchants": {
      "type": "csv",
      "settings": "csv_merchants",
      "entities": ["Sales","Merchants"]
    },
    "sf_downloader": {
      "type": "sfdc",
      "entities": ["Opportunity","OpportunityHistory"]
    }
  },
  "csv_customers": {
    "type": "s3",
    "options": {
      "bucket": "acme_s3_bucket",
      "access_key": "ABCD1234",
      "folder": "acme/manifest",
      "generate_manifests": true,
      "data_structure_info": "acme/feed/feed.txt",
      "data_location": "acme/input_data",
      "move_data_after_processing_to_path": "acme/input_data/processed",
      "files_structure": {
      }
    }
  },
  "csv_merchants": {
    "type": "sftp",
    "options": {
      "username": "john.doe@acme.com",
      "host": "ftp.acme.com",
      "auth_mode": "password",
      "files_structure": {
        "skip_rows": 2,
        "column_separator": ";"
      }
    }
  },
  "sfdc": {
    "username": "sf-john.doe@acme.com",
    "token": "woGxtUDnCXFlsEHXwGnqtAMZ",
    "client_id": "0123A000001Jqlh",
    "client_logger": true,
    "step_size": 14
  },
  "integrators": {
    "ads_integrator_csv": {
      "type": "ads_storage",
      "batches": ["csv_downloader_customers","csv_downloader_merchants"]
    },
    "ads_integrator_sfdc": {
      "type": "ads_storage",
      "entities": ["Opportunity","OpportunityHistory"]
    }
  },
  "ads_storage": {
    "instance_id": "DW1234567890",
    "username": "dwh-john.doe@acme.com",
    "options": {}
  }
}
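As a configuration grows, names can drift apart: an integrator may list a downloader under "batches" that no longer exists, or a brick may list an entity that is missing from "entities". The following Python sketch checks those cross-references. The rules it applies are read off the examples on this page and are assumptions, not a documented schema, and `check_references` is a hypothetical helper.

```python
import copy


def check_references(config):
    """Return a list of problems; an empty list means no dangling references were found."""
    problems = []
    entities = config.get("entities", {})
    downloaders = config.get("downloaders", {})

    for name, downloader in downloaders.items():
        for entity in downloader.get("entities", []):
            if entity not in entities:
                problems.append(f"downloader '{name}': unknown entity '{entity}'")

    for name, integrator in config.get("integrators", {}).items():
        # "batches" refers to downloader names, "entities" to entity names.
        for batch in integrator.get("batches", []):
            if batch not in downloaders:
                problems.append(f"integrator '{name}': unknown downloader '{batch}'")
        for entity in integrator.get("entities", []):
            if entity not in entities:
                problems.append(f"integrator '{name}': unknown entity '{entity}'")
    return problems


# Shortened version of the configuration with CSV and Salesforce downloaders.
config = {
    "entities": {"Orders": {}, "Customers": {}, "Opportunity": {}},
    "downloaders": {
        "csv_downloader_customers": {"type": "csv", "entities": ["Orders", "Customers"]},
        "sf_downloader": {"type": "sfdc", "entities": ["Opportunity"]},
    },
    "integrators": {
        "ads_integrator_csv": {"type": "ads_storage", "batches": ["csv_downloader_customers"]},
        "ads_integrator_sfdc": {"type": "ads_storage", "entities": ["Opportunity"]},
    },
}
ok_problems = check_references(config)

# Introduce a typo in a batch name to show what a report looks like.
broken = copy.deepcopy(config)
broken["integrators"]["ads_integrator_csv"]["batches"] = ["csv_downloader_typo"]
broken_problems = check_references(broken)
print(ok_problems)
print(broken_problems)
```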

Brick Schedule

Bricks are deployed and scheduled via the Data Integration Console in the same manner as any other data load process (see Deploy a Data Loading Process for a Data Pipeline Brick). When scheduling a brick, you can add the brick's parameters to its schedule, including secure parameters for storing sensitive information (see Schedule a Data Load).

For the schedule parameters for each brick type, see Downloaders, Integrators, and Executors. For each brick, the appropriate section lists the parameters for the brick's schedule separately from those for the configuration file.
