Users Brick

The users brick helps you manage users within your domain or workspace. You can do the following:

  • Add/remove users to/from a domain
  • Add/remove users to/from a workspace
  • Update a user's information
  • Update a user's role in a workspace

For information about how to use the brick, see How to Use a Brick.

Adding Users to a Domain and a Workspace

Adding a user consists of two consecutive steps:

  1. A user is added to a domain.
    The user can now access the GoodData platform but cannot access any workspace. In terms of access rights, a domain is not connected to any workspace, so the user cannot access a workspace until they are added or invited to it.
    A user can be added to the platform in either of the following ways:
    • The user can sign up themselves.
    • The domain admin adds the user.

      The domain admin can see all the users within the domain and manage them, as needed. The domain admin is the only person who can access the domain as an entity. For more information, see Your GoodData Domain.

  2. A user is added or invited to a workspace with a user role assigned (see User Roles).
    The user is now allowed to perform certain operations upon the workspace data according to their role. A user role is a set of permissions that a user is given within a particular workspace.
    A workspace admin can invite a user to a workspace by email, but only a domain admin can add the user directly.

Updating Users in a Domain and a Workspace

When you use the users brick to update the domain, the users that you provided in the source data are added to or updated in the domain:

  • The new users are added.
  • The existing users are updated.
  • If you did not provide an update for a specific user or a user's property, this user/user's property remains intact.

In other words, only the users/properties that you explicitly specified will be updated. If you did not mention a user/property in the input data, this user/property will not be touched.

Let's look at the example:

Notice that John's email was updated while his first name was not.
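
The domain update rule can be sketched in a few lines of Python. This is a conceptual illustration only, not the brick's implementation: the function name 'merge_domain_users', the dictionary layout, and the sample values (including both of John's email addresses) are assumptions made for the example.

```python
def merge_domain_users(domain, input_rows):
    """Return a copy of `domain` with `input_rows` merged in.

    Users are keyed by login. Only the properties present in an input row
    are overwritten; all other users and properties are left untouched.
    """
    result = {login: dict(props) for login, props in domain.items()}
    for row in input_rows:
        props = {k: v for k, v in row.items() if k != "login" and v is not None}
        result.setdefault(row["login"], {}).update(props)
    return result


# Hypothetical data: John already exists in the domain, Anna does not.
domain = {
    "john@example.com": {"first_name": "John", "email": "john@example.com"},
}
input_rows = [
    {"login": "john@example.com", "email": "john.new@example.com"},
    {"login": "anna@example.com", "first_name": "Anna"},
]

updated = merge_domain_users(domain, input_rows)
print(updated["john@example.com"])
# {'first_name': 'John', 'email': 'john.new@example.com'}
```

John's first name survives because the input row did not mention it, and Anna is added as a new user.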

When you use the users brick to update a workspace, the input file shows what the workspace will look like after the update (declarative mode).

  • Users that are in the workspace but not in the input file will be removed from the workspace.
  • Users that are in the input file but not in the workspace will be added to the workspace.
  • Existing users are updated according to the input data.

Let's look at the example:

The input file is on the left. The workspace with its current data is in the middle.

Notice the following:

  • The user 'todd@example.com' was added to the workspace because they were in the data coming from ETL and not in the workspace.
  • The user 'seth@example.com' was removed from the workspace because they were in the workspace but not in the data coming from ETL.
  • The role of the user 'jane@example.com' was updated to what was in the input file because this user existed in both the input data and the workspace but the role was different.
  • The user 'john@example.com' remained the same as they were identical in both the input data and the workspace.
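
The declarative comparison described above can be sketched as a diff between two login-to-role mappings. This is a hedged illustration, not the brick's code: 'plan_workspace_sync' is a hypothetical helper, and the roles assigned to each login are made up because the original example data is not reproduced here.

```python
def plan_workspace_sync(workspace, input_users):
    """Compare current workspace users with the input and return the changes.

    Both arguments map a login to a role identifier.
    """
    to_add = {u: r for u, r in input_users.items() if u not in workspace}
    to_remove = sorted(u for u in workspace if u not in input_users)
    to_update = {
        u: r for u, r in input_users.items()
        if u in workspace and workspace[u] != r
    }
    return to_add, to_remove, to_update


# Logins come from the example above; the roles are assumptions.
workspace = {
    "seth@example.com": "editorRole",
    "jane@example.com": "editorRole",
    "john@example.com": "adminRole",
}
input_users = {
    "todd@example.com": "editorRole",
    "jane@example.com": "adminRole",
    "john@example.com": "adminRole",
}

to_add, to_remove, to_update = plan_workspace_sync(workspace, input_users)
print(to_add)     # {'todd@example.com': 'editorRole'}
print(to_remove)  # ['seth@example.com']
print(to_update)  # {'jane@example.com': 'adminRole'}
```

John appears in none of the three results because his record is identical on both sides.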

Prerequisites

Before using the users brick, make sure that the following is true:

  • A domain is implemented at your site, and a domain admin exists.
  • A workspace and a workspace admin exist in your domain.

Input

The users brick expects to receive data about users.

The brick accepts a list of users with their properties, for example:

project_id | login | first_name | last_name | role
tspv1le9afb94q47pehiub568ubkkqqw | john.doe@example.com | John | Doe | adminRole
tspv1le9afb94q47pehiub568ubkkqqw | anna.doe@example.com | Anna | Doe | adminRole

To review the values that the 'role' column can contain (that is, user roles that users can have), see User Roles.

Use the role identifiers, not the role names.

  • Correct: adminRole
  • Incorrect: Administrator

You can define custom roles for your workspaces (see Create a Custom User Role).

The values in the 'login' column are case-sensitive and must be written in lowercase.

  • Correct: john.doe@example.com
  • Incorrect: John.Doe@example.com

Minimum Required Input Data

Depending on what synchronization mode you choose (see sync_mode), the users brick treats different categories of input data as mandatory.

For example, the 'add_to_organization' mode requires at least login information to be present in the input data. That is, the input data must contain a column named 'login' with user logins. All the other missing information will be auto-populated, such as:

  • A missing first name will be set to 'FirstName'.

  • A missing last name will be set to 'LastName'.

  • A missing email will be set to be equal to the login.

  • Passwords will be auto-generated.

The 'sync_project' mode, however, requires login and role information to be present in the input data. The 'sync_one_project_based_on_pid' mode needs workspace IDs in addition to login and role information.

To find the minimum required input data for each synchronization mode, see sync_mode.
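
As a rough pre-flight check, the rules in this section can be expressed in Python. This is a sketch under stated assumptions: 'REQUIRED_COLUMNS', 'missing_columns', and 'fill_defaults' are hypothetical names, only the three modes discussed in this section are covered, the default column names are assumed, and password generation is left out.

```python
# Required columns for the three modes discussed in this section.
REQUIRED_COLUMNS = {
    "add_to_organization": {"login"},
    "sync_project": {"login", "role"},
    "sync_one_project_based_on_pid": {"login", "role", "project_id"},
}


def missing_columns(sync_mode, header):
    """Return the required columns that are absent from the input header."""
    return sorted(REQUIRED_COLUMNS[sync_mode] - set(header))


def fill_defaults(row):
    """Mimic the documented defaults for properties missing from a row."""
    filled = dict(row)
    filled.setdefault("first_name", "FirstName")
    filled.setdefault("last_name", "LastName")
    filled.setdefault("email", filled["login"])
    return filled


print(missing_columns("sync_project", ["login", "first_name"]))  # ['role']
print(fill_defaults({"login": "john.doe@example.com"}))
```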

Mapping Your Column Names to Defaults

If your source file is in the required format but the column names are different from the default names, you can map your column names to the default ones.

For example, suppose that first names in your input data are stored in a column called 'abc'. Your input data would look similar to this:

project_id | login | abc | last_name | role
tspv1le9afb94q47pehiub568ubkkqqw | john.doe@example.com | John | Doe | adminRole
tspv1le9afb94q47pehiub568ubkkqqw | anna.doe@example.com | Anna | Doe | adminRole

Using the 'first_name_column' parameter, you can map your column name to the default name of the column, which is 'first_name':

"first_name_column": "abc"

You can map multiple column names, for example:

"first_name_column": "abc",
"authentication_modes_column": "auth_mode"


Here is the list of the parameters to use for mapping (for more information about all parameters, see Parameters):

Field | Default column name | Brick parameter for mapping the default name to your column name
User's first name | first_name | first_name_column
User's last name | last_name | last_name_column
User's login | login | login_column
User's email address | email | email_column
User's role | role | role_column
Workspace ID | project_id | multiple_projects_column
User's password | password | password_column
User's SSO provider | sso_provider | sso_provider_column
User's authentication mode | authentication_modes | authentication_modes_column
User's group | user_groups | user_groups_column
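
To see the effect of the mapping parameters, the following Python sketch renames custom columns to the default names before further processing. It only illustrates the idea and is not how the brick is implemented: 'DEFAULT_NAMES' and 'normalize_columns' are hypothetical names, and the default shown for 'multiple_projects_column' is the one used by the PID-based modes.

```python
import csv
import io

# Mapping parameter -> default column name (taken from the list above).
DEFAULT_NAMES = {
    "first_name_column": "first_name",
    "last_name_column": "last_name",
    "login_column": "login",
    "email_column": "email",
    "role_column": "role",
    "multiple_projects_column": "project_id",
    "password_column": "password",
    "sso_provider_column": "sso_provider",
    "authentication_modes_column": "authentication_modes",
    "user_groups_column": "user_groups",
}


def normalize_columns(rows, params):
    """Rename custom column names in `rows` to the default names."""
    rename = {
        custom: DEFAULT_NAMES[param]
        for param, custom in params.items()
        if param in DEFAULT_NAMES
    }
    return [{rename.get(k, k): v for k, v in row.items()} for row in rows]


SAMPLE = """project_id,login,abc,last_name,role
tspv1le9afb94q47pehiub568ubkkqqw,john.doe@example.com,John,Doe,adminRole
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
normalized = normalize_columns(rows, {"first_name_column": "abc"})
print(normalized[0]["first_name"])  # John
```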

Output

After the users brick has completed, you can expect the following results based on the parameters that you specified:

  • New users have been added to the domain.
  • New users have been added to the workspace.
  • The users not specified in the input data have been deleted from the workspace.
  • Existing users' data has been updated according to the input data.

Parameters

When scheduling the deployed brick (see How to Use a Brick and Schedule a Process on the Data Integration Console), add parameters to the schedule.

Name | Type | Mandatory? | Default | Description

organization | string | yes | n/a | The name of the domain where the brick is executed

input_source | JSON | yes | n/a | The source to take input data from. For more information on input data JSON structures, see Types of Input Data Sources.

You must encode this parameter using the 'gd_encoded_params' and 'gd_encoded_hidden_params' parameters (see Specifying Complex Parameters).

CLIENT_GDC_HOSTNAME | string | see 'Description' column | secure.gooddata.com | The white-labeled domain name in the format of your.domain.com (for example, analytics.mycompany.com)

The 'CLIENT_GDC_HOSTNAME' parameter is mandatory only if your domain is white-labeled and you have defined the 'GDC_USERNAME' and 'GDC_PASSWORD' parameters (see further in this table). Otherwise, the 'CLIENT_GDC_HOSTNAME' parameter is optional.

If you define the 'CLIENT_GDC_HOSTNAME' parameter, you must also define the 'CLIENT_GDC_PROTOCOL' parameter.

The parameter name is case-sensitive and must be written in uppercase.

CLIENT_GDC_PROTOCOL | string | see 'Description' column | https | The protocol to transfer data over

The 'CLIENT_GDC_PROTOCOL' parameter is mandatory only if your domain is white-labeled and you have defined the 'GDC_USERNAME' and 'GDC_PASSWORD' parameters (see further in this table). Otherwise, the 'CLIENT_GDC_PROTOCOL' parameter is optional.

If you define the 'CLIENT_GDC_PROTOCOL' parameter, you must also define the 'CLIENT_GDC_HOSTNAME' parameter.

The parameter name is case-sensitive and must be written in uppercase.

GDC_USERNAME | string | no | see 'Description' column | The user under whom you want to execute the brick (must be a domain admin)

If this parameter is not set up, the brick is by default executed:

  • When the brick is run automatically based on the schedule - under the user who deployed the brick (specified in 'Executes under' in the brick schedule)
  • When the brick is run on-demand - under the user who ran the brick on demand (see Running Schedules On-Demand)
  • When the brick is run via API - under the user who submitted the API call

If you define the 'GDC_USERNAME' parameter, you must also define the 'GDC_PASSWORD' parameter.

The parameter name is case-sensitive and must be written in uppercase.

GDC_PASSWORD | string | no | n/a | (Use only when the 'GDC_USERNAME' parameter is set up) The password for the user that you specified in the 'GDC_USERNAME' parameter.

When referencing this parameter in the brick schedule, enter it as a secure parameter (see Configuring Schedule Parameters).

The parameter name is case-sensitive and must be written in uppercase.

multiple_projects_column | string | no | see 'Description' column | The name of the column in the input data source containing the workspace IDs (PIDs) or client IDs (CIDs) (if the column with PIDs is named differently in your input file, set up mapping)

The default value depends on what synchronization mode (the 'sync_mode' parameter) you are using and can be:

  • 'project_id' for the following modes:
    • 'sync_multiple_projects_based_on_pid'
    • 'sync_one_project_based_on_pid'
  • 'client_id' for the following modes:
    • 'sync_one_project_based_on_custom_id'
    • 'sync_multiple_projects_based_on_custom_id'
    • 'sync_domain_client_workspaces'

sync_mode | string | no | n/a | See sync_mode.

first_name_column | string | no | first_name | The name of the column in the input data source containing users' first names (if the column with first names is named differently in your input file, set up mapping)

last_name_column | string | no | last_name | The name of the column in the input data source containing users' last names (if the column with last names is named differently in your input file, set up mapping)

login_column | string | no | login | The name of the column in the input data source containing users' logins in the form of email addresses (if the column with logins is named differently in your input file, set up mapping)

The values in the login column are case-sensitive and must be written in lowercase.
Once created, a login cannot be changed.

email_column | string | no | email | The name of the column in the input data source containing users' email addresses (if the column with emails is named differently in your input file, set up mapping)

If not provided, the email from the login is used instead.

The values in the email column are case-sensitive and must be written in lowercase.

role_column | string | no | role | The name of the column in the input data source containing users' roles in the workspace (if the column with roles is named differently in your input file, set up mapping)

password_column | string | no | password | The name of the column in the input data source containing users' passwords (if the column with passwords is named differently in your input file, set up mapping)

You do not have to provide passwords, and we do not recommend that you provide them. If a password is not provided, it is automatically generated when the user is created for the first time. The user can change it later, or the password can be set with SSO.

sso_provider_column | string | no | sso_provider | The name of the column in the input data source containing the SSO provider (if the column with the SSO provider is named differently in your input file, set up mapping)

authentication_modes_column | string | no | authentication_modes | The name of the column in the input data source containing the users' authentication mode (if the column with the authentication modes is named differently in your input file, set up mapping)

authentication_modes | string | no | n/a | See authentication_modes.

sso_provider | string | see 'Description' column | n/a | (Use only when the 'authentication_modes' parameter is set to 'sso') The name of the SSO authentication provider or a comma-separated list of multiple SSO providers (for example, "sso_provider_1, sso_provider_2")

The 'sso_provider' parameter overrides any user-specific SSO provider settings that are set in the 'sso_provider' column.

do_not_touch_users_that_are_not_mentioned | Boolean | no | false | Defines whether to exclude from processing the users that are not explicitly specified in the input data.

  • If not set or set to 'false', any users that are not explicitly specified in the input data are deleted from the workspace.
  • If set to 'true', the users that are not explicitly specified in the input data are not affected and remain in the workspace.

Use this parameter when you add users to workspaces incrementally to avoid deleting the users that already exist in the workspaces.

whitelists | array | no | n/a | The 'whitelists' and 'regexp_whitelists' parameters define users to exclude from the processing.

Typically, in your workspace you have users that are there for business reasons. However, sometimes you would also have technical users (users deploying the ETL processes), users from vendors, and so on.

When updating the workspace, these non-business users will be deleted from the workspace unless explicitly specified in the input data. To avoid this, you can white-list users or classes of users who should be excluded from the process of adding and deleting users.

Example:

"whitelists": ["etl_admin@gooddata.com", "etl_tester@gooddata.com"],
"regexp_whitelists": ["etl.*@gooddata\\.com", "admin[0-9]+.*@gooddata\\.com"]

You must encode this parameter using the 'gd_encoded_params' parameter (see Specifying Complex Parameters).

NOTE: Avoid using these parameters or use them as little as possible. If you decide to exclude some users, you have to always remember what users are excluded in what workspaces and act accordingly when you update users in these workspaces. Having too many users excluded from processing may cause data inconsistency in your workspaces.

regexp_whitelists | array | no | n/a | See the description of the 'whitelists' parameter.

ignore_failures | Boolean | no | false | Defines how the brick should behave in case of failures.

  • If not set or set to 'false', the brick will fail in case of any error in the input data (for example, a wrong role, a badly formatted email address, and so on).
  • If set to 'true', the brick will ignore any input data errors and will continue running.

We recommend that you set this parameter to 'false'. If you switch it to 'true', the data in your workspace may be inconsistent due to ignored errors.

REMOVE_USERS_FROM_PROJECT | Boolean | no | false | Defines whether to delete users from the workspace.

  • If not set or set to 'false', users are not impacted (not deleted from the workspace).
  • If set to 'true', users are deleted from the workspace.

data_product | string | no | n/a (attempts to default to the only available data product) | The data product that contains the segments that you want to release

If the specified data product does not exist, it is created.

SEGMENTS_FILTER | array | no | n/a | (Use only when the 'sync_mode' parameter is set to 'sync_domain_client_workspaces') The segments that you want to synchronize

We recommend that you use this parameter to prevent users in other segments from being deleted.

Example:

"SEGMENTS_FILTER": [ "BASIC", "PREMIUM"]

You must encode this parameter using the 'gd_encoded_params' parameter (see Specifying Complex Parameters).
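
The effect of the 'whitelists' and 'regexp_whitelists' parameters from the table above can be illustrated with a short Python sketch. It is not the brick's implementation: 'is_whitelisted' and 'logins_to_remove' are hypothetical helpers, and matching each regular expression against the whole login ('re.fullmatch') is an assumption, because the source does not say how the patterns are anchored.

```python
import re


def is_whitelisted(login, whitelists=(), regexp_whitelists=()):
    """Return True if the login should be excluded from processing."""
    if login in whitelists:
        return True
    # Assumption: a pattern has to match the entire login.
    return any(re.fullmatch(p, login) for p in regexp_whitelists)


def logins_to_remove(workspace_logins, input_logins, **whitelist_params):
    """Logins a declarative sync would remove, minus the protected ones."""
    return sorted(
        login
        for login in set(workspace_logins) - set(input_logins)
        if not is_whitelisted(login, **whitelist_params)
    )


workspace_logins = [
    "jane@example.com",
    "seth@example.com",
    "etl_admin@gooddata.com",
    "admin42.ops@gooddata.com",
]
input_logins = ["jane@example.com"]

print(logins_to_remove(
    workspace_logins,
    input_logins,
    whitelists=["etl_admin@gooddata.com"],
    regexp_whitelists=[r"etl.*@gooddata\.com", r"admin[0-9]+.*@gooddata\.com"],
))
# ['seth@example.com']
```

Without the two whitelist parameters, all three logins missing from the input would be removed.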

sync_mode

The 'sync_mode' parameter specifies the synchronization mode for the users. You can choose from the following synchronization modes:

  • add_to_organization: Synchronize only the domain.

    • Brick deployment: service workspace (see How to Use a Brick)

    • Minimum required input data: user logins (the 'login' column is filled in). Missing information will be auto-populated. For more information, see Minimum Required Input Data.
      The input CSV file is de-duplicated by login. A login may appear more than once because of the 'project_id' column, and de-duplication lets you use the same input CSV file for the other modes.

  • remove_from_organization: Remove any defined user login added by the 'add_to_organization' synchronization mode.
  • sync_project: Synchronize one workspace.

    • The users have to exist in the domain. If they do not, the brick will fail.

    • Brick deployment: synchronized workspace (see How to Use a Brick)

    • Minimum required input data: user logins and roles (the 'login' and 'role' columns are filled in). Missing information will be auto-populated. For more information, see Minimum Required Input Data.

  • sync_domain_and_project: Synchronize the domain and then the workspace.

    • Use this mode when you have only one workspace, and splitting the domain and workspace synchronization into two tasks (synchronizing the domain and synchronizing the workspace) is not efficient.

    • Brick deployment: synchronized workspace (see How to Use a Brick)

    • Minimum required input data: user logins and roles (the 'login' and 'role' columns are filled in). Missing information will be auto-populated. For more information, see Minimum Required Input Data.

  • sync_multiple_projects_based_on_pid: Synchronize multiple workspaces from the same input source using a single process.

    • Distributing users among workspaces is done based on workspace IDs (PIDs).

    • Use this mode when you have several workspaces, and synchronizing them one by one is time-consuming.

    • Brick deployment: service workspace (see How to Use a Brick)

    • Minimum required input data: user logins, user roles, and workspace IDs (the 'login', 'role', and 'project_id' columns are filled in). That is, the input data should define what user should go to what workspace. Based on workspace IDs, the input data is partitioned, and each partition is used to synchronize the appropriate workspace. Missing information will be auto-populated. For more information, see Minimum Required Input Data.


  • sync_multiple_projects_based_on_custom_id: This mode is similar to the 'sync_multiple_projects_based_on_pid' mode. The only difference is that the 'project_id' column in the input CSV file contains the client IDs instead of workspace IDs.
  • sync_domain_client_workspaces: Fully synchronize the whole domain or specified segments.

    • This mode is fully declarative: any users or user filters that exist in the client workspaces but do not exist in the input data will be deleted from the workspaces. In other words, what is in the input data will be in the client workspaces, and anything extra will be deleted.

    • To limit the synchronization scope to only specific segments, use the SEGMENTS_FILTER parameter (see Parameters). Any client workspaces outside the specified segments will not be touched.

    • To keep in the workspaces the users and user filters that are not explicitly specified in the input data, set the 'do_not_touch_users_that_are_not_mentioned' parameter to 'true' (see Parameters).

  • sync_one_project_based_on_pid: Synchronize one workspace from a single input source that may have input data for other workspaces, too. The brick will select the users for this particular workspace based on its ID (PID) and will ignore the rest of the data. To use this mode, you have to know the workspace ID.
    • Brick deployment: synchronized workspace (see How to Use a Brick)
    • Minimum required input data: user logins, user roles, and the workspace ID (the 'login', 'role', and 'project_id' columns are filled in). Missing information will be auto-populated. For more information, see Minimum Required Input Data.
  • sync_one_project_based_on_custom_id: Synchronize one workspace from a single input source that may have input data for other workspaces, too. The brick will select the users for this particular workspace based on its client ID (CID) and will ignore the rest of the data. Use this mode when you do not know the workspace ID (PID): instead of the PID, you use an internal ID (called 'custom workspace ID').
    Generate an internal ID for the workspace. When the workspace is created, this custom ID is stored in the workspace metadata. This way, the PID (that you do not know) is mapped to the custom ID (that you have generated). Using the custom ID, the brick can identify the workspace and obtain its PID.
    • Brick deployment: synchronized workspace (see How to Use a Brick)
    • Minimum required input data: user logins, user roles, and the client ID (the 'login' and 'role' columns are filled in; the 'client_id' column contains the custom IDs (internal ID that you generated) or client IDs (CIDs)). Missing information will be auto-populated. For more information, see Minimum Required Input Data.


      Notice that there are three groups of processes differentiated by color. The advantage is that these processes do not have to be synchronized and can run at their own pace.
      • Red: You load the data. At some point, the data is picked up and put into storage. This data contains the custom ID, which allows the data to be sorted without knowing which workspace it will end up in.
      • Yellow: At some point, the process responsible for maintaining workspaces and deploying them starts. The process identifies that a new workspace (Project 4) has to be spun up, so it spins it up. A part of this is deploying an ETL process and marking the deployed workspace with the custom ID.
      • Gray: At some point, the ETL starts and processes the data. If it runs, it means that the data for this workspace is already in the storage.
      Let's look at how the ETL will run:

      On the top, you can see datasets with data. There are two workspaces referenced there by custom IDs. All the other datasets use the custom IDs as a reference to the workspaces. Once the ETL starts, it accesses the data and processes it. One of the output objects will be a file that provides data about users in a particular workspace (bottom left).
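
How the PID-based modes split a shared input file can be sketched in Python as follows. This is a conceptual illustration rather than the brick's code: the three helper functions and the workspace IDs 'pid_a' and 'pid_b' are hypothetical, and keeping the first occurrence of a duplicated login is an assumption.

```python
from collections import defaultdict


def partition_by_workspace(rows, id_column="project_id"):
    """sync_multiple_projects_based_on_pid: one group of rows per workspace."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[id_column]].append(row)
    return dict(partitions)


def rows_for_workspace(rows, workspace_id, id_column="project_id"):
    """sync_one_project_based_on_pid: keep one workspace, ignore the rest."""
    return [row for row in rows if row[id_column] == workspace_id]


def unique_logins(rows):
    """add_to_organization: de-duplicate by login (first occurrence wins)."""
    seen = {}
    for row in rows:
        seen.setdefault(row["login"], row)
    return list(seen.values())


rows = [
    {"project_id": "pid_a", "login": "john.doe@example.com", "role": "adminRole"},
    {"project_id": "pid_b", "login": "john.doe@example.com", "role": "adminRole"},
    {"project_id": "pid_b", "login": "anna.doe@example.com", "role": "adminRole"},
]

partitions = partition_by_workspace(rows)
print(sorted(partitions))                      # ['pid_a', 'pid_b']
print(len(rows_for_workspace(rows, "pid_b")))  # 2
print(len(unique_logins(rows)))                # 2
```

The same three input rows serve all three modes, which is why the input file can be shared between them.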

authentication_modes

The 'authentication_modes' parameter specifies how users can access the GoodData platform. You can choose from the following authentication modes:

  • password: Users access the platform using their credentials.
  • sso: Users access the platform via SSO.

You can set up the authentication in the following ways:

  • Globally for all synchronized users: All users receive the same setting (password or SSO). This way, you do not have to specify authentication mode for each user and just set it globally for everybody in your brick parameters. When the 'authentication_modes' parameter is set, any user-specific authentication mode settings that are set in the 'authentication_modes' column will be ignored.
    You can specify one or several values. If you set 'authentication_modes' to 'sso', you must provide the value for the 'sso_provider' parameter (see Parameters). The 'sso_provider' parameter overrides any user-specific SSO provider settings that are set in the 'sso_provider' column.

    "authentication_modes": "password"
    "authentication_modes": ["password", "sso"]

    Parameters with a complex structure must be encoded with a special parameter called 'gd_encoded_params'. For more information, see Specifying Complex Parameters.

  • Per-user setup driven by data: Each user has their own specific authentication mode. Your input data would look similar to this:

    login | first_name | last_name | authentication_modes
    anna.doe@example.com | Anna | Doe | password
    john.doe@example.com | John | Doe | "password, sso"

    If you want to specify several values for the authentication mode in your input data, put these values inside quotation marks and separate them with commas.

    When you set the authentication mode per user, do not specify the 'authentication_modes' parameter in your scheduled process. The brick will look into the schedule parameters first, will not find a globally set authentication mode, and will proceed to look for it in the input data.
    If you do set the 'authentication_modes' parameter, the process will take it as the first choice and will ignore any user-specific authentication mode settings.
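
The precedence between the global parameter and the per-user column can be summarized in a small Python sketch. It is an illustration under assumptions, not the brick's logic: 'resolve_authentication_modes' is a hypothetical helper, and the parsing of comma-separated values is inferred from the input example above.

```python
def resolve_authentication_modes(params, row):
    """Return the list of authentication modes that apply to one user.

    A globally set 'authentication_modes' parameter wins; the value from
    the input row is used only when the parameter is not set.
    """
    value = params.get("authentication_modes")
    if value is None:
        value = row.get("authentication_modes", "")
    if isinstance(value, str):
        value = value.split(",")
    return [mode.strip() for mode in value if mode.strip()]


john = {"login": "john.doe@example.com", "authentication_modes": "password, sso"}

print(resolve_authentication_modes({}, john))
# ['password', 'sso']
print(resolve_authentication_modes({"authentication_modes": "sso"}, john))
# ['sso']
```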

Example: Brick Configuration

The following is an example of configuring the brick parameters in the JSON format. Enter the 'GDC_PASSWORD' value as a secure parameter.

{
  "organization": "myCustomDomain",
  "CLIENT_GDC_HOSTNAME": "analytics.myCustomDomain.com",
  "CLIENT_GDC_PROTOCOL": "https",
  "GDC_USERNAME": "domain_admin@myCustomDomain.com",
  "GDC_PASSWORD": "secret0",
  "sync_mode": "sync_project",
  "do_not_touch_users_that_are_not_mentioned": true,
  "authentication_modes": "sso",
  "sso_provider": "ssoSAMLProvider",
  "gd_encoded_params": {
    "ads_client": {
      "username": "john.dow@myCustomDomain.com",
      "jdbc_url": "jdbc:gdc:datawarehouse://analytics.myCustomDomain.com/gdc/datawarehouse/instances/123456abcdef7890"
    },
    "input_source": {
      "type": "ads",
      "query": "SELECT * FROM domain_users"
    },
    "whitelists": ["etl_admin@myCustomDomain.com", "etl_tester@myCustomDomain.com"]
  },
  "gd_encoded_hidden_params": {
    "ads_client": {
      "password": "secret"
    }
  }
}