Delete Old Data while Loading New Data via CloudConnect

CloudConnect is a legacy tool and will be discontinued. To prepare your data, we recommend using the GoodData data pipeline as described in Data Preparation and Distribution.

When setting up your CloudConnect graph to upload data to a dataset, you can configure it so that some old data is deleted from the same or a different dataset. To do so, add a GD Dataset Deleter component to your graph.

While you can upload new data by any label of an attribute, deleting the old data can be done only by the primary attribute label.

The GD Dataset Deleter component works only when all the following conditions are met:

If either or both of these conditions are not met, data loading will fail.

You can also use GD Dataset Deleter on its own to delete some data from a dataset without uploading any new data.

GD Dataset Deleter

We assume that you have already learned what is described in:

Summary

GD Dataset Deleter deletes the old data in one transaction with a data upload that is performed using GD Dataset Writer. This helps keep data in your project consistent and avoids situations where new data is already in the dataset but the old data has not yet been removed.
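The idea of deleting and uploading in one transaction can be illustrated with a minimal sketch. It does not use CloudConnect or any GoodData API; it uses Python's built-in sqlite3 module as a stand-in, and the table name `sales`, its columns, and the helper `replace_rows` are hypothetical names chosen for illustration only.

```python
import sqlite3

def replace_rows(conn, old_keys, new_rows):
    """Delete old rows and insert new rows as one atomic unit (illustrative only)."""
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so readers never see new rows alongside rows that should be gone.
    with conn:
        conn.executemany("DELETE FROM sales WHERE id = ?", [(k,) for k in old_keys])
        conn.executemany("INSERT INTO sales (id, amount) VALUES (?, ?)", new_rows)

# Hypothetical in-memory table standing in for a dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
conn.commit()

# Remove record 1 and load record 3 in a single transaction.
replace_rows(conn, old_keys=[1], new_rows=[(3, 30.0)])
rows = conn.execute("SELECT id, amount FROM sales ORDER BY id").fetchall()
```

If the insert step fails (for example, because of a duplicate key), the delete is rolled back as well, which is the consistency property described above.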

Icon

Ports

Port type | Number | Required | Description | Metadata
Input | 0 | yes | For deleted data records | Any

This component has one input port and no output ports.

The following picture shows GD Dataset Deleter attributes.

When you select this component, you must specify a GoodData project and the dataset from which the data will be deleted. The component takes the current GoodData project by default (the project hash is stored in the GDC_PROJECT_ID parameter).

The following picture shows the dialog for choosing the target dataset.

The most important attribute of GD Dataset Deleter is Field mapping, which defines how the input metadata fields map to the GoodData dataset columns (attributes and facts).

When you define the mapping, you choose to delete data records either by their primary keys or fact table grain, or by their attributes or references. To set up the mapping, select an input field for each dataset field from the drop-downs. You can also set up mapping for referenced datasets and date dimensions: for a date field, specify the corresponding date dimension.

The following picture shows the mapping dialog when you choose to delete data records by their primary keys or fact table grain.

The following picture shows the mapping dialog when you choose to delete data records by their attributes or references.
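Conceptually, a field mapping pairs each dataset column used for deletion with an input field that supplies its value. The sketch below illustrates that pairing in plain Python. It is not how the component is implemented, and the names `FIELD_MAPPING`, `order_id`, `in_order_id`, and `build_delete_keys` are hypothetical.

```python
# Hypothetical mapping: dataset column -> input metadata field.
FIELD_MAPPING = {"order_id": "in_order_id"}

def build_delete_keys(records, mapping):
    """Translate input records into per-column values identifying rows to delete."""
    keys = []
    for record in records:
        # Every mapped input field must be present in the incoming record.
        missing = [field for field in mapping.values() if field not in record]
        if missing:
            raise KeyError(f"input record lacks mapped fields: {missing}")
        # Keep only the mapped fields, renamed to the dataset column names.
        keys.append({column: record[field] for column, field in mapping.items()})
    return keys

delete_keys = build_delete_keys(
    [{"in_order_id": 7, "note": "ignored, not mapped"}],
    FIELD_MAPPING,
)
```

Input fields that are not part of the mapping are ignored in this sketch; only the mapped columns determine which records are matched.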

GD Dataset Deleter Attributes

Attribute | Req | Description | Possible values
BASIC
GoodData project ID | yes | Specifies the GoodData project where the target dataset resides. The current project (the project's hash in the GDC_PROJECT_ID parameter) is used by default. |
Data set | yes | The target dataset from which the data will be deleted. |
Field mapping | yes | Mapping of the input fields to the dataset columns. |
ADVANCED
Empty input threshold | yes | |
Max. retry attempts | yes | The maximum number of retries that will be attempted if the previous attempts failed. | The default is 5.
Pause between retries | yes | The delay between individual retries (in seconds). | The default is 60.
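The interaction of Max. retry attempts and Pause between retries can be pictured with a generic retry loop. This is a sketch of the general pattern, not the component's actual code. It assumes that the first attempt does not count as a retry, so the defaults allow up to six attempts in total. The function `run_with_retries` and its parameters are hypothetical.

```python
import time

def run_with_retries(operation, max_retries=5, pause_seconds=60, sleep=time.sleep):
    """Run operation; after a failure, wait and retry up to max_retries times."""
    retries_done = 0
    while True:
        try:
            return operation()
        except Exception:
            # Give up and re-raise once the retry budget is used up.
            if retries_done >= max_retries:
                raise
            retries_done += 1
            sleep(pause_seconds)  # pause between individual retries

# Usage: an operation that fails twice, then succeeds.
attempts = []
def flaky_delete():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("temporary failure")
    return "ok"

pauses = []
result = run_with_retries(flaky_delete, sleep=pauses.append)  # record pauses instead of waiting
```

Passing `sleep` as a parameter keeps the example fast to run; with the real `time.sleep`, each retry would wait the configured number of seconds.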