This topic is part of the tutorial that is intended for developers who plan to work on data models and ETL using CloudConnect.
If you are interested in creating reports, dashboards, or metrics, start here.
Before diving into the tutorial, let’s start with a brief look at how the various components of GoodData’s platform work together.
GoodData’s cloud-based, multi-tenant platform can deliver reports on data from various sources, while allowing users to carry out ad-hoc analysis to answer pressing business questions. The sections below provide a brief conceptual overview of the fundamental elements of the platform.
The GoodData Project
In GoodData, the basic unit of development is the GoodData project, which consists of user-defined metrics, reports, and dashboards, as well as the underlying technical objects that gather and store the project data. For those familiar with standard business intelligence architecture, a GoodData project corresponds to the concept of a data mart. The diagram below illustrates the GoodData project hierarchy:
The foundation of the project is the data that is uploaded to it. Raw data is stored as collections of facts and attributes:
- A fact is a quantitative value, or numerical record, that can be used as an input for the metrics that can be defined in GoodData.
- An attribute is a qualitative text or numerical value that can be used to slice or segment your reports. For example, Browser Version or US Region values are stored in attributes, which serve as categories for segmenting report values.
In GoodData, the numerical values that appear in reports (i.e. charts and tables) are metrics. Metrics “summarize” individual numerical records by aggregating individual fact values with functions such as SUM (sum), MAX (maximum value), MIN (minimum value), and AVG (mean).
In some cases, attribute values can also be aggregated into metrics that may be displayed in a report. (For example, the COUNT aggregation function counts the number of unique values that an attribute, such as Region, has.)
More info: Key Terminology
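To make the distinction between facts, attributes, and metrics concrete, here is a minimal Python sketch. It does not use GoodData or its metric language; the records and field names (`region`, `amount`) are invented for illustration. It only shows how aggregation functions like those named above turn individual fact values into single metric values.

```python
from statistics import mean

# Hypothetical raw records: "amount" is a fact, "region" is an attribute.
records = [
    {"region": "West", "amount": 120.0},
    {"region": "East", "amount": 80.0},
    {"region": "West", "amount": 200.0},
    {"region": "South", "amount": 50.0},
]

# Collect the individual fact values that the metrics will aggregate.
amounts = [r["amount"] for r in records]

metrics = {
    "SUM": sum(amounts),
    "MAX": max(amounts),
    "MIN": min(amounts),
    "AVG": mean(amounts),
    # COUNT over an attribute counts its unique values.
    "COUNT(region)": len({r["region"] for r in records}),
}

for name, value in metrics.items():
    print(name, value)
```

Each metric collapses many records into one number, which is why a report can show a single value even when the underlying data holds thousands of rows.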
Once one or more metrics have been added to a report, they can be sliced (that is, broken down and categorized) by attributes such as Opportunity Stage.
Filters may also be used to limit the data that goes into metric computations for a given report, e.g. Year is 2012 and Quarter is Q2. (Note: In the case of dashboard filters, users can limit the data that goes into metric computations across an entire dashboard.)
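Slicing and filtering can be sketched in plain Python as follows. This is an illustration only, not how the platform computes reports; the records, the field names, and the helper `sum_by` are all hypothetical. Slicing groups a metric by the values of an attribute, while a filter restricts which records reach the computation.

```python
from collections import defaultdict

# Hypothetical records: "amount" is a fact; the other fields are attributes.
records = [
    {"stage": "Discovery", "year": 2012, "quarter": "Q2", "amount": 100.0},
    {"stage": "Closed Won", "year": 2012, "quarter": "Q2", "amount": 250.0},
    {"stage": "Discovery", "year": 2012, "quarter": "Q1", "amount": 40.0},
    {"stage": "Closed Won", "year": 2011, "quarter": "Q2", "amount": 75.0},
    {"stage": "Discovery", "year": 2012, "quarter": "Q2", "amount": 60.0},
]


def sum_by(rows, attribute, fact):
    """Slice a SUM metric over `fact` by the values of `attribute`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[attribute]] += row[fact]
    return dict(totals)


# Filter: Year is 2012 and Quarter is Q2.
filtered = [r for r in records if r["year"] == 2012 and r["quarter"] == "Q2"]

unfiltered_report = sum_by(records, "stage", "amount")
filtered_report = sum_by(filtered, "stage", "amount")

print(unfiltered_report)
print(filtered_report)
```

The same metric definition yields different numbers in the two reports because the filter changes the set of fact values that are aggregated, not the aggregation itself.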
GoodData supports a variety of report types and many configuration options for each type.
More info: Chart Types
Groups of reports may be organized on dashboard tabs, each of which is typically designed to satisfy a specific use case. A collection of dashboard tabs is a dashboard, which often addresses an individual business process.
- A project may contain one or more dashboards, each of which may contain one or more tabs. Each tab may contain multiple reports or other objects such as widgets, text, dashboard filters, or embedded web content.
- For security reasons, reporting objects in one project cannot be shared with another project. However, you can export and import reporting objects using the GoodData APIs.
- Once users are invited to a project, they can collaborate over the project’s contents with other users. It is also possible to restrict certain users’ access to certain areas of a project.
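The containment relationships described above (a project holds dashboards, a dashboard holds tabs, a tab holds reports) can be pictured as nested data structures. The classes below are a hypothetical sketch for orientation only; they are not GoodData objects or API types, and the example titles are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Report:
    title: str
    metrics: List[str]
    sliced_by: List[str] = field(default_factory=list)


@dataclass
class Tab:
    title: str
    reports: List[Report] = field(default_factory=list)


@dataclass
class Dashboard:
    title: str
    tabs: List[Tab] = field(default_factory=list)


@dataclass
class Project:
    title: str
    dashboards: List[Dashboard] = field(default_factory=list)

    def report_titles(self) -> List[str]:
        """Walk the hierarchy and list every report in the project."""
        return [
            report.title
            for dashboard in self.dashboards
            for tab in dashboard.tabs
            for report in tab.reports
        ]


# Hypothetical example content.
pipeline = Report("Pipeline by Stage", ["Amount (SUM)"], ["Opportunity Stage"])
wins = Report("Won Deals", ["Amount (SUM)"], ["Region"])
project = Project(
    "Sales Analytics",
    [Dashboard("Sales", [Tab("Overview", [pipeline]), Tab("Results", [wins])])],
)

print(project.report_titles())
```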
Creating a Project
Creating a basic GoodData project involves data modeling, data loading, project administration, and dashboard creation.
Data modeling and data loading are done using GoodData’s data loading service, CloudConnect Designer. CloudConnect Designer is a desktop application in which you can build graphical representations of your data models and ETL graphs and then publish them to your GoodData projects.
- A data model defines the relationships between facts and attributes in your project. When data is loaded into the project or queried by the users of the project, the data model is used for storing and retrieving data from the underlying database.
- An ETL graph is a graphical representation of an ETL process, which extracts data from an enterprise data source, transforms it for use, and loads it into the designated GoodData project. These processes are built locally in CloudConnect and then published into the GoodData platform, where they can be scheduled and managed.
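The three stages of an ETL process can be sketched in a few lines of Python. In CloudConnect these stages are built as a graph in the graphical interface rather than written as code, so everything here is an assumption made for illustration: the inline CSV source, the function names, and the in-memory list standing in for the project's data store.

```python
import csv
import io

# Hypothetical source data; a real process would read from an enterprise source.
SOURCE_CSV = """region,amount
West,120.50
east,80
 West ,200
,15
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize attribute values, convert facts to numbers,
    and drop rows that have no region."""
    cleaned = []
    for row in rows:
        region = row["region"].strip().title()
        if not region:
            continue
        cleaned.append({"region": region, "amount": float(row["amount"])})
    return cleaned


def load(rows, target):
    """Load: append the cleaned rows to the target store."""
    target.extend(rows)
    return len(rows)


dataset = []  # stands in for the project's data store
loaded = load(transform(extract(SOURCE_CSV)), dataset)
print(loaded, dataset)
```

Keeping the stages separate mirrors how an ETL graph is composed of distinct steps, so that a change in the source format or in a cleaning rule touches only one stage.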
In GoodData, developers need only build the logical data model through CloudConnect’s graphical interface. The GoodData platform automatically builds the underlying physical data model, which defines the tables in the database to store the data.
So, to build a basic project, you must create each of the following:
- data model
- ETL process(es)
Developers may need to create or interact with other components of the GoodData platform, such as the GoodData APIs. These components are not described in this basic tutorial.
Supplying Data to a Project
In the diagram below, you can see how data is supplied to the GoodData platform and the objects that must be created in the CloudConnect platform in order to integrate data into your projects.
In the lower half of the diagram, you can see the objects that are created in a CloudConnect project, including ETL graphs as well as a logical data model. When published, these items are copied to the GoodData platform and associated with a designated GoodData project. After publication, you can schedule automated ETL processes to refresh your project’s data on a regular basis. This will be explored later in this tutorial.
It’s time to start building. At any time in the tutorial ahead you can refer to Key Terminology for support.
Read next: Setup