Create a Logical Data Model from Your Cloud Object Storage Service
For workspace administrators only
This tutorial guides you through the process of creating a logical data model (LDM) in your workspace using files in your object storage service (for example, S3 or Azure Blob Storage). A newly created workspace does not have an LDM; therefore, you are going to create one from scratch.
You create an LDM in the LDM Modeler. To do so, perform the following steps:
- Create a Data Source.
- Download the files from the object storage service.
- Add datasets.
- Publish the LDM.
You can also create an LDM:
- Manually (see Create a Logical Data Model Manually)
- From CSV files with data (see Create a Logical Data Model from CSV Files)
- From a cloud data warehouse (see Create a Logical Data Model from Your Cloud Data Warehouse)
- From the Output Stage if you use it (see Create a Logical Data Model from the Output Stage)
When you are working on your LDM:
- The changes are automatically saved as a draft as you are making them. The draft is saved under your GoodData user, on the machine and in the browser in which it was created, and you can continue editing it whenever you are ready. When the LDM Modeler saves your draft, it displays a message and the time the draft was last saved. The draft is kept until you either publish the changes to the workspace or manually discard the draft (in this case, the last published version of the LDM is loaded in the LDM Modeler).
- The LDM is validated as you are making changes. You are warned if there is an issue (for example, an empty dataset or a dataset with a missing title).
Create a Data Source
Create a Data Source for the object storage service that holds the files with data that you want to upload to your workspaces. For more information, see Create a Data Source.
During data load, the GoodData platform connects to the object storage service using the information from the Data Source, downloads the data from the files stored there, and uploads it to the workspaces according to how the datasets in the LDM are mapped to the files.
Download the Files from the Object Storage Service
Download the files from the object storage service to your computer.
While you can upload data from files of various formats, you need CSV files to create the LDM. Therefore, if your files are not in the CSV format, convert them to CSV files to be able to create the LDM. For data loading, use the files in their original format that are stored in your object storage service. For the data loading to be successful, make sure that the structure of the files in your object storage service matches the structure of the CSV files that you used to create the LDM.
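The structure requirement above can be sanity-checked with a short script before you import anything. This is a minimal sketch, not a GoodData tool; the function and file names are illustrative.

```python
# Sketch: verify that a source CSV's header matches the header of the CSV
# file used to build the LDM. File paths here are hypothetical examples.
import csv

def read_header(path):
    """Return the header row of a CSV file as a list of column names."""
    with open(path, newline="") as f:
        return next(csv.reader(f))

def headers_match(source_path, model_path):
    """True if both files declare the same columns in the same order."""
    return read_header(source_path) == read_header(model_path)
```

For example, `headers_match("orders_export.csv", "orders_ldm.csv")` returns `False` if a column was renamed or dropped in either file, which would make the corresponding data load fail.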
For the supported file formats for your object storage service, see the following articles:
Add Datasets
You are now going to create datasets by importing the CSV files to the LDM Modeler.
When a CSV file is being imported, the LDM Modeler tries to auto-detect the types of the data in the file. The data can be detected as one of the following:
- Fact, a numerical piece of data, which in a business environment is used to measure a business process (see Facts in Logical Data Models)
- Attribute, data that is to be used in grouping or segmenting the values resulting from the computed functions (see Attributes in Logical Data Models)
- Primary key, an attribute that serves as a unique identifier for a row of data in a file and as a connection point that allows you to connect this dataset to another dataset (see Connection Points in Logical Data Models)
- Reference, a connection point (foreign key) from another dataset (see Connection Points in Logical Data Models)
- Date, data representing dates. Dates are managed through a separate object, the Date dataset (see Dates in Logical Data Models). If you are importing a CSV file that contains dates, not one but two datasets will be added to your LDM: one Date dataset for the dates, and another with the rest of the information from the CSV file. These two datasets will be automatically connected by a relationship, and the Date dataset will become a reference in the other dataset.
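To make the detection idea concrete, here is a simplified sketch of the kind of heuristic the auto-detection might apply to a column's values. The actual logic is GoodData's own and is not documented here; this only mimics the classification into the types listed above, and the date formats are assumptions.

```python
# Illustrative only: classify a CSV column's sample values as one of the
# detected types. Real LDM Modeler detection is more sophisticated.
from datetime import datetime

def detect_type(values):
    """Return 'date', 'fact', or 'attribute' for a list of string values."""
    def is_number(v):
        try:
            float(v)
            return True
        except ValueError:
            return False

    def is_date(v):
        # Assumed example formats; the real detector supports more.
        for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
            try:
                datetime.strptime(v, fmt)
                return True
            except ValueError:
                pass
        return False

    if values and all(is_date(v) for v in values):
        return "date"       # handled via a separate Date dataset
    if values and all(is_number(v) for v in values):
        return "fact"       # numeric data measuring a business process
    return "attribute"      # data used for grouping or segmenting
```

Under this sketch, a column of `["2024-01-31", "2024-02-01"]` would be detected as a date, `["10", "3.5"]` as a fact, and `["East", "West"]` as an attribute.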
Steps:
1. On the top navigation bar, select Data. The LDM Modeler opens. You see a blank canvas area in view mode.
2. Click Edit. The LDM Modeler switches to edit mode.
3. To add a dataset, drag a CSV file and drop it in the blank canvas area. The data preview opens. The preview shows the data from the file and looks similar to the following:
4. Review the suggested column names and the data types, and update them if needed. Based on the column names and the contents of those columns in the file, the column names and the types of the data are suggested for the dataset that will be created from this file. For more information about the data types and how to set them correctly, see Create a Logical Data Model from CSV Files.
5. Once done, click Import. The file is imported, and the dataset is added to the LDM Modeler. Every column in this dataset is mapped to the appropriate column in the file. During data load, the data from a column in the file will be loaded to the corresponding fact or attribute in the dataset. For more information about the mapping, see Mapping between a Logical Data Model and the Data Source.
6. Repeat Steps 3-5 to add more datasets. For more information about how to add different types of datasets to the LDM, see Create a Logical Data Model from CSV Files.
7. Update the LDM as needed: add relationships between datasets, modify attributes or facts in the datasets, and so on. For more information, see Update a Logical Data Model.
8. If you add a dataset manually or add an attribute or a fact to a dataset, make sure to map the newly added elements to the corresponding source columns in the CSV files. For more information about the mapping, see Mapping between a Logical Data Model and the Data Source.
Your LDM is ready. You can now publish it.
Publish the LDM
To publish the LDM, follow the instructions from Publish a Logical Data Model.
Always keep the LDM synchronized with the source of the data. Whenever you change the source of the data, update the LDM accordingly.
For example, if you add a column to a source table, add a corresponding field (attribute or fact) to the dataset mapped to this table, and then map this field to the table column. Otherwise, you will not be able to load data from this column to your workspaces.
Similarly, if you delete a column from a source table or delete a whole table, delete the corresponding field from the mapped dataset or the mapped dataset itself.
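The synchronization rule above can be automated as a drift check: compare the columns currently present in a source table with the fields mapped in the corresponding dataset, and report anything out of sync. This is a minimal sketch under assumed inputs (plain lists of names), not a GoodData API.

```python
# Sketch: report which source columns were added or removed relative to
# the fields mapped in a dataset, so the LDM can be updated to stay in
# sync. All names here are illustrative.
def mapping_drift(source_columns, dataset_fields):
    """Return (added, removed) as sorted lists of column names."""
    src, mapped = set(source_columns), set(dataset_fields)
    added = sorted(src - mapped)    # new source columns needing LDM fields
    removed = sorted(mapped - src)  # mapped fields whose source column is gone
    return added, removed
```

For example, if a `region` column appears in the source table while the dataset still maps a deleted `channel` column, the check reports `region` as added and `channel` as removed, and both the dataset and its mapping should be updated accordingly.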
For more information about updating the LDM, see Update a Logical Data Model.