Use RFC 4180 Compliant CSV Files for Upload
CloudConnect is a legacy tool and will be discontinued. To prepare your data, we recommend using the GoodData data pipeline as described in Data Preparation and Distribution. For data modeling, see Data Modeling in GoodData to learn how to work with the Logical Data Modeler.
By default, the COPY command expects simple delimited data in which the delimiter character does not appear inside individual data fields.
RFC 4180 describes a CSV encoding in which data fields can contain the delimiter, double quotes, or even newline characters. Such fields are enclosed in double quotes, and a double quote inside a field is escaped by doubling it.
The following example is a valid CSV file with a header line and a single data record:
product_id,product_name,product_description,product_price
12345,"1"" by 5 Yards Duct Tape","Great choice for your creative projects
Super performance strength
Available in white, red, green and black",9.95
This CSV file looks like the following in a spreadsheet application:
product_id | product_name | product_description | product_price |
---|---|---|---|
12345 | 1" by 5 Yards Duct Tape | Great choice for your creative projects Super performance strength Available in white, red, green and black | 9.95 |
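To check how an RFC 4180 aware parser reads the sample file, the following minimal sketch uses Python's standard `csv` module. It only illustrates the quoting rules and is not how GdcCsvParser is implemented. The function name `parse_rfc4180` is hypothetical, and the sample uses plain `\n` line breaks for brevity.

```python
import csv
import io

# The sample file from this article: one header line and one data record.
# The record spans three physical lines because of the quoted newlines.
SAMPLE = (
    'product_id,product_name,product_description,product_price\n'
    '12345,"1"" by 5 Yards Duct Tape","Great choice for your creative projects\n'
    'Super performance strength\n'
    'Available in white, red, green and black",9.95\n'
)


def parse_rfc4180(text):
    """Parse CSV text where quoted fields may hold commas, quotes, and newlines."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=",",
        quotechar='"',
        doublequote=True,  # "" inside a quoted field means one literal "
    )
    return list(reader)


rows = parse_rfc4180(SAMPLE)
header, record = rows

print(len(rows))           # number of logical rows, header included
print(record[1])           # product name with the doubled quote unescaped
print(repr(record[2]))     # description with its embedded newlines
```

The parser returns two logical rows even though the file has four physical lines, which is the behavior a simple line-by-line delimiter split cannot provide.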
To load CSV data with all escaping possibilities defined in RFC 4180, explicitly specify the CSV parser using WITH PARSER GdcCsvParser, which is a GoodData-specific CSV parser in Data Warehouse.
For recommendations on choosing a parser for data upload, see Choose a Data Warehouse Parser.
Example:
COPY customers FROM LOCAL 'customers.csv.gz' GZIP WITH PARSER GdcCsvParser
To load CSV data with all escaped characters, as specified in RFC 4180, explicitly specify GdcCsvParser, the GoodData-specific parser for Data Warehouse, and set the escape character using ESCAPE AS:
COPY customers FROM LOCAL 'customers.csv.gz' GZIP
WITH PARSER GdcCsvParser ESCAPE AS '"'
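A gzipped file such as customers.csv.gz can be produced in many ways. The following minimal sketch assumes Python's standard `csv` and `gzip` modules. The function name `write_rfc4180_gz`, the column names, and the sample record are hypothetical. The sketch writes CRLF line breaks as described in RFC 4180; whether GdcCsvParser requires them is not stated here.

```python
import csv
import gzip
import os
import tempfile


def write_rfc4180_gz(path, header, records):
    """Write rows as gzip-compressed CSV using RFC 4180 style quoting."""
    # newline="" keeps Python from altering the line endings csv writes.
    with gzip.open(path, "wt", encoding="utf-8", newline="") as f:
        writer = csv.writer(
            f,
            delimiter=",",
            quotechar='"',
            doublequote=True,            # escape " by doubling it
            quoting=csv.QUOTE_MINIMAL,   # quote only fields that need it
            lineterminator="\r\n",
        )
        writer.writerow(header)
        writer.writerows(records)


# Hypothetical data: one field has double quotes, another has a line break.
out_path = os.path.join(tempfile.mkdtemp(), "customers.csv.gz")
write_rfc4180_gz(
    out_path,
    ["customer_id", "customer_name", "note"],
    [[1, 'Acme "Tools" Inc.', "Line one\nLine two"]],
)

# Read the raw text back to inspect the quoting that was applied.
with gzip.open(out_path, "rt", encoding="utf-8", newline="") as f:
    written = f.read()
print(written)
```

Fields without special characters are left unquoted, so the output stays compact while remaining parseable by an RFC 4180 aware parser.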