We assume that you have already learned what is described in:
If you want to find the right Writer for your purposes, see Writers Comparison.
CSVWriter is a terminative component that writes data to flat files.
Component | Data output | Input ports | Output ports | Transformation | Transf. required | Java | CTL |
---|---|---|---|---|---|---|---|
CSVWriter | flat file | 1 | 0-1 |
CSVWriter formats all records from the input port into delimited, fixed-length, or mixed form and writes them to the specified flat file(s), such as CSV (comma-separated values) or text file(s). The output data can be stored locally or uploaded via a remote transfer protocol. Writing ZIP and TAR archives is also supported.
The component can write a single file or a partitioned collection of files. The type of formatting is specified in the metadata of the input port data flow.
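As a rough illustration of the difference between delimited and fixed-length output, the following Python sketch formats one record both ways. It is not CSVWriter's implementation: the function names, the `;` delimiter, the field widths, and the truncation of over-long values are assumptions made for this example. In CSVWriter, the formatting comes from the input port metadata.

```python
def format_delimited(values, delimiter=";", record_delimiter="\n"):
    """Join field values with a delimiter and terminate the record."""
    return delimiter.join(str(v) for v in values) + record_delimiter


def format_fixed_length(values, widths, record_delimiter="\n"):
    """Pad every field with spaces to its fixed width.

    Truncating over-long values is a choice of this sketch only.
    """
    cells = [str(v).ljust(w)[:w] for v, w in zip(values, widths)]
    return "".join(cells) + record_delimiter


record = ["42", "Alice", "Prague"]
delimited_line = format_delimited(record)
fixed_line = format_fixed_length(record, widths=[4, 8, 8])
print(repr(delimited_line))
print(repr(fixed_line))
```

In the delimited line, field boundaries are marked by the delimiter; in the fixed-length line, they are implied by the field widths alone.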
Port type | Number | Required | Description | Metadata |
---|---|---|---|---|
Input | 0 | | for received data records | any |
Output | 0 | | for port writing. See Writing to Output Port. | include specific byte / cbyte / string field |
Attribute | Req | Description | Possible values |
---|---|---|---|
Basic | | | |
File URL | | Where the received data is written (flat file, console, output port, or dictionary). See Supported File URL Formats for Writers. | |
Charset | | Character encoding of records written to the output. | ISO-8859-1 (default) or <other encodings> |
Append | | By default (false), records written to an existing non-empty file replace its previous content. If set to true, new records are appended to the end of the existing output file(s). | false (default) or true |
Quoted strings | | If set to true, all data field values except those of the byte and cbyte types are double quoted. | false (default) or true |
Quote character | | Specifies which kind of quotes is permitted in Quoted strings. | both (default), ", or ' |
Advanced | | | |
Create directories | | If set to true, non-existing directories in the File URL attribute path are created. | false (default) or true |
Write field names | | Field labels are not written to the output file(s) by default. If set to true, the labels of individual fields are written to the output. Note that field labels differ from field names: labels can be duplicated and can contain any character (e.g. accents, diacritics). See Record Pane. | false (default) or true |
Records per file | | Maximum number of records written to each output file. If specified, the dollar sign(s) $ (number of digits placeholder) must be part of the file name mask. See Supported File URL Formats for Writers. | 1 - N |
Bytes per file | | Maximum size of each output file in bytes. If specified, the dollar sign(s) $ (number of digits placeholder) must be part of the file name mask. See Supported File URL Formats for Writers. To avoid splitting a record across two files, the maximum size can be slightly exceeded. | 1 - N |
Number of skipped records | | Number of records/rows to skip before writing the first record to the output file. See Selecting Output Records. | 0 (default) - N |
Max number of records | | Maximum number of records/rows written to all output files. See Selecting Output Records. | 0 - N |
Exclude fields | | Sequence of field names, separated by semicolons, that will not be written to the output. Can be used when the same fields serve as part of the Partition key. | |
Partition key | 2) | Sequence of field names, separated by semicolons, that defines how records are distributed into different output files: records with the same Partition key are written to the same output file. Use the placeholder ($ or #) in the file name mask that matches the selected Partition file tag. See Partitioning Output into Different Output Files. | |
Partition lookup table | 1) | ID of the lookup table used to select the records that should be written to the output file(s). See Partitioning Output into Different Output Files for more information. | |
Partition file tag | 2) | By default, output files are numbered. If set to Key file tag, output files are named according to the values of Partition key or Partition output fields. See Partitioning Output into Different Output Files for more information. | Number file tag (default) or Key file tag |
Partition output fields | 1) | Fields of Partition lookup table whose values serve to name the output file(s). See Partitioning Output into Different Output Files for more information. | |
Partition unassigned file name | | Name of the file into which unassigned records are written, if there are any. If not specified, data records whose key values are not contained in Partition lookup table are discarded. See Partitioning Output into Different Output Files for more information. | |
1) Either both or neither of these attributes must be specified.
2) Either both or neither of these attributes must be specified.
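To illustrate how Partition key and Partition file tag relate to the file name mask, here is a minimal Python sketch of partitioning. It is not how CSVWriter is implemented. The function name `partition_records`, the sample records, the assumption that numbering starts at 0, and the `_` separator used for multi-field keys are all hypothetical. The sketch also assumes that `$` is the placeholder for Number file tag and `#` the placeholder for Key file tag; consult Partitioning Output into Different Output Files for the actual rules.

```python
import re
from collections import defaultdict


def partition_records(records, key_fields, mask, key_file_tag=False):
    """Group records by partition key and derive one file name per group.

    Assumptions of this sketch: '#' in the mask is replaced by the key
    value (Key file tag); the first run of '$' is replaced by a
    zero-padded sequence number starting at 0 (Number file tag).
    """
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        groups[key].append(rec)

    files = {}
    for number, (key, recs) in enumerate(groups.items()):
        if key_file_tag:
            name = mask.replace("#", "_".join(str(k) for k in key))
        else:
            # The number of '$' characters gives the number of digits.
            name = re.sub(
                r"\$+",
                lambda m: str(number).zfill(len(m.group())),
                mask,
                count=1,
            )
        # Records with the same key always end up in the same file.
        files.setdefault(name, []).extend(recs)
    return files


records = [
    {"country": "CZ", "city": "Prague"},
    {"country": "US", "city": "Boston"},
    {"country": "CZ", "city": "Brno"},
]
by_number = partition_records(records, ["country"], "out_$$.csv")
by_key = partition_records(records, ["country"], "out_#.csv", key_file_tag=True)
print(list(by_number))
print(list(by_key))
```

With the numbered mask, the two `CZ` records share one numbered file; with the key mask, the file name carries the key value itself.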
Field size limitation 1: CSVWriter can write fields of up to 4 kB. To write larger fields, increase the DataFormatter.FIELD_BUFFER_LENGTH property (see Changing Default CloudConnect Settings). Enlarging this buffer does not significantly increase the graph's memory consumption.
Field size limitation 2: Another way to handle fields that are too large to be written is to use the Normalizer component, which can split large fields into several records.
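The idea of splitting a large field into several records can be sketched in Python. This is not the Normalizer component itself, which is configured inside a CloudConnect graph. The function `split_field`, the extra `part` field, and the 4096-character chunk size are assumptions chosen to mirror the 4 kB limit. The sketch counts characters, so with a multi-byte charset a chunk may take more than 4096 bytes.

```python
def split_field(record, field, max_len=4096):
    """Split one record into several so that record[field] never
    exceeds max_len characters. The hypothetical 'part' field keeps
    the chunk order so the value can be reassembled later."""
    value = record[field]
    # An empty value still yields one record, so no input record is lost.
    chunks = [value[i:i + max_len] for i in range(0, len(value), max_len)] or [""]
    return [{**record, field: chunk, "part": n} for n, chunk in enumerate(chunks)]


big = {"id": 7, "text": "x" * 10000}
parts = split_field(big, "text")
print([len(p["text"]) for p in parts])
```

Every output record repeats the remaining fields (here `id`), so a downstream consumer can group the parts back together.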