CSV Files

DataCater supports comma-separated values (CSV) files as a data source. Users can upload CSV files using the UI of DataCater. DataCater takes care of parsing the CSV files and publishing the extracted records to a data pipeline.

Once a data pipeline has been built for a CSV file, users can upload additional files with the same structure, i.e., files with the same number of values (or columns), which the pipeline then processes instantly.
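The "same structure" requirement can be checked before uploading an additional file. The following is a minimal sketch, not DataCater's own validation: the helper names column_count and has_same_structure are hypothetical, and the check compares only the number of columns in the first row of each file.

```python
import csv
import io


def column_count(csv_text: str, delimiter: str = ",") -> int:
    """Return the number of values in the first row of the given CSV text."""
    reader = csv.reader(io.StringIO(csv_text, newline=""), delimiter=delimiter)
    first_row = next(reader, [])  # an empty file counts as zero columns
    return len(first_row)


def has_same_structure(original: str, additional: str, delimiter: str = ",") -> bool:
    """Check whether two CSV texts have the same number of columns."""
    return column_count(original, delimiter) == column_count(additional, delimiter)


original_file = "id,name\r\n1,Ada\r\n"
additional_file = "id,name\r\n2,Alan\r\n"
same = has_same_structure(original_file, additional_file)
```

A file with a different number of columns would make has_same_structure return False, which signals that it likely needs its own data source.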

When uploading a CSV file to create a new data source, DataCater tries to automatically detect parser settings, such as the delimiter character. If needed, the configuration used for parsing the CSV file can be adjusted (see below).
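How DataCater detects parser settings internally is not described here. As an illustration of the general idea, the following sketch uses csv.Sniffer from the Python standard library to guess the column delimiter from a sample of the file. The function name and the list of candidate delimiters are assumptions made for this example.

```python
import csv


def detect_column_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the column delimiter from a text sample of a CSV file."""
    # csv.Sniffer looks for a candidate character that occurs
    # equally often in every row of the sample.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


sample = "id;name;city\r\n1;Ada;London\r\n2;Alan;Manchester\r\n"
detected = detect_column_delimiter(sample)
```

Detection of this kind is a heuristic and can fail on small or irregular samples, which is why the detected settings remain adjustable.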


Configuration

This source connector supports the following configuration options:

Column delimiter

The delimiter character that is used to separate different values (or columns) from each other (default: ,).

Row delimiter

The delimiter characters that are used to separate different rows (or records) from each other (default: \r\n).

Does the first row provide the attribute names?

If the first row of the uploaded CSV files holds the names of the different attributes (or columns), DataCater can skip it when reading in data (default: yes).

Number of rows to strip from the beginning

If the data starts only after several leading rows, DataCater can skip the first n rows when reading in data (default: 0). This is often the case for CSV files that were generated by a spreadsheet application.

Escape character for special characters

The character used to escape special characters in the CSV file. Leave this option empty if no escape character is used (default: empty).
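To show how these options interact, here is a minimal sketch of a parser that applies them with Python's csv module. It is not DataCater's implementation. The class and field names are hypothetical, the defaults mirror the ones listed above, and two behaviors are assumptions: leading rows are stripped before the name row is read, and columns are given placeholder names when the first row holds no names. The sketch also splits rows naively, so quoted values that contain the row delimiter are not supported.

```python
import csv
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CsvParserSettings:
    # Hypothetical field names; defaults follow the documented defaults.
    column_delimiter: str = ","
    row_delimiter: str = "\r\n"
    first_row_has_names: bool = True
    rows_to_strip: int = 0
    escape_character: str = ""  # empty means no escape character


def parse_csv(text: str, settings: CsvParserSettings) -> List[Dict[str, str]]:
    """Parse CSV text into records (dicts of string values)."""
    # Split on the configured row delimiter and drop empty rows.
    rows = [row for row in text.split(settings.row_delimiter) if row]
    # Assumption: leading rows are stripped before the name row is read.
    rows = rows[settings.rows_to_strip:]
    parsed = list(
        csv.reader(
            rows,
            delimiter=settings.column_delimiter,
            escapechar=settings.escape_character or None,
        )
    )
    if not parsed:
        return []
    if settings.first_row_has_names:
        names, data = parsed[0], parsed[1:]
    else:
        # Placeholder names, invented for this sketch only.
        names = [f"column_{i + 1}" for i in range(len(parsed[0]))]
        data = parsed
    return [dict(zip(names, values)) for values in data]


text = "Exported by a spreadsheet\r\nid;name\r\n1;Ada\r\n2;Alan\\;Turing\r\n"
settings = CsvParserSettings(
    column_delimiter=";", rows_to_strip=1, escape_character="\\"
)
records = parse_csv(text, settings)
```

In this example the first row is stripped, the second row supplies the attribute names, and the escaped semicolon in the last row is kept as part of the value instead of being treated as a column delimiter.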


Data Types

DataCater imports all columns of a CSV file as attributes of type string.

DataCater automatically extends the set of attributes with the attribute __datacater_file_name and fills it with the name of the uploaded CSV file.
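The shape of the resulting records can be sketched as follows. This is an illustration only: the helper with_file_name and the sample values are hypothetical, and only the attribute name __datacater_file_name comes from the description above.

```python
from typing import Dict, List

FILE_NAME_ATTRIBUTE = "__datacater_file_name"


def with_file_name(
    records: List[Dict[str, str]], file_name: str
) -> List[Dict[str, str]]:
    """Return copies of the records extended with the file name attribute."""
    return [{**record, FILE_NAME_ATTRIBUTE: file_name} for record in records]


records = with_file_name([{"id": "1", "price": "9.99"}], "orders.csv")

# All values are strings, so numeric work needs an explicit cast downstream.
price = float(records[0]["price"])
```

Because every column arrives as a string, any numeric or date handling has to be done by a conversion step later in the pipeline.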