XML Files

DataCater supports Extensible Markup Language (XML) files as a data source. Users can upload XML files through the DataCater UI. DataCater parses the uploaded files and publishes the extracted records to a data pipeline.

When parsing an XML file, DataCater tries to extract an array of record nodes. Specify the path to these nodes with the configuration option Path to record node.

Once a data pipeline has been built for a certain XML file, users can upload additional files with the same structure and attributes.


Configuration

This source connector supports the following configuration options:

Path to record node

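The path to the XML nodes that hold the records (default: /record). As the examples below show, the path starts below the root element of the file and does not include the root element itself.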

Example 1: You may use the value /record for extracting records from the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<records count="2" type="array">
  <record>
    <id>1</id>
    <name>Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems</name>
  </record>
  <record>
    <id>2</id>
    <name>Functional Programming in Scala</name>
  </record>
</records>

Example 2: You may use the configuration value /container/record for extracting records from the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<records count="2" type="array">
  <container>
    <record>
      <id>1</id>
      <name>Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems</name>
    </record>
    <record>
      <id>2</id>
      <name>Functional Programming in Scala</name>
    </record>
  </container>
</records>
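
The following Python sketch illustrates how a path such as /container/record can select record nodes. It is not DataCater's implementation: it uses Python's standard xml.etree.ElementTree module, the function name extract_records is hypothetical, and it assumes that the path is resolved relative to the root element, as the two examples above suggest. The book title is shortened for readability.

```python
import xml.etree.ElementTree as ET

# Sample document with the same structure as Example 2.
XML_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<records count="2" type="array">
  <container>
    <record>
      <id>1</id>
      <name>Designing Data-Intensive Applications</name>
    </record>
    <record>
      <id>2</id>
      <name>Functional Programming in Scala</name>
    </record>
  </container>
</records>
"""


def extract_records(xml_text, record_path="/record"):
    """Return one dict per node matched by record_path (hypothetical helper).

    Assumption: record_path is interpreted relative to the root element,
    so "/container/record" matches <records>/<container>/<record>.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    # ElementTree expects relative paths, so prefix the path with ".".
    nodes = root.findall("." + record_path)
    # Each child element of a record node becomes one attribute of the record.
    return [{child.tag: (child.text or "") for child in node} for node in nodes]


records = extract_records(XML_DOC, "/container/record")
for record in records:
    print(record)
```

With this sample document, the default path /record matches nothing in the sketch, because the record nodes are nested inside container.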

Data Types

By default, DataCater parses all attribute values as strings.

DataCater automatically extends the set of attributes with the attribute __datacater_file_name and fills it with the name of the uploaded XML file.
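
As a rough illustration of the two statements above, the following Python sketch builds a record whose values are all strings and adds the attribute __datacater_file_name. It is a sketch under assumptions, not DataCater's code: the helper to_record and the file name books.xml are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_record(node, file_name):
    """Convert one record node into a dict of string values (hypothetical helper)."""
    # Element text is kept as a string; no type inference is applied.
    record = {child.tag: (child.text or "") for child in node}
    # Extra attribute holding the name of the uploaded file.
    record["__datacater_file_name"] = file_name
    return record


node = ET.fromstring(
    "<record><id>1</id><name>Functional Programming in Scala</name></record>"
)
record = to_record(node, "books.xml")
print(record)
```

Note that id holds the string "1" rather than the integer 1.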