Python Transforms | DataCater

Simple yet powerful transformations for streaming ETL

In DataCater, a pipeline is a sequence of filters and transforms that are applied to data while it streams from sources to sinks. By combining different filters and transforms in a pipeline, you can implement almost any data preparation requirement. DataCater provides an extensible set of pre-defined filters and transforms for common tasks in operational data pipelines. You can also implement custom requirements with inline functions in the pipeline. Both pre-defined and inline functions are written in Python.
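To make the idea of a sequence of filters and transforms concrete, here is a minimal sketch in plain Python. It is not DataCater's implementation: the `run_pipeline` helper, the `("filter", fn)` / `("transform", fn)` step format, and the sample records are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical step format: ("filter", fn) drops records for which fn is falsy,
# ("transform", fn) replaces the record with fn(record).
Step = Tuple[str, Callable]


def run_pipeline(records: Iterable[dict], steps: List[Step]) -> Iterator[dict]:
    """Apply each step in order to every record, yielding the survivors."""
    for record in records:
        keep = True
        for kind, fn in steps:
            if kind == "filter":
                if not fn(record):
                    keep = False
                    break  # a rejected record skips the remaining steps
            else:
                record = fn(record)
        if keep:
            yield record


steps = [
    ("filter", lambda r: r.get("country") == "DE"),
    ("transform", lambda r: {**r, "name": r["name"].strip().title()}),
]

source = [
    {"name": "  ada lovelace ", "country": "DE"},
    {"name": "alan turing", "country": "UK"},
]

result = list(run_pipeline(source, steps))
```

Here `result` holds only the first record, with its name trimmed and title-cased; the second record is dropped by the filter before the transform runs.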

Please see the following demo where we use an inline Python transform to mask email addresses:
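An inline transform for masking email addresses might look roughly like the sketch below. The `transform(field, row)` signature, where `field` is the value being processed and `row` is the full record, is an assumption about the shape of an inline function; `mask_email` is a hypothetical helper, and the masking rule (keep the first character of the local part and the whole domain) is just one possible choice.

```python
def mask_email(value):
    """Mask the local part of an email address, keeping its first character."""
    # Pass through anything that does not look like an email address.
    if not isinstance(value, str) or "@" not in value:
        return value
    local, _, domain = value.rpartition("@")
    if not local:
        return value
    return local[0] + "*" * (len(local) - 1) + "@" + domain


# Assumed inline-function shape: receives the field value and the whole row.
def transform(field, row: dict):
    return mask_email(field)


masked = transform("jane.doe@example.com", {"id": 1})
# masked == "j*******@example.com"
```

Non-string values and strings without an `@` are returned unchanged, so the transform does not fail on missing or malformed data.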

Interactive previews in the Pipeline Designer

DataCater's Pipeline Designer offers full support for previewing the results of custom filters and transforms. This not only helps engineers detect accidental side effects as early as possible and debug the performance of their transformations, but also enables non-technical users to observe the output of Python functions, interactively validate their behavior, and even combine them with pre-defined functions.

Use the Python (modules) you know

DataCater ships the upstream version of CPython 3. If you already have basic knowledge of Python programming, you can start building custom transformations right away. DataCater gives you access to all core features of Python and to the modules of the Python Standard Library, for instance, those for processing JSON or XML structures. If you are using DataCater Self-Managed, you can install and use almost any Python module with DataCater.
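As an illustration of using the Standard Library inside a transform, the following sketch parses a JSON string with the `json` module and pulls out one nested value. The function name `extract_customer_id`, the `(field, row)` parameters, and the `customer.id` payload layout are assumptions chosen for the example.

```python
import json


# Assumed inline-function shape: receives the field value and the whole row.
def extract_customer_id(field, row: dict):
    """Return payload["customer"]["id"] from a JSON string, or None."""
    try:
        payload = json.loads(field)
    except (TypeError, ValueError):
        # TypeError: field is not a str/bytes; ValueError: invalid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    customer = payload.get("customer")
    if not isinstance(customer, dict):
        return None
    return customer.get("id")


customer_id = extract_customer_id('{"customer": {"id": 42}}', {})
# customer_id == 42
```

Returning `None` for unparsable input is a design choice for this sketch; a real pipeline might instead filter such records out or route them elsewhere.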
