DataCater 2023.2 is here

We are happy to announce the newest open-core release of DataCater, 2023.2, which introduces the Config resource and implements a lot of user feedback.

By Stefan Sprenger

Dear DataCater users,

Two months after our first open-core release, we are excited to announce the next iteration of our open core: DataCater 2023.2.

This release introduces a new resource, Configs, which lets you factor out configuration that is shared by other resources, such as Streams or Deployments. We have also implemented feedback from early users of our open-core product and fixed a few bugs.

Manage your resource configuration at one central location with Configs

Configs take the concept of Kubernetes’ ConfigMaps and make it available for the development of streaming data pipelines. With Configs, you can manage your configuration in a central location and reuse it across multiple resources. This makes your resources easier to maintain and avoids duplicated configuration settings.

Let us look at a practical example and assume that we want to connect DataCater to four different Apache Kafka topics in the same cluster.

Without Configs, we would need to specify the same connection settings, i.e., bootstrap.servers and probably some auth-related properties, for every Stream object individually. Whenever a connection-related property changes, we would need to update all affected Streams. Such duplicated configuration not only takes time when creating the objects but is also painful to maintain.

With Configs, we can move common resource configuration into a Config object. All we need to do is specify the kind of the Config (at the moment, we support STREAM and DEPLOYMENT in the open core), define the resource configuration under spec, and assign one or more labels that can be used to reference this Config object:

$ curl http://localhost:8080/api/v1/configs \
  -H'Authorization: Bearer ACCESS_TOKEN' \
  -XPOST \
  -H'Content-Type:application/json' \
  -d'{"kind":"STREAM","metadata":{"labels":{"io.datacater/name":"local-kafka-cluster"}},"spec":{"kafka":{"bootstrap.servers":"localhost:9092"}},"name":"Local Kafka cluster"}'

You can reference Configs in Streams and Deployments using the configSelector property. The following request creates a Stream object that references the Config from the previous example through its label io.datacater/name=local-kafka-cluster:

$ curl http://localhost:8080/api/v1/streams \
  -H'Authorization: Bearer ACCESS_TOKEN' \
  -XPOST \
  -H'Content-Type:application/json' \
  -d'{"configSelector":{"io.datacater/name":"local-kafka-cluster"},"name":"example-topic", "spec":{"kind":"KAFKA","kafka":{"topic":{"config":{}}}}}'
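
To tie this back to the four-topic example, the following Python sketch builds one Stream payload per topic, all pointing at the same Config via configSelector. The topic names, the base URL, and the helpers stream_payload, build_request, and create_streams are illustrative assumptions, as is using the topic name as the Stream name; the payload shape and endpoint simply mirror the curl call above.

```python
import json
import urllib.request

# Assumed values for illustration; adjust them to your installation.
BASE_URL = "http://localhost:8080/api/v1"
ACCESS_TOKEN = "ACCESS_TOKEN"
# Hypothetical topic names standing in for the four topics of the example.
TOPICS = ["orders", "payments", "shipments", "refunds"]


def stream_payload(topic_name):
    """Build a Stream payload that references the shared Config by label."""
    return {
        "configSelector": {"io.datacater/name": "local-kafka-cluster"},
        "name": topic_name,
        "spec": {"kind": "KAFKA", "kafka": {"topic": {"config": {}}}},
    }


def build_request(payload):
    """Prepare (but do not send) the POST request for one Stream."""
    return urllib.request.Request(
        f"{BASE_URL}/streams",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_streams(topics=TOPICS):
    """Send one request per topic; requires a running DataCater instance."""
    for topic in topics:
        with urllib.request.urlopen(build_request(stream_payload(topic))) as resp:
            print(topic, resp.status)
```

None of the four payloads repeats bootstrap.servers, so a change to the cluster address only touches the Config. Nothing is sent until create_streams() is called.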

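Note that Streams and Deployments can override the properties of Configs: a property set directly on a Stream or Deployment takes precedence over the same property from a referenced Config.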
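
One way to picture this precedence is a recursive merge in which the resource's own spec is applied last. The Python sketch below is a mental model only, not DataCater's actual implementation: it assumes that a Config applies when every key/value pair of the configSelector appears in the Config's labels, and that nested objects are merged key by key. The helper names selector_matches, deep_merge, and effective_spec are hypothetical.

```python
import copy


def selector_matches(selector, labels):
    """Assumed rule: every selector pair must be present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def deep_merge(base, override):
    """Return a new dict in which values from `override` win over `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def effective_spec(resource, configs):
    """Merge the spec of all matching Configs, then apply the resource's spec."""
    selector = resource.get("configSelector", {})
    spec = {}
    for config in configs:
        labels = config.get("metadata", {}).get("labels", {})
        # An empty selector references no Config at all.
        if selector and selector_matches(selector, labels):
            spec = deep_merge(spec, config.get("spec", {}))
    return deep_merge(spec, resource.get("spec", {}))


# The Config and Stream payloads from the curl examples.
config = {
    "kind": "STREAM",
    "metadata": {"labels": {"io.datacater/name": "local-kafka-cluster"}},
    "spec": {"kafka": {"bootstrap.servers": "localhost:9092"}},
}
stream = {
    "configSelector": {"io.datacater/name": "local-kafka-cluster"},
    "name": "example-topic",
    "spec": {"kind": "KAFKA", "kafka": {"topic": {"config": {}}}},
}

print(effective_spec(stream, [config]))
```

Under these assumptions, the printed spec combines bootstrap.servers from the Config with the topic settings from the Stream. If the Stream defined its own bootstrap.servers, that value would win.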

Community feedback from early open-core users

Additionally, we implemented a lot of feedback that we received from early users of our open core:

  • Allow configuration of the number of replicas for individual Deployments (#178).

  • Support resizing the code editor (#185).

  • Improve the Helm chart and provide a better experience when getting started (#145).

  • Fine-tune the configuration of SmallRye for high-throughput scenarios (#148).

  • Support multiple techniques for sampling streams (#158).

Chores and bug fixes

  • Update Quarkus to the latest release (#167).

  • Clean up the response format of the deployment endpoints by removing information on the status of the Kubernetes deployment; we now ship /health endpoints (#194).

  • Fix previewing tombstone records (#156).

  • Fix a bug that overwrote the field id when inspecting a stream (#149).

  • Improve error reporting when failing to inspect Streams (#184).