Recap of our Community Meetup #2: Streaming Spatial Data

On November 30, we ran our second community meetup with a guest talk on streaming spatial data.

By Jonas Zech

In our bimonthly community meetup, we connect with our users and extended community to chat about all things data.

Olivier, responsible for Partnerships and Community at DataCater, moderated the meetup, and our CTO Hakan shared insights into our product roadmap, focusing on the features planned for December. This time we decided to have only one speaker, and Dr. Nawar Halabi (Data/ML engineer at ING) took on the challenge and delivered a great presentation.

Product roadmap (Hakan Lofcali)

After Olivier led us through an introduction round, Hakan presented our product roadmap for December. The YAML-based declaration of data pipelines, also called “Pipelines as Code”, which he presented in the first community meetup, will finally ship in December! We will also release an open-source Helm chart repository to ease the cloud-native deployment of DataCater, and we will improve the audit logs. To welcome feedback from our community, we made our roadmap public. You can even let us know which features we should prioritize.

Streaming Spatial Data (Dr. Nawar Halabi)

Nawar developed a tool to address the following problem: Aircraft noise affects both quality of life and real estate prices, yet there is no noise scoring service with a good user experience. Nawar’s tool, Noise Map, uses real-time air traffic data to calculate location-specific noise scores, so that anyone can check the noise level at any location. Technically, he extracts real-time traffic data from two data providers, prepares and filters it, and loads it into PostGIS with DataCater. PostGIS is a PostgreSQL extension that adds spatial data types, functions, and indexes, which enable fast spatial queries. Nawar uses it as the data sink of his DataCater pipelines. He praised DataCater’s reliability and its error reporting via email, which help him run his pipelines without a headache.
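To make the prepare-and-filter step more concrete, here is a minimal Python sketch. It is not Nawar’s implementation: the record fields (`lat`, `lon`, `altitude_m`), the 3,000 m altitude cutoff, and the 5 km radius are assumptions made up for illustration, and the list of nearby low-flying aircraft is only a stand-in for the input to a real noise score.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_valid(record):
    """Prepare step: reject records with missing or out-of-range values."""
    lat = record.get("lat")
    lon = record.get("lon")
    alt = record.get("altitude_m")
    if lat is None or lon is None or alt is None:
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180 and alt >= 0


def nearby_low_flights(records, lat, lon, radius_km=5.0, max_altitude_m=3000.0):
    """Filter step: keep valid records that are both low and close to (lat, lon).

    The thresholds are hypothetical defaults, not values from the talk.
    """
    return [
        r for r in records
        if is_valid(r)
        and r["altitude_m"] <= max_altitude_m
        and haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km
    ]
```

In a setup like the one described, a filter of this kind would run before the load into PostGIS, so that only clean, relevant positions reach the database.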

The biggest challenge with huge (spatial) databases is making spatial indexing fast enough to ensure a good user experience while keeping the server load moderate. Nawar therefore took a deep dive into H3, Uber’s open-source library for hexagonal hierarchical spatial indexing, which he presented as more efficient than other indexing methods.
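The idea behind cell-based spatial indexing can be sketched in a few lines of Python. Note that this is not H3 itself: H3 partitions the globe into hierarchical hexagonal cells, while this sketch swaps in a plain square latitude/longitude grid to stay free of dependencies. The class `GridIndex` and the cell size of 0.1 degrees are hypothetical choices for illustration.

```python
import math
from collections import defaultdict


class GridIndex:
    """Toy spatial index that buckets items by square lat/lon grid cell.

    A simplified stand-in for a cell-based index such as H3; it ignores
    wrap-around at the antimeridian and cell distortion near the poles.
    """

    def __init__(self, cell_deg=0.1):
        self.cell_deg = cell_deg
        self._cells = defaultdict(list)

    def cell_of(self, lat, lon):
        """Map a coordinate to the (row, col) id of its grid cell."""
        return (math.floor(lat / self.cell_deg),
                math.floor(lon / self.cell_deg))

    def insert(self, lat, lon, item):
        """Store an item in the bucket of the cell containing (lat, lon)."""
        self._cells[self.cell_of(lat, lon)].append(item)

    def query(self, lat, lon):
        """Return items in the cell containing the point and its 8 neighbors."""
        row, col = self.cell_of(lat, lon)
        found = []
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                # .get() avoids creating empty buckets in the defaultdict
                found.extend(self._cells.get((row + d_row, col + d_col), []))
        return found
```

The point of such an index is that a lookup only touches a handful of cells instead of scanning every stored position, which is what keeps query time and server load low as the dataset grows.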

Summary

We want to thank everyone who participated and hope to welcome you again next year. Special thanks to Nawar, Olivier, and Hakan for their great presentations!

If you missed the meetup, I suggest you watch the recording on our YouTube channel.

Please follow us on LinkedIn to stay posted on future community events!