Snowflake

DataCater can stream insertions, updates, and deletions processed by a data pipeline to Snowflake.


Preparing Snowflake

Please complete the following steps to prepare Snowflake for the usage with DataCater.

Create Snowflake user for DataCater

We recommend that you don't use an existing Snowflake user but create a separate one for DataCater. Please use the Snowflake UI to create a new user with, for instance, the name datacater_user.

Create Snowflake role and prepare permissions

Run the following script to create a custom role for DataCater, prepare the permissions required to create Snowflake objects, and grant the role to the user created in the previous step. Please make sure to replace the placeholders [database_name] and [schema_name], and adapt the user and role name if needed.

-- Use a role that can manage roles and privileges.
USE ROLE securityadmin;

-- Create role
CREATE ROLE datacater_role;

-- Grant privileges on the database
GRANT USAGE ON DATABASE [database_name] TO ROLE datacater_role;

-- Grant privileges on the schema
GRANT USAGE ON SCHEMA [schema_name] TO ROLE datacater_role;
GRANT CREATE TABLE ON SCHEMA [schema_name] TO ROLE datacater_role;
GRANT CREATE STAGE ON SCHEMA [schema_name] TO ROLE datacater_role;
GRANT CREATE PIPE ON SCHEMA [schema_name] TO ROLE datacater_role;

-- Grant role to user
GRANT ROLE datacater_role TO USER datacater_user;
-- Assign role as default role for the user
ALTER USER datacater_user SET DEFAULT_ROLE = datacater_role;
Set up key pair authentication

DataCater uses key pair authentication for authenticating with Snowflake.

In the first step, please create a new private key on the command line of your local computer:

> openssl genrsa -out rsa_key.pem 2048

In the second step, please derive a public key from the private key:

> openssl rsa -in rsa_key.pem -pubout > rsa_key.pub

In the last step, use a Snowflake Worksheet to assign the public key, stored in the file rsa_key.pub, to the DataCater user. Please leave out the begin and end delimiters of the public key, i.e., -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY-----, when providing the public key in the SQL statement.

ALTER USER datacater_user SET RSA_PUBLIC_KEY='[public_key_without_delimiters]';

Configuration

This sink connector supports the following configuration options:

Snowflake account URL

The URL of your Snowflake account, typically following the format https://[account_name].[region].snowflakecomputing.com.

Username

The name of the Snowflake user to use for authentication. If you strictly followed our guide on preparing Snowflake, provide datacater_user here.

Private key

The content of the private key file (rsa_key.pem) you generated when preparing Snowflake.

Database name

The name of the database to use for ingesting data change events.

Schema name

The name of the schema to use for ingesting data change events (default: public).

Table name

The name of the table to use for ingesting data change events. Please don't provide an existing table here. If you leave this option empty, we will automatically create a table with the name datacater_pipeline_[pipeline_id] where [pipeline_id] is replaced by the id of the pipeline inserting data into Snowflake.


Data Schema

When inserting data into Snowflake, DataCater uses the following schema for the Snowflake table.

The first column, RECORD_METADATA, holds metadata about the change event, e.g., the time it was inserted into Snowflake, or the primary key of the related data record. The other column, RECORD_VALUE, holds the entire change event as a JSON object. Both columns, RECORD_METADATA and RECORD_VALUE, are stored in Snowflake as columns of type VARIANT.