DataCater can stream insertions, updates, and deletions processed by a data pipeline to Snowflake.
Please complete the following steps to prepare Snowflake for the usage with DataCater.
We recommend that you don't use an existing Snowflake user but create a separate one for DataCater. Please use the Snowflake UI to create a new user with, for instance, the name datacater_user.
Run the following script to create a custom role for DataCater, prepare the permissions required to create Snowflake objects, and grant the role to the user created in the previous step. Please make sure to replace the placeholders [database_name] and [schema_name], and adapt the user and role name if needed.
-- Use a role that can manage roles and privileges. USE ROLE securityadmin; -- Create role CREATE ROLE datacater_role; -- Grant privileges on the database GRANT USAGE ON DATABASE [database_name] TO ROLE datacater_role; -- Grant privileges on the schema GRANT USAGE ON SCHEMA [schema_name] TO ROLE datacater_role; GRANT CREATE TABLE ON SCHEMA [schema_name] TO ROLE datacater_role; GRANT CREATE STAGE ON SCHEMA [schema_name] TO ROLE datacater_role; GRANT CREATE PIPE ON SCHEMA [schema_name] TO ROLE datacater_role; -- Grant role to user GRANT ROLE datacater_role TO USER datacater_user; -- Assign role as default role for the user ALTER USER datacater_user SET DEFAULT_ROLE = datacater_role;
DataCater uses key pair authentication for authenticating with Snowflake.
In the first step, please create a new private key on the command line of your local computer:
> openssl genrsa -out rsa_key.pem 2048
In the second step, please derive a public key from the private key:
> openssl rsa -in rsa_key.pem -pubout > rsa_key.pub
In the last step, use a Snowflake Worksheet to assign the public key, stored in the file rsa_key.pub, to the DataCater user. Please leave out the begin and end delimiters of the public key, i.e., -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY-----, when providing the public key in the SQL statement.
ALTER USER datacater_user SET RSA_PUBLIC_KEY='[public_key_without_delimiters]';
This sink connector supports the following configuration options:
The URL of your Snowflake account, typically following the format https://[account_name].[region].snowflakecomputing.com.
The name of the Snowflake user to use for authentication. If you strictly followed our guide on preparing Snowflake, provide datacater_user here.
The content of the private key file (rsa_key.pem) you generated when preparing Snowflake.
The name of the database to use for ingesting data change events.
The name of the schema to use for ingesting data change events (default: public).
The name of the table to use for ingesting data change events. Please don't provide an existing table here. If you leave this option empty, we will automatically create a table with the name datacater_pipeline_[pipeline_id] where [pipeline_id] is replaced by the id of the pipeline inserting data into Snowflake.
When inserting data into Snowflake, DataCater uses the following schema for the Snowflake table.
The first column, RECORD_METADATA, holds metadata about the change event, e.g., the time it was inserted into Snowflake, or the primary key of the related data record. The other column, RECORD_VALUE, holds the entire change event as a JSON object. Both columns, RECORD_METADATA and RECORD_VALUE, are stored in Snowflake as columns of type VARIANT.