MySQL

The source connector for MySQL can extract data change events from a MySQL database system in real-time and stream them to the DataCater platform.

The MySQL source connector is available in two variants:

  • The Binlog-based connector implements change data capture by extracting change events from the Binary Log of MySQL. It is based on the Debezium MySQL connector. It can extract INSERTs, UPDATEs, and DELETEs.
  • The JDBC-based connector runs queries against the MySQL database to extract change events. It can extract INSERTs and UPDATEs.

Requirements

The Binlog-based connector uses row-based binary logging, which is available since MySQL version 5.1.5.


Preparing MySQL (Binlog-based connector)

Setup user

Create a new MySQL user and assign the required permissions (make sure that you allow the created user to sign in from only the IP address of the machine where DataCater is running):

mysql> CREATE USER '[username]'@'[ip_of_datacater_installation]' IDENTIFIED BY '[password]';
mysql> GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO '[username]'@'[ip_of_datacater_installation]' IDENTIFIED BY '[password]';
Enable Binary Log

Add the following lines to the config file of your MySQL installation to enable the Binary Log:

server-id         = 42
log_bin           = mysql-bin
binlog_format     = ROW
expire_logs_days  = 10
# define `binlog_row_image` for MySQL 5.6 or higher, leave it out for earlier releases
binlog_row_image  = FULL

Preparing MySQL (JDBC-based connector)

Setup user

Create a new MySQL user and assign the required permissions (make sure that you allow the created user to sign in from only the IP address of the machine where DataCater is running):

mysql> CREATE USER '[username]'@'[ip_of_datacater_installation]' IDENTIFIED BY '[password]';
mysql> GRANT SELECT, RELOAD, SHOW DATABASES ON *.* TO '[username]'@'[ip_of_datacater_installation]' IDENTIFIED BY '[password]';

Configuration

This source connector supports the following configuration options:

General > Hostname or IP

The hostname or IP address of the machine where the MySQL database system is running.

General > Port

The port under which MySQL is available (default: 3306).

General > SSL

Whether to use SSL when connecting to the MySQL database or not.

General > Username

The username used for authenticating with MySQL.

General > Password

The password of the user used for authenticating with MySQL.

General > Server timezone

Timezone of the MySQL server (default: UTC).

General > Database name

The name of the database that should be used as a source.

General > Table name

The name of the database table that should be used as a source. You may retrieve the list of tables available in the given MySQL database by clicking on Fetch table names.

General > Automatically detect primary key column

Whether to automatically profile the primary key column using the MySQL table information_schema.key_column_usage or not.

General > Primary key column

Only available if the automated detection of the primary key column is disabled. Name of the attribute that uniquely identifies records. DataCater uses the primary key attribute to detect new records. Please make sure that the column does not hold NULL values.

Change Data Capture > Connector variant

You may choose between the Binlog-based and JDBC-based connector variant.

Change Data Capture > Change Data Capture mode

Only available for the JDBC-based connector.

You may choose one of the following modes for change data capture:

  • BULK: Fetch all data at each sync.
  • INCREMENTING: Use the primary key column to fetch data that have been inserted since the last sync. This mode does only extract INSERTs, but skips UPDATEs and DELETEs.
  • TIMESTAMP/INCREMENTING: Use the primary key column and the timestamp column, specified using the Timestamp column configuration options, to fetch data that have been inserted or updated since the last sync. This mode does only extract INSERTs and UPDATEs, but skips DELETEs.
Change Data Capture > Timestamp column

Only available for the JDBC-based connector.

DataCater can use a timestamp column, which stores the time of the most recent update of a record, to detect record updates. Specifying the timestamp column is required when using TIMESTAMP/INCREMENTING as Change Data Capture mode.

Change Data Capture > Sync interval

Only available for the JDBC-based connector.

The interval in seconds between synchronizations of the MySQL table and DataCater (default: 60).


Data Types

The following table shows the mapping between MySQL data types and the data types used by DataCater.

MySQL data type DataCater data type
BIGINT Long
BINARY String
BIT Boolean
BLOB String
BOOL Boolean
BOOLEAN Boolean
CHAR String
DATE Date
DATETIME Timestamp
DECIMAL Double
DEC Double
DOUBLE Double
DOUBLE PRECISION Double
ENUM String
FIXED Double
FLOAT Float
GEOMETRY String
INT Int
INTEGER Int
JSON String
LINESTRING String
LONGBLOB String
LONGTEXT String
MEDIUMBLOB String
MEDIUMINT Int
MEDIUMTEXT String
NUMERIC Double
POINT String
POLYGON String
REAL Double
SET String
SMALLINT Int
TEXT String
TIME Time
TIMESTAMP String
TINYBLOB String
TINYINT Int
TINYTEXT String
VARBINARY String
VARCHAR String
YEAR Int