Enable CDC for MYSQL
Before the Debezium MySQL connector can be used to monitor the changes committed on a MySQL server, the server must be set up to use row-level binary logging and have a database user with appropriate privileges.
Enable MYSQL Bin Log
Stop the mysqld service
Modify the /etc/my.cnf file with following parameters to enable MySQL bin logging:
Save the changes by clicking on Save.
log-bin=mysql-bin
server-id=1
binlog\_format=row
expire\_logs\_days=10
Start the mysqld service after above changes are done.
For test purpose, we have created the database “test” with table as “books”. Grant all permission to the user.
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.\* TO 'test' IDENTIFIED BY ‘password';
Connector Setup/Download
To read the detailed documentation of Debezium MySQL jars, click here.
Make sure Zookeeper, Kafka, and Kafka Connector are already installed.
Download the connector’s plug-in archive, extract the JARs into your Kafka Connect environment, and add the directory with the JARs to Kafka Connect’s classpath.
Restart your Kafka Connect process to pick up the new JARs.
An example of installation on HDP 3.1.0 is shown below:
Check the compatibility of Debezium MYSQL CDC packages with the installed Kafka version.
Extract the package to a directory and then copy the Debezium jar files under:
/usr/hdp/3.1.0.0-78/kafka/libs
antlr4-runtime-4.8.jar
debezium-api-1.9.4.Final.jar
debezium-connector-mysql-1.9.4.Final.jar
debezium-core-1.9.4.Final.jar
debezium-ddl-parser-1.9.4.Final.jar
failureaccess-1.0.1.jar
guava-30.1.1-jre.jar
mysql-binlog-connector-java-0.25.6.jar
mysql-connector-java-8.0.28.jar
Connector Configuration Files
Configure the connect-standalone.properties or connect-distributed.properties depending on the cluster setup.
Make the below settings in /usr/hdp/3.1.0.0-78/kafka /config
plugin.path=/usr/hdp/3.1.0.0-78/kafka/libs
rest.port=6669
You also need to create a new connector properties file, which will have all the information related to the MySQL server database configuration. There are more metrices available which can be enabled as per the requirements.
Below are minimum required details:
Example: /usr/hdp/3.1.0.0-78/kafka /config/connector.properties
name=<Mention any Name for connector>
connector.class=io.debezium.connector.mysql.MySqlConnector
database.user=<name of the MySQL user that has the required privileges>
database.history.kafka.bootstrap.servers= <List of Kafka brokers that this connector:PORT>
database.history.kafka.topic= <name of the DB history topic>
database.server.name=< provide any logical name of the MySQL server/cluster>
database.port=<MySQL DB port ie.3306>
database.hostname=< The address of the MySQL server.>
database.password=< The password for the MySQL user>
database.whitelist=< list of all databases hosted by this server that this connector will monitor. This is optional>
Start the Connector
Once all the previous settings are done correctly, start the Debezium Connector.
Use the below command to start:
nohup /usr/hdp/3.1.0.0-78/kafka/bin/connect-standalone.sh config/connect-standalone.properties config/connector.properties &
You can further monitor the nohup.log for its start-up or ERRORS while starting the connector.
Once the connector is started successfully, it is now ready to record the data changes.
MYSQL Connector Test
The MySQL connector writes events for all insert, update, and delete operations on a single table to a single Kafka topic.
The name of the Kafka topics would be serverName.databaseName.tableName, where:
serverName is the logical name of the connector as specified with the database.server.name configuration property.
databaseName is the name of the database where the operation occurred.
tableName is the name of the database table on which the operation occurred.
For verification, make any changes into the MySQL DB, the changes should be reflected in the Kafka topic.
bin/kafka-console-consumer.sh --bootstrap-server <kafka bootstrapserver>:6667 --topic test.test.books
Setup in Kerberos and SSL Environment
For Kerberos environment, below needs to be set in connect-standalone.properties
# Kerberos</p><p>sasl.mechanism=GSSAPI
producer.security.protocol=SASL\_PLAINTEXT
Also, update the connector properties file, which has all the information related to the MYSQL server database configuration. See Connector Configuration Files.
database.history.producer.security.protocol=SASL\_PLAINTEXT
database.history.consumer.security.protocol=SASL\_PLAINTEXT
For Kerberos and SSL enabled environment, below needs to be set in connect-standalone.properties
# Kerberos
sasl.mechanism=GSSAPI
producer.security.protocol=SASL\_SSL
Also, update the connector properties file, which has all the information related to the MYSQL server database configuration. See Connector Configuration Files.
database.history.producer.security.protocol=SASL\_SSL
database.history.consumer.security.protocol=SASL\_SSL
If you have any feedback on Gathr documentation, please email us!