CDC Data Source

The CDC Data Source processes Change Data Capture (CDC) information provided by Oracle LogMiner from the redo logs of Oracle 11g or 12c databases.

The CDC Data Source processes data in ascending order of commit number. To read the redo logs, the CDC Data Source requires the LogMiner dictionary.

Complete the Oracle CDC Client prerequisites below before configuring the CDC Data Source.

Oracle CDC Client Prerequisites

Before using the Oracle CDC Client Data Source, complete the following tasks:

  1. Enable LogMiner.

  2. Enable supplemental logging for the database or tables.

  3. Create a user account with the required roles and privileges.

  4. To use the dictionary in redo logs, extract the LogMiner dictionary.

  5. Install the Oracle JDBC driver.

Task 1. Enable LogMiner

Oracle LogMiner provides access to redo logs that record database activity. The Data Source uses these logs to generate records.

LogMiner requires an open database in ARCHIVELOG mode with archiving enabled. To determine the status of the database and enable LogMiner, use the following steps:

  1. Log into the database as a user with DBA privileges.

  2. Check the database logging mode:

    select log_mode from v$database;
    

    If the command returns ARCHIVELOG, you can skip to Task 2.

    If the command returns NOARCHIVELOG, continue with the following steps:

  3. Shut down the database:

    shutdown immediate;
    
  4. Start up and mount the database:

    startup mount;
    
  5. Enable archiving and open the database:

    alter database archivelog;
    
    alter database open;
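
  6. Optionally, verify the change by re-running the logging mode check from step 2:

    select log_mode from v$database;

    The command should now return ARCHIVELOG.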
    

Task 2. Enable Supplemental Logging

To retrieve data from redo logs, LogMiner requires supplemental logging for the database or tables.

  1. To verify whether supplemental logging is enabled for the database, run the following command:

    SELECT supplemental_log_data_min, supplemental_log_data_pk, supplemental_log_data_all FROM v$database;
    

    For 12c multi-tenant databases, best practice is to enable supplemental logging for the container that holds the tables, rather than for the entire database. Run the following command first so that the changes apply only to that container:

    ALTER SESSION SET CONTAINER=<pdb>;
    

    You can enable identification key or full supplemental logging to retrieve data from redo logs. You do not need to enable both:

    To enable identification key logging

    You can enable identification key logging for individual tables or all tables in the database:

    • For individual tables

    Use the following commands to enable minimal supplemental logging for the database, and then enable identification key logging for each table that you want to use:

    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
    ALTER TABLE <schema name>.<table name> ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
    
    • For all tables

    Use the following command to enable identification key logging for the entire database:

    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;

    To enable full supplemental logging

    You can enable full supplemental logging for individual tables or all tables in the database:

    • For individual tables

    Use the following commands to enable minimal supplemental logging for the database, and then enable full supplemental logging for each table that you want to use:

    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
    ALTER TABLE <schema name>.<table name> ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
    
    • For all tables

    Use the following command to enable full supplemental logging for the entire database:

    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
    

    To submit the changes

    ALTER SYSTEM SWITCH LOGFILE;
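
    As a concrete illustration, the following sequence enables identification key logging for a single hypothetical table sales.customers and then checks the table-level setting; the schema and table names are placeholders, and all_log_groups is a standard Oracle dictionary view:

    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
    ALTER TABLE sales.customers ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
    ALTER SYSTEM SWITCH LOGFILE;

    -- Check the table-level log groups for the table above
    SELECT log_group_type, always FROM all_log_groups
      WHERE owner = 'SALES' AND table_name = 'CUSTOMERS';

    For the table above, the query should report a log group of type PRIMARY KEY LOGGING.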
    

Task 3. Create a User Account

Create a user account to use with the Oracle CDC Client Data Source. You need the account to access the database through JDBC.

Create the account according to the Oracle version that you use:

Oracle 12c multi-tenant databases

For multi-tenant Oracle 12c databases, create a common user account. Common user accounts are created in cdb$root and must follow the naming convention c##<name>.

  1. Log into the database as a user with DBA privileges.

  2. Create the common user account:

    ALTER SESSION SET CONTAINER=cdb$root;
    
    CREATE USER <user name> IDENTIFIED BY <password> CONTAINER=all;
    
    GRANT create session, alter session, set container, select any dictionary, logmining, execute_catalog_role TO <username> CONTAINER=all;
    
    ALTER SESSION SET CONTAINER=<pdb>;
    
    GRANT select on <db>.<table> TO <user name>;
    

    Repeat the final command for each table that you want to use.

    When you configure the Data Source, use this user account for the JDBC credentials. Use the entire user name, including the “c##” prefix, as the JDBC user name.
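
    For illustration, assuming a hypothetical common user c##sdc, a pluggable database named orclpdb1, and a single table sales.customers, the sequence would look like this:

    ALTER SESSION SET CONTAINER=cdb$root;
    CREATE USER c##sdc IDENTIFIED BY <password> CONTAINER=all;
    GRANT create session, alter session, set container, select any dictionary, logmining, execute_catalog_role TO c##sdc CONTAINER=all;
    ALTER SESSION SET CONTAINER=orclpdb1;
    GRANT select on sales.customers TO c##sdc;

    The JDBC user name in this case would be c##sdc.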

Oracle 12c standard databases

For standard Oracle 12c databases, create a user account with the necessary privileges:

  1. Log into the database as a user with DBA privileges.

  2. Create the user account:

    CREATE USER <user name> IDENTIFIED BY <password>;
    
    GRANT create session, alter session, select any dictionary, logmining, execute_catalog_role TO <user name>;
    
    GRANT select on <db>.<table> TO <user name>;
    

    Repeat the last command for each table that you want to use.

    When you configure the Data Source, use this user account for the JDBC credentials.

Oracle 11g databases

For Oracle 11g databases, create a user account with the necessary privileges:

  1. Log into the database as a user with DBA privileges.

  2. Create the user account:

    CREATE USER <user name> IDENTIFIED BY <password>;

    GRANT create session, alter session, select any dictionary, execute_catalog_role TO <user name>;

    GRANT select on v$logmnr_parameters TO <user name>;

    GRANT select on v$archived_log TO <user name>;

    GRANT select on <db>.<table> TO <user name>;

    Repeat the final command for each table that you want to use.

    When you configure the Data Source, use this user account for the JDBC credentials.

Task 4. Extract a LogMiner Dictionary (Redo Logs)

When using redo logs as the dictionary source, you must extract the LogMiner dictionary to the redo logs before you start the pipeline. Repeat this step periodically to ensure that the redo logs that contain the dictionary are still available.

Oracle recommends that you extract the dictionary only at off-peak hours since the extraction can consume database resources.

To extract the dictionary for Oracle 11g or 12c databases, run the following command:

    EXECUTE DBMS_LOGMNR_D.BUILD(OPTIONS=> DBMS_LOGMNR_D.STORE_IN_REDO_LOGS);

To extract the dictionary for Oracle 12c multi-tenant databases, run the following commands:

    ALTER SESSION SET CONTAINER=cdb$root;
    EXECUTE DBMS_LOGMNR_D.BUILD(OPTIONS=> DBMS_LOGMNR_D.STORE_IN_REDO_LOGS);
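
To confirm that an extracted dictionary is available in the archived logs, a query along the following lines can be used; dictionary_begin and dictionary_end are standard columns of the v$archived_log view:

    -- Archived logs that mark the start or end of a dictionary build
    SELECT name, dictionary_begin, dictionary_end
    FROM v$archived_log
    WHERE dictionary_begin = 'YES' OR dictionary_end = 'YES';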

Task 5. Install the Driver

The Oracle CDC Client Data Source connects to Oracle through JDBC. You cannot access the database until you install the required driver. Install the Oracle JDBC driver for the Oracle database version that you use.
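
Once the driver is installed, the Data Source connects using a standard Oracle JDBC connection string. The thin-driver form is shown below; the host, port, and service name are placeholders for your environment:

    jdbc:oracle:thin:@//<host>:<port>/<service name>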

The Data Source can create records for INSERT, UPDATE, and DELETE operations for one or more tables in a database.

Configuring CDC Data Source

To add the CDC Data Source to your pipeline, drag it onto the canvas and right-click it to configure.

Source Tables

To configure the schema, you can use either a pluggable database or a non-pluggable database.

Use Pluggable Database
Select this option for 12c multi-tenant databases.

Pluggable Database Name
Name of the pluggable database that contains the schema you want to use. Use only when the schema was created in a pluggable database.

Container Database Connection
Connection to the container database. Use the user account created for the Data Source; for multi-tenant Oracle databases, use the common user account.

Schema Name
Name of the schema to use.

Tables
Tables whose change data the Data Source processes.

Operations
Operations for which the Data Source creates records:

• INSERT

• DELETE

• UPDATE

Database Time Zone
Time zone of the database. Specify when the database operates in a time zone different from that of Data Collector.

Maximum Transaction Time
Time in seconds to wait for changes for a transaction. Enter the longest period that you expect a transaction to require. Default is 60 seconds.

LogMiner Session Window
Time in seconds to keep a LogMiner session open. Set this higher than the maximum transaction time. Reduce it when not using local buffering to lower LogMiner resource use. Default is 7200 seconds.

Local Buffering
Determines where the Data Source buffers transaction data:

TRUE:
Using local buffers, the Data Source requests the transactions for the relevant tables and period. The Data Source buffers the resulting LogMiner redo SQL statements until it verifies a commit for a transaction. After the commit, it parses and processes the committed data. The Data Source can buffer the redo SQL statements completely in memory.

FALSE:
When using Oracle LogMiner buffers, the Data Source requests data from Oracle LogMiner for a particular period. LogMiner then buffers all transactions for that period for all the tables in the database, rather than only the tables needed by the Data Source.

Max Batch Size (records)
Maximum number of records processed at one time. Keep values up to the Data Collector maximum batch size. Default is 1000.

LogMiner Fetch Size
Minimum number of records to fetch before passing a batch to the pipeline. Keep this value low to allow the Data Source to pass records to the pipeline as soon as possible, rather than waiting for a larger number of records to become available. Lower values can increase throughput when writes to the destination system are slow. Default is 1.

Query Timeout
Time to wait before timing out a LogMiner query. Default is 300 seconds.

Start From

The point in the LogMiner redo logs where you want to begin processing. When you start the pipeline, the CDC Data Source starts processing from the specified initial change and continues until you stop the pipeline. The following options are available:

1. From the latest change

The Data Source processes all changes that occur after you start the pipeline.

2. From a specified date-time

The Data Source processes all changes that occurred at the specified date-time and later. Use the format DD-MM-YYYY HH24:mm:ss.

3. From a specified system change number (SCN)

The Data Source processes all changes that occurred at the specified SCN and later. When using a specified SCN, the Data Source starts processing with the timestamp associated with that SCN. If the SCN cannot be found in the redo logs, the Data Source continues reading from the next higher SCN available in the redo logs. To look up the database's current SCN, see the query below.
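
If you need a starting SCN, the current SCN can be read from the v$database view (a minimal sketch, assuming the account has the select any dictionary privilege granted earlier):

    SELECT current_scn FROM v$database;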
