JDBC Emitter
The JDBC emitter allows you to push data to relational databases such as MySQL, PostgreSQL, Oracle DB, and MS-SQL.
The JDBC emitter also enables you to write data to a DB2 database, in both batch and streaming pipelines.
It is enriched with lookup functionality for the DB2 database, so you can enrich fields with external data read from DB2. Select a DB2 connection while configuring the JDBC emitter.
JDBC Emitter Configuration
To add a JDBC emitter to your pipeline, drag the JDBC emitter onto the canvas and connect it to a Data Source or processor.
The configuration settings of the JDBC emitter are as follows:
Field | Description |
---|---|
Save as Dataset | Save the JDBC emitter configuration as a Dataset. |
Dataset Name | Provide a unique name for the Dataset. |
Connection Name | All available JDBC connections are listed here. Select a connection to connect to the database. |
Message Name | Name of the message configuration that acts as metadata. |
Schema Name | Existing database schema whose tables are fetched (for MSSQL, DB2, and POSTGRES). |
Table Name | Existing table name in the specified database. |
Is Batch Enable | Enable this parameter to batch multiple messages and improve write performance. |
Batch Size | Determines how many rows to insert per round trip. This can improve performance on JDBC drivers. This option applies only to writing and defaults to 1000. |
Connection Retries | Number of retries for the component connection. |
Delay Between Connection Retries | Retry delay interval for the component connection, in milliseconds. |
Save Mode | Specifies the expected behavior of saving data to the data sink (see the batch write sketch after this table). Append: if the data/table already exists, the new data is appended to the existing data. Overwrite: if the data/table already exists, the existing data is overwritten by the new data. Ignore: if the data/table already exists, the save operation neither writes the new data nor changes the existing data. This is similar to CREATE TABLE IF NOT EXISTS in SQL. |
Check Point Directory | The HDFS path where the Spark application stores the checkpoint data. |
Output Mode | Specifies what data is written to a streaming sink when new data is available (see the streaming sketch after this table). Select one of the three options: Append: only the new rows in the streaming data are written to the sink. Complete: all the rows in the streaming data are written to the sink every time there are updates. Update: only the rows that were updated in the streaming data are written to the sink every time there are updates. |
Enable Trigger | Trigger defines how frequently a streaming query should be executed. |
Processing Time | Appears only when the Enable Trigger checkbox is selected. Processing Time is the trigger time interval in minutes or seconds. |
ADD CONFIGURATION | Enables you to configure custom properties. To use the spark-mssql-connector for the JDBC emitter, provide the following configurations: tableLock and schemaCheckEnabled. If these configurations are not provided, the emitter uses plain JDBC for writing data. |
Schema Results | Map the values coming in the pipeline to the table column names. |
Table Column Name | The columns of the selected table are populated here. |
Mapping Value | Enter a mapping value for the corresponding column. |
Database Data Type | The data type of the value, i.e., String, Int, Text, and so on. |
Ignore All | Ignore all or only selected fields while pushing data to the emitter. |
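As a rough illustration of how the batch-related settings above (Batch Size, Save Mode) map onto a plain Spark JDBC write, here is a minimal sketch. The connection URL, table, credentials, and input path are placeholders for illustration, not Gathr internals:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("jdbc-emitter-batch-sketch")
  .getOrCreate()

// Stand-in for the data flowing through the pipeline.
val df = spark.read.parquet("/data/input")

df.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://db-host:5432/sales")  // placeholder connection
  .option("dbtable", "public.orders")                     // Schema Name + Table Name
  .option("user", "writer")                               // placeholder credentials
  .option("password", "secret")
  .option("batchsize", "1000")                            // Batch Size: rows per round trip
  .mode(SaveMode.Append)                                  // Save Mode: Append / Overwrite / Ignore
  .save()

// For MS-SQL, ADD CONFIGURATION can switch the write from plain JDBC to the
// spark-mssql-connector, using the options named in the table above, e.g.:
//   .format("com.microsoft.sqlserver.jdbc.spark")
//   .option("tableLock", "true")
//   .option("schemaCheckEnabled", "false")
```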
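For streaming pipelines, the Check Point Directory, Output Mode, and Processing Time settings correspond to standard Structured Streaming options. Spark has no built-in JDBC streaming sink, so a common pattern is to write each micro-batch through the batch JDBC writer via foreachBatch. The sketch below is a hypothetical, self-contained illustration of that pattern; the Kafka source, URL, and paths are placeholders:

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder()
  .appName("jdbc-emitter-streaming-sketch")
  .getOrCreate()

// Stand-in streaming source.
val streamDf = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "orders")
  .load()
  .selectExpr("CAST(value AS STRING) AS payload")  // placeholder projection

streamDf.writeStream
  .outputMode("append")                                             // Output Mode
  .option("checkpointLocation", "hdfs:///checkpoints/jdbc-emitter") // Check Point Directory
  .trigger(Trigger.ProcessingTime("30 seconds"))                    // Enable Trigger + Processing Time
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    // Each micro-batch is written with the plain JDBC batch writer.
    batch.write
      .format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/sales")
      .option("dbtable", "public.orders")
      .option("user", "writer")
      .option("password", "secret")
      .option("batchsize", "1000")
      .mode(SaveMode.Append)
      .save()
  }
  .start()
```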
Sample Gathr Expression Evaluator:

```
(_replaceMissingValues(<Col Name>, '', 'replaceNullAndEmpty', string))
```
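For reference, here is a minimal Spark sketch of what this expression appears to do, assuming replaceNullAndEmpty means: substitute the replacement value ('' above) wherever the column is null or an empty string. The DataFrame df and column name col_name are hypothetical stand-ins for the pipeline data and <Col Name>:

```scala
import org.apache.spark.sql.functions.{col, lit, when}

// Replace null or empty values in a string column with the
// replacement value '' (empty string).
val cleaned = df.withColumn(
  "col_name",  // stand-in for <Col Name>
  when(col("col_name").isNull || col("col_name") === "", lit(""))
    .otherwise(col("col_name"))
)
```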
Click the NEXT button.
Enter the notes in the space provided.
Click DONE to save the configuration.