JDBC Data Source
The JDBC Custom Channel supports ClickHouse, ES, and other databases that provide a JDBC connection. The JDBC Channel supports Oracle, Postgres, MySQL, MSSQL, and DB2 connections. You can configure and test the connection with JDBC. Once the JDBC channel is configured, this connection lets you extract data from relational database sources into the pipeline in batches.
Prerequisite: Upload the appropriate driver JAR for the RDBMS used in the JDBC Data Source, using the Upload Jar option.
To use DB2, first create a successful DB2 Connection.
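For context, the effect of the driver JAR and connection test can be pictured outside the UI as well. The following is a minimal sketch, not part of Gathr, assuming PySpark, a PostgreSQL driver JAR, and hypothetical connection details; it registers the driver and runs a trivial query, which is roughly what a connection test verifies.

```python
from pyspark.sql import SparkSession

# Hypothetical paths and credentials -- replace with your own.
driver_jar = "/opt/drivers/postgresql-42.7.3.jar"
jdbc_url = "jdbc:postgresql://db-host:5432/sales"

spark = (
    SparkSession.builder
    .appName("jdbc-connection-check")
    .config("spark.jars", driver_jar)   # counterpart of uploading the driver jar
    .getOrCreate()
)

# A trivial query: if the URL, driver, or credentials are wrong, this fails,
# which is the kind of error a connection test surfaces.
check_df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("driver", "org.postgresql.Driver")
    .option("user", "report_user")      # hypothetical username
    .option("password", "secret")       # hypothetical password
    .option("query", "SELECT 1 AS ok")
    .load()
)
check_df.show()
```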
Configuring JDBC Data Source
To add a JDBC Data Source to your pipeline, drag the Data Source to the canvas and click it to configure.
Under the Schema Type tab, select Fetch From Source, Upload Data File, or use an existing dataset.
Field | Description |
---|---|
Connection Name | Connections are the service identifiers. A connection name can be selected from the list if you have created and saved connection details for JDBC Custom earlier. Use the Test Connection option to ensure that the connection with the JDBC Custom channel is established successfully. A success message states that the connection is available; in case of any error, edit the connection to resolve the issue before proceeding further. |
Override Credentials | Check this option to override the connection credentials with user-specific credentials. |
Username | Provide a username that has access to the database. |
Password | Provide the database user's password. |
Schema Name | Source schema name for which the list of tables will be viewed. This field is only visible for databases that require a schema to be selected. |
Table Name | Name of the source table whose metadata you want to view. |
Query | Hive-compatible SQL query to be executed in the component. If a connection with the Custom JDBC option is selected, this query serves as a sub-query for the inspect query specified in the Custom JDBC connection configuration. |
Design Time Query | Query used to fetch a limited number of records during application design. Used only during schema detection and inspection. Ignore this field for a Custom JDBC connection. |
Enable Query Partitioning | Enables parallel reading of data from the table. Disabled by default. If this check-box is enabled, the table is partitioned and the additional partitioning fields described below are displayed (see the sketch after this table). |
Type-in Partition Column | Select this option if the Partition Column list shown is empty or you do not see the required column in the list. Upon checking this option, the Data Type field will appear. |
Partition on Column | The column used to partition the data. This has to be a numeric column, on which Spark performs partitioning to read data in parallel. |
Data Type | If you have typed in the partition column, specify the data type of that column here. |
Autodetect Bounds | Check this option to detect the lower and upper bound values of the partition column automatically. |
No. of Partitions | Number of partitions (parallel reads) that will be launched while reading data from the table. |
Lower Bound | Value of the lower bound for the partitioning column. This value is used to decide the partition boundaries; the entire dataset is distributed into multiple chunks depending on the values. |
Upper Bound | Value of the upper bound for the partitioning column. This value is used to decide the partition boundaries; the entire dataset is distributed into multiple chunks depending on the values. |
Fetch Size | The fetch size determines the number of rows to be fetched per round trip. The default value is 1000. |
Add Configuration | Additional properties can be added using the Add Configuration link. |
Schema | Check the populated schema details. |
Advanced Configuration | Optionally, you can enable incremental read. |
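For readers familiar with Spark, the partitioning and fetch-size fields above map conceptually onto the options of a Spark JDBC read. The sketch below is not Gathr's implementation; it is a minimal PySpark example, assuming a hypothetical `public.orders` table with a numeric `order_id` column and example connection details, showing how a bounded, partitioned read with a fetch size is typically expressed.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-partitioned-read").getOrCreate()

# Hypothetical connection and table details.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")
    .option("driver", "org.postgresql.Driver")
    .option("user", "report_user")
    .option("password", "secret")
    .option("dbtable", "public.orders")     # Schema Name + Table Name
    .option("partitionColumn", "order_id")  # Partition on Column (numeric)
    .option("lowerBound", "1")              # Lower Bound
    .option("upperBound", "1000000")        # Upper Bound
    .option("numPartitions", "8")           # No. of Partitions
    .option("fetchsize", "1000")            # Fetch Size (rows per round trip)
    .load()
)

orders.printSchema()
```

In Spark, the lower and upper bounds only decide how the partition ranges are cut; rows outside the bounds are still read, just in the first and last partitions.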
Metadata
Select a table to view its metadata.
Field | Description |
---|---|
Table | Select the table whose metadata you want to view. |
Column Name | Name of the column generated from the table. |
Column Type | Type of the column, for example: Text, Int |
Nullable | Indicates whether the column can contain null values. |
Once the metadata is selected, click Next and detect the schema to generate the output with Sample Values. The next tab is Incremental Read.
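The column name, type, and nullability shown in the Metadata tab can also be pictured as a DataFrame schema. A minimal sketch, again assuming PySpark and the hypothetical `public.orders` table and connection details used above:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-metadata-check").getOrCreate()

# Hypothetical connection details, as in the earlier sketches.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")
    .option("driver", "org.postgresql.Driver")
    .option("user", "report_user")
    .option("password", "secret")
    .option("dbtable", "public.orders")
    .load()
)

# Each field corresponds to one row of the Metadata table:
# Column Name, Column Type, Nullable.
for field in orders.schema.fields:
    print(field.name, field.dataType.simpleString(), field.nullable)
```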
Incremental Read
Enter the schema and select the table. You can view the metadata of the tables.
Field | Description |
---|---|
Enable Incremental Read | Check this check-box to enable incremental read support. |
Column to Check | Select the column on which incremental read will work. Displays the list of columns that have integer, long, date, timestamp, or decimal values. |
Type-in Details | Check this option if the column list shown is empty or you do not see the required column in the list. |
Start Value | Specify a value of the reference column; only the records whose reference-column value is greater than this value will be read. |
Override Query | Select this option to override the Incremental Read query. This allows you to provide an optimal query. |
Incremental Read Query | This section shows the query to fetch the incremental data. If Override Query is selected, you can change this query and provide an optimal query. |
Read Control Type | Not available for a Custom JDBC source. Provides three options to control the data to be fetched: None, Limit By Count, and Maximum Value. None: all records with a reference-column value greater than the offset are read. Limit By Count: only the specified number of records with a reference-column value greater than the offset are read. Maximum Value: all records with a reference-column value greater than the offset and less than the Column Value field are read. For None and Limit By Count it is recommended that the table's data be in sequential, sorted (increasing) order. |
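The read-control behaviour above can be illustrated as a filter on the reference column. The sketch below is only an illustration, not the query Gathr generates; it assumes a hypothetical `public.orders` table with a `last_modified` timestamp column and example connection details, and shows in PySpark how an incremental read against a stored offset might be pushed down as a query.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-incremental-read").getOrCreate()

# Hypothetical offset (Start Value) persisted from the previous run.
offset = "2024-01-01 00:00:00"

# Read Control Type "None": every record newer than the offset.
incremental_query = f"""
    SELECT * FROM public.orders
    WHERE last_modified > '{offset}'
"""
# "Limit By Count" would additionally append something like LIMIT 10000;
# "Maximum Value" would add AND last_modified <= '<column value>'.

new_rows = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")
    .option("driver", "org.postgresql.Driver")
    .option("user", "report_user")
    .option("password", "secret")
    .option("query", incremental_query)
    .load()
)

print(new_rows.count(), "new rows since", offset)
```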
Click Done to save the configuration.