MongoDB Incremental Configuration
The Incremental Read feature in Advanced Mongo lets you fetch data in the way that suits your needs: read all the data at once, pick up where you left off, or track changes over time.
Read Type
Select how data is fetched from the source. Three options are available: Full Load, Incremental, and CDC, as explained below:
Full Load
Reads all the records from the configured collection in the database during pipeline execution.
Incremental
Reads records from the database starting after a specified offset (start value) during pipeline execution.
Provide the below details:
Column
Select the column for incremental read. The listed columns can be integer, long, date, timestamp, decimal, etc.
Column Includes Date
Select this checkbox if the column to be read is of date/timestamp type.
Offset
Set the offset value for the incremental read on the selected column. Only records whose value in that column is greater than the offset will be read.
For date/timestamp columns, the value should be in UTC, in ISO date format: yyyy-MM-dd'T'HH:mm:ss.SSSZZZ
Example: 2021-12-24T13:20:54.825+0000
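The offset format above can be produced programmatically. The following is a minimal Python sketch, not part of Gathr; the helper name format_offset is hypothetical. It assumes a timezone-aware datetime, converts it to UTC, and renders millisecond precision with a +0000 style zone suffix.

```python
from datetime import datetime, timezone


def format_offset(dt: datetime) -> str:
    """Render a datetime as yyyy-MM-dd'T'HH:mm:ss.SSSZZZ in UTC.

    Hypothetical helper for illustration; not a Gathr API.
    """
    if dt.tzinfo is None:
        # A naive datetime has no zone, so it cannot be safely converted to UTC.
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    millis = utc.microsecond // 1000  # truncate microseconds to milliseconds
    return f"{utc:%Y-%m-%dT%H:%M:%S}.{millis:03d}{utc:%z}"


offset = format_offset(
    datetime(2021, 12, 24, 13, 20, 54, 825000, tzinfo=timezone.utc)
)
print(offset)  # 2021-12-24T13:20:54.825+0000
```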
Read Control Type
Provides three options to control the data fetched: None, Limit by Count, and Limit by Value.
None: All the records in the reference column with values greater than the Start Value will be read.
Limit by Count: Only the specified number of records in the reference column with values greater than the Start Value will be read. Provide No. of Records.
Limit by Value: All the records in the reference column with values greater than the Start Value and less than or equal to the Column Value that you set will be read (the Column Value is inclusive). Set the Column Value.
For None and Limit by Count, it is recommended that the data in the collection be sequential and sorted in increasing order.
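The three Read Control Type options can be illustrated with a small in-memory Python sketch. This is only a model of the semantics described above, not Gathr's implementation. The function select_records, its parameter names, and the control strings are hypothetical, and the sketch assumes records are returned in stored order without re-sorting, which is why sorted data is recommended.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]


def select_records(
    records: List[Record],
    column: str,
    start_value: Any,
    control: str = "none",
    count: Optional[int] = None,
    column_value: Any = None,
) -> List[Record]:
    """Model of None / Limit by Count / Limit by Value (hypothetical helper)."""
    # Every mode keeps only records strictly greater than the start value.
    matched = [r for r in records if r[column] > start_value]
    if control == "none":
        return matched
    if control == "limit_by_count":
        if count is None:
            raise ValueError("limit_by_count requires count")
        # Assumption: the first `count` matches are taken in stored order.
        return matched[:count]
    if control == "limit_by_value":
        if column_value is None:
            raise ValueError("limit_by_value requires column_value")
        # The upper bound is inclusive.
        return [r for r in matched if r[column] <= column_value]
    raise ValueError(f"unknown control type: {control}")


rows = [{"id": i} for i in range(1, 7)]  # ids 1..6, already sorted
print(select_records(rows, "id", 2))
print(select_records(rows, "id", 2, "limit_by_count", count=2))
print(select_records(rows, "id", 2, "limit_by_value", column_value=5))
```

With a start value of 2, the three calls select ids 3 to 6, ids 3 and 4, and ids 3 to 5 respectively.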
CDC
Reads the records from the configured namespace or the Oplog namespace as per the specified CDC configuration during the pipeline execution.
Oplog Database
Select the Oplog Database from where the data should be read.
Oplog Collection
Select the Oplog Collection from where the data should be read.
Load From Original Collection
This option is selected by default the first time the source is configured.
It is automatically disabled after the pipeline is executed successfully.
If the checkbox is cleared, records are read from the Oplog collection as per the specified CDC configuration during pipeline execution.
If this option is cleared, provide the following field:
Offset
Records with a timestamp value greater than the specified datetime (in UTC) will be fetched.
After each pipeline run, the datetime configuration is set to the most recent timestamp value from the last fetched records.
The given value should be in UTC, in ISO date format: yyyy-MM-dd'T'HH:mm:ss.SSSZZZ
Example: 2021-12-24T13:20:54.825+0000
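The way the CDC offset moves forward between runs can be sketched in Python as follows. This is an illustrative model only, not Gathr's implementation: the function fetch_changes and the "ts" field holding a Python datetime are assumptions made for the example, and a real Oplog entry stores its timestamp differently.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Tuple

Entry = Dict[str, Any]


def fetch_changes(entries: List[Entry], offset: datetime) -> Tuple[List[Entry], datetime]:
    """Return entries newer than the offset, plus the offset for the next run.

    Hypothetical helper; "ts" is an assumed field name for illustration.
    """
    # Only entries with a timestamp strictly greater than the offset are fetched.
    fetched = [e for e in entries if e["ts"] > offset]
    # The next offset is the most recent fetched timestamp; if nothing was
    # fetched, the offset stays where it was.
    new_offset = max((e["ts"] for e in fetched), default=offset)
    return fetched, new_offset


def utc(hour: int) -> datetime:
    return datetime(2021, 12, 24, hour, 0, 0, tzinfo=timezone.utc)


oplog = [{"ts": utc(13), "op": "i"}, {"ts": utc(14), "op": "u"}, {"ts": utc(15), "op": "d"}]

first_run, offset = fetch_changes(oplog, utc(13))   # fetches the 14:00 and 15:00 entries
second_run, offset = fetch_changes(oplog, offset)   # nothing new, offset unchanged
```

Because the comparison is strict, an entry whose timestamp equals the stored offset is not read again on the next run.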