Advanced Mongo Data Source
Add an Advanced Mongo data source to your pipeline. Drag the data source onto the canvas and click it to configure.
Under the Schema Type tab, select Fetch From Source or Upload Data File. Edit the schema if required, then click Next to configure the Advanced Mongo source.
Configuring Advanced Mongo
Field | Description |
---|---|
Connection Name | Connections are service identifiers. Select the connection from which to read data from the list of available connections. |
Database Name | Select the MongoDB database from which the data is to be fetched. |
Collection Name | Select the database collection to be scanned. |
Query | Filtering criterion: choose between All Data and Match Query. If Match Query is selected, provide the detail below: |
Filter/Query | Fetch a filtered schema, or read data from the source as per the filter condition or query provided (see the sketch after this table). Note: in a Match Query, use only single quotes ('') where required. |
Records Per Partition | Number of records to be read per partition. |
Schema Updated Alert | Check the checkbox to receive alerts for any schema changes when the data is fetched from the source. |
Add Configuration | Option to add further configuration. |
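A Match Query is expressed as a standard MongoDB filter document. The sketch below, assuming the `pymongo` client (the connection string, database, collection, and field names are placeholders, not Gathr configuration), shows the kind of filter a Match Query describes:

```python
# Minimal sketch of a Match Query filter using pymongo.
# Connection string and names are hypothetical placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
collection = client["sales_db"]["orders"]

# Filter condition equivalent to a Match Query: shipped orders over 100.
match_query = {"status": "shipped", "amount": {"$gt": 100}}

for doc in collection.find(match_query).limit(5):
    print(doc)
```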
Incremental Read in Advanced Mongo
Field | Description |
---|---|
Read Type | Determines how data is fetched from the source. Full Load, Incremental, and CDC options are available, as explained below: |
Full Load | Reads all records from the configured collection during pipeline execution. |
Incremental | Reads records as per the specified offset(s) or start value during pipeline execution. Provide the details below: |
Column | Select the column for incremental read. The listed columns can be integer, long, date, timestamp, decimal, etc. The selected column should have sequential, sorted (increasing), and unique values. |
Column Includes Date | Check the check-box if the column to be read is of date/timestamp type. |
Start Value | Set an offset value for the incremental read on the selected column. Only records whose column values are greater than the offset will be read. |
Read Control Type | Provides three options to control the data fetched (see the sketch after this table). None: all records in the reference column with values greater than the Start Value are read. Limit by Count: only the specified number of records with values greater than the Start Value are read; provide No. of Records. Limit by Value: all records with values greater than the Start Value but less than or equal to the Column Value are read (the Column Value is inclusive); set the Column Value. For None and Limit by Count, the table should ideally hold data in sequential, sorted (increasing) order. |
CDC | Reads the records from the configured namespace or the Oplog namespace as per the specified CDC configuration during the pipeline execution. |
Oplog Database | Select the Oplog Database from where the data should be read. |
Oplog Collection | Select the Oplog Collection from where the data should be read. |
Load From Original Collection | This option is checked by default when the source is first configured. If unchecked, records are read from the Oplog collection as per the specified CDC configuration during pipeline execution. It is automatically disabled after the pipeline executes successfully. If unchecked, provide the field below: |
Offset | Records with a timestamp greater than the specified datetime (in UTC) will be fetched (see the sketch below). After each pipeline run, the datetime is reset to the most recent timestamp among the last fetched records. The value should be in UTC, in the ISO date format yyyy-MM-dd'T'HH:mm:ss.SSSZZZ, e.g. 2021-12-24T13:20:54.825+0000. |
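The Read Control Type options correspond to familiar MongoDB query shapes. A minimal sketch, assuming `pymongo`; the connection string, column name, and limit values are illustrative placeholders:

```python
# Minimal sketch of the three incremental read-control patterns with pymongo.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical connection
collection = client["sales_db"]["orders"]          # hypothetical names
column, start_value = "order_id", 1000             # hypothetical incremental column

# None: every record past the Start Value, in increasing column order.
all_past_offset = collection.find({column: {"$gt": start_value}}).sort(column, 1)

# Limit by Count: only the next N records past the Start Value.
next_n = collection.find({column: {"$gt": start_value}}).sort(column, 1).limit(500)

# Limit by Value: records past the Start Value up to an inclusive ceiling.
bounded = collection.find({column: {"$gt": start_value, "$lte": 5000}}).sort(column, 1)
```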
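For the CDC Offset, the documented UTC datetime maps onto the BSON Timestamp stored in the oplog's ts field. A minimal sketch, again assuming `pymongo` and a replica set whose oplog lives at the default local.oplog.rs namespace:

```python
# Minimal sketch: convert the documented Offset format to a BSON Timestamp
# and read oplog entries newer than it. Assumes a replica set with the
# default oplog namespace; the connection string is a placeholder.
from datetime import timezone, datetime
from bson.timestamp import Timestamp
from pymongo import MongoClient

offset = datetime.strptime("2021-12-24T13:20:54.825+0000",
                           "%Y-%m-%dT%H:%M:%S.%f%z").astimezone(timezone.utc)

client = MongoClient("mongodb://localhost:27017")
oplog = client["local"]["oplog.rs"]

# Entries with a timestamp greater than the specified offset.
for entry in oplog.find({"ts": {"$gt": Timestamp(int(offset.timestamp()), 0)}}):
    print(entry["ts"], entry["op"], entry.get("ns"))
```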