Advanced Mongo Data Source

Add an Advanced Mongo data source to your pipeline. Drag the data source onto the canvas and click it to configure.

Under the Schema Type tab, select Fetch From Source or Upload Data File. Edit the schema if required, then click Next to configure the Advanced Mongo source.

Configuring Advanced Mongo

Field: Description
Connection Name

Connections are the service identifiers.

From the list of available connections, select the connection from which you would like to read the data.

Database Name: Select the MongoDB database from which the data is to be fetched.
Collection Name: Select the database collection that needs to be scanned.
Query

Filtering criteria. Choose between All Data and Match Query.

If the Match Query option is selected, provide the following detail:

Filter/Query

Fetches the filtered schema or reads data from the source according to the filter condition or query provided.

Note:

In Match Query, use only single quotes ('') where required.

Records Per Partition: Number of records to be read per partition.
Schema Updated Alert: Select the checkbox to receive alerts for any schema changes when the data is fetched from the source.
Add Configuration: Option to add further configuration.
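
As an illustration of the single-quote rule for Match Query, the following sketch renders a filter as text that contains only single quotes. It assumes the Filter/Query field accepts a standard MongoDB query document, which this page does not confirm. The helper `to_match_query` and the field names `status` and `age` are hypothetical.

```python
import json


def to_match_query(filter_doc: dict) -> str:
    """Render a MongoDB-style filter as text that uses only single quotes.

    Hypothetical helper: it refuses input whose keys or string values
    contain quote characters, since those cannot be converted safely.
    """
    text = json.dumps(filter_doc)
    # json.dumps escapes embedded double quotes as \" ; reject those and
    # any embedded single quotes before swapping the delimiters.
    if "'" in text or '\\"' in text:
        raise ValueError("keys and values must not contain quote characters")
    return text.replace('"', "'")


# Assumed example fields; replace with fields from your own collection.
query = to_match_query({"status": "active", "age": {"$gt": 30}})
print(query)
```

The printed text can then be pasted into the Filter/Query field, provided your deployment accepts this filter syntax.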

Incremental Read in Advanced Mongo

Field: Description
Read Type: Option to fetch data from the source. Full Load, Incremental, and CDC options are available, as explained below:
Full Load: Reads all the records from the configured collection in the database during pipeline execution.
Incremental: Reads records according to the specified offset(s) or start value from the database during pipeline execution. Provide the details below:
Column

Select the column for incremental read. The listed columns can be integer, long, date, timestamp, decimal, etc.

Column Includes Date: Select the checkbox if the column to be read is of date/timestamp type.
Start Value

Set an offset value for the incremental read on the selected column.

Only records whose value in this column is greater than the offset value will be read.

Read Control Type

Provides three options to control the data fetched: None, Limit by Count, and Limit by Value.

None: All the records in the reference column with values greater than the Start Value will be read.

Limit by Count: Only the specified number of records in the reference column with values greater than the Start Value will be read.

Provide No. of Records.

Limit by Value: All the records in the reference column with values greater than the Start Value but less than or equal to the Column Value that you set will be read (Column Value is inclusive).

Set the Column Value.

For None and Limit by Count, it is recommended that the data in the reference column be in sequential, sorted (increasing) order.
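
The three Read Control Type options can be sketched as a filter over the values of the reference column. This is an in-memory illustration of the rules stated above, not the product's implementation. The function `select_incremental`, its parameter names, and the sample values are hypothetical.

```python
from typing import Iterable, List, Optional


def select_incremental(values: Iterable[int], start_value: int,
                       control: str = "none",
                       count: Optional[int] = None,
                       column_value: Optional[int] = None) -> List[int]:
    """Return the reference-column values an incremental read would pick up."""
    # Only values strictly greater than the Start Value qualify.
    candidates = [v for v in values if v > start_value]
    if control == "none":
        return candidates
    if control == "limit_by_count":
        if count is None:
            raise ValueError("limit_by_count needs a number of records")
        # Takes the first `count` qualifying values in source order.
        return candidates[:count]
    if control == "limit_by_value":
        if column_value is None:
            raise ValueError("limit_by_value needs a column value")
        # The Column Value is an inclusive upper bound.
        return [v for v in candidates if v <= column_value]
    raise ValueError(f"unknown control type: {control}")


ids = [101, 102, 103, 104, 105]
read_all = select_incremental(ids, 102)
by_count = select_incremental(ids, 102, "limit_by_count", count=2)
by_value = select_incremental(ids, 102, "limit_by_value", column_value=104)
print(read_all, by_count, by_value)
```

In this sketch Limit by Count takes qualifying values in source order, so unsorted data could return a later value ahead of an earlier one. That may be one reason sorted data is recommended.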

CDC: Reads the records from the configured namespace or the Oplog namespace as per the specified CDC configuration during pipeline execution.
Oplog Database: Select the Oplog Database from which the data should be read.
Oplog Collection: Select the Oplog Collection from which the data should be read.
Load From Original Collection

This option is checked by default the first time the source is configured. If the checkbox is unchecked, records are read from the Oplog collection as per the specified CDC configuration during pipeline execution. The option is automatically disabled after the pipeline has executed successfully.

If this option is unchecked, then provide the below field:

Offset: Records with a timestamp value greater than the specified datetime (in UTC) will be fetched. After each pipeline run, the datetime configuration is set to the most recent timestamp value from the last fetched records. The given value should be in UTC, in ISO date format yyyy-MM-dd'T'HH:mm:ss.SSSZZZ. Ex: 2021-12-24T13:20:54.825+0000.
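
As a minimal sketch, a value in the Offset format shown above can be produced and read back with Python's standard library. The helper names `format_offset` and `parse_offset` are hypothetical. The sketch assumes the offset is always written with millisecond precision and a `+0000` suffix, as in the example value.

```python
from datetime import datetime, timezone

# strptime pattern matching values such as 2021-12-24T13:20:54.825+0000
OFFSET_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def format_offset(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC offset string."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    millis = utc.microsecond // 1000  # truncate to milliseconds
    return f"{utc.strftime('%Y-%m-%dT%H:%M:%S')}.{millis:03d}+0000"


def parse_offset(text: str) -> datetime:
    """Parse an offset string back into a timezone-aware datetime."""
    return datetime.strptime(text, OFFSET_FORMAT)


offset = format_offset(
    datetime(2021, 12, 24, 13, 20, 54, 825000, tzinfo=timezone.utc)
)
print(offset)
```

Converting to UTC before formatting keeps the value consistent with the requirement that the datetime be given in UTC, even when the input carries another time zone.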
