Solr ETL Target
- Target Configuration
- Post Action
- Notes
Target Configuration
Configure the target parameters that are explained below.
Connection Name
Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Solr. Otherwise, create one as explained in the topic Solr Connection.
Batch Size
To index records in batches, specify the batch size, that is, the number of records written in each batch.
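As a rough illustration of what a batch size means, the sketch below groups records into lists of at most `batch_size` items. The `batched` helper and the sample records are hypothetical and are not part of Gathr or Solr; how the product batches internally is not described here.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists holding at most batch_size records each."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Five sample records with a batch size of 2.
records = [{"id": i} for i in range(5)]
batches = list(batched(records, batch_size=2))
batch_lengths = [len(batch) for batch in batches]
```

The last batch is simply smaller when the record count is not a multiple of the batch size.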
Ignore Missing Values
Specifies whether empty or null values of message fields are ignored or persisted in the sink.
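The effect of this option can be pictured with a small sketch. It assumes that "empty" means an empty string and "null" means `None`; the `drop_missing` helper is hypothetical and only illustrates the idea, not the product's implementation.

```python
def drop_missing(record: dict, ignore_missing: bool = True) -> dict:
    """Return a copy of record, without null/empty fields when ignore_missing is set."""
    if not ignore_missing:
        # Persist every field, including null and empty values.
        return dict(record)
    return {
        name: value
        for name, value in record.items()
        if value is not None and value != ""
    }


message = {"id": 7, "company": "Apple", "city": "", "zip": None, "count": 0}
ignored = drop_missing(message, ignore_missing=True)
persisted = drop_missing(message, ignore_missing=False)
```

Note that a value such as `0` is kept in this sketch, since it is neither null nor an empty string.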
Across Field Search Enabled
Specifies if full text search is to be enabled across all fields.
Index Number of Shards
Specifies the number of shards to be created in the index store.
Index Replication Factor
Specifies the number of additional copies of data to be kept across nodes. The value should be less than n-1, where n is the number of nodes in the cluster.
Index Name
Name of the Solr index or collection where the data will be stored.
A JavaScript expression (jsexpression) is used to evaluate the index name.
For example, for the expression 'ns_Name', the index is created as 'ns_Name'.
To perform field-based partitioning, use the field alias instead of the field name in the expression.
Routing Required
Specifies whether custom dynamic routing is enabled. If enabled, a routing policy must be defined as JSON.
Routing Policy
A JSON object defining the custom routing policy.
Example:
{"1":{"company":{"Google":20.0,"Apple":80.0}}}
Here, 1 is the timestamp after which the custom routing policy becomes active, 'company' is the field name, the value 'Google' takes 20% of the shards, and the value 'Apple' takes 80% of the shards.
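To make the structure of a routing policy concrete, here is a minimal sketch that parses a policy with a 20/80 split, picks the rule set that is active at a given timestamp, and converts the percentages into approximate shard counts. The helpers `active_rules` and `shard_shares` are hypothetical, and the rounding used here is an assumption; the source does not state how percentages are mapped to shards.

```python
import json

# Policy in the shape described above: timestamp -> field -> value -> percentage.
policy_json = '{"1":{"company":{"Google":20.0,"Apple":80.0}}}'
policy = json.loads(policy_json)


def active_rules(policy: dict, now: int) -> dict:
    """Return the rules with the latest activation timestamp not after now."""
    by_timestamp = {int(ts): rules for ts, rules in policy.items()}
    eligible = [ts for ts in by_timestamp if ts <= now]
    if not eligible:
        return {}
    return by_timestamp[max(eligible)]


def shard_shares(rules: dict, field: str, total_shards: int) -> dict:
    """Convert percentage weights into approximate shard counts (assumed rounding)."""
    return {
        value: round(total_shards * percent / 100)
        for value, percent in rules[field].items()
    }


rules = active_rules(policy, now=100)
shares = shard_shares(rules, "company", total_shards=10)
percent_total = sum(rules["company"].values())
```

Checking that the percentages for a field add up to 100 before saving a policy is a cheap way to catch typos in the JSON.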
ID Generator Type
Enables generation of the ID field.
The following types of ID generators are available:
UUID: Universally unique identifier.
Key Based:
- Key Fields: Select the message field to be used as the key.
Note: Add the key 'incremental_fields' with comma-separated column names as its value. This works with a key-based UUID.
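The difference between the two generator types can be sketched as follows. This is an illustration only: it assumes that a key-based ID is a deterministic, name-based UUID (version 5) derived from the selected key fields, which the source does not confirm. The function names and the `|` separator are hypothetical.

```python
import uuid
from typing import List


def random_id() -> str:
    """UUID type: a random identifier, different on every call."""
    return str(uuid.uuid4())


def key_based_id(record: dict, key_fields: List[str]) -> str:
    """Key Based type (assumed): the same key field values always give the same ID."""
    key = "|".join(str(record[field]) for field in key_fields)
    return str(uuid.uuid5(uuid.NAMESPACE_OID, key))


record = {"company": "Apple", "order_no": 42, "amount": 10.5}
first = key_based_id(record, ["company", "order_no"])
# Changing a non-key field does not change the key-based ID.
second = key_based_id({**record, "amount": 99.0}, ["company", "order_no"])
```

With a deterministic ID, re-indexing a record with the same key overwrites the existing document instead of creating a duplicate, whereas a random UUID always creates a new one.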
Emitter Output Fields
Fields of the output message.
Connection Retries
Number of retries for the component connection. Possible values are -1, 0, or a positive number. -1 denotes infinite retries.
Delay Between Connection Retries
Defines the delay between component connection retries, in milliseconds.
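The two retry settings interact roughly as in the sketch below: the connection is attempted once, then retried up to the configured number of times with the configured delay in between, and -1 retries forever. The `connect_with_retries` function and the simulated `flaky_connect` are hypothetical and stand in for whatever the component does internally.

```python
import time
from typing import Callable


def connect_with_retries(connect: Callable[[], str], retries: int,
                         delay_ms: int, sleep=time.sleep) -> str:
    """Call connect(); on failure retry up to `retries` times (-1 means forever)."""
    attempt = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            if retries != -1 and attempt >= retries:
                raise  # retry budget exhausted
            attempt += 1
            sleep(delay_ms / 1000.0)  # the delay is configured in milliseconds


# Simulated connection that fails twice before succeeding.
calls = {"count": 0}


def flaky_connect() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("Solr not reachable yet")
    return "connected"


delays = []  # record the delays instead of actually sleeping
result = connect_with_retries(flaky_connect, retries=2, delay_ms=500,
                              sleep=delays.append)
```

With 0 retries, the first failure is raised immediately; with 2 retries, up to three attempts are made in total.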
Enable TTL
Select this option to enable TTL (time to live), which limits the lifetime of the data.
TTL Type
Select TTL type as either Static or Field Value.
TTL Value
For the Static TTL type, provide the TTL value in seconds. For the Field Value type, provide an integer field that holds the TTL.
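The two TTL types can be thought of as two ways of resolving a TTL for each record, as in this minimal sketch. The `resolve_ttl` helper and the `retention` field name are hypothetical, and the sketch assumes that a field-based TTL uses the same unit as the static one.

```python
def resolve_ttl(record: dict, ttl_type: str, ttl_value) -> int:
    """Return the TTL for a record, based on the configured TTL type."""
    if ttl_type == "Static":
        # ttl_value is the TTL itself, in seconds, the same for every record.
        return int(ttl_value)
    if ttl_type == "Field Value":
        # ttl_value names an integer field that carries the TTL per record.
        return int(record[ttl_value])
    raise ValueError(f"Unknown TTL type: {ttl_type}")


record = {"id": 1, "retention": 3600}
static_ttl = resolve_ttl(record, "Static", 86400)
field_ttl = resolve_ttl(record, "Field Value", "retention")
```

A static TTL suits data with a uniform retention period, while a field-based TTL lets each record carry its own lifetime.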
The following additional configuration parameters appear in the case of a Streaming Data Source.
Output Mode
Output mode to be used while writing the data to Streaming sink.
Append: Output Mode in which only the new rows in the streaming data will be written to the sink.
Enable Trigger
Trigger defines how frequently a streaming query should be executed.
Trigger Type
Supported Trigger Types are:
One-Time Micro-Batch: A trigger that processes only one batch of data in a streaming query and then terminates the query.
Fixed Interval Micro-Batches: A trigger policy that runs a query periodically based on an interval in processing time.
Processing Time
Specifies the time interval that governs the trigger policy.
Add Configuration: Adds custom Solr properties as key-value pairs.
Post Action
To understand how to provide SQL queries or stored procedures that are executed during a pipeline run, see Post-Actions.
Notes
Optionally, enter notes in the Notes tab and save the configuration.