Turnpike Processor

Turnpike is used with streaming datasets to bring the benefits of batch transformations to a streaming pipeline. It also lets the user execute flows (Processors and Emitters) sequentially and in priority order.

The configuration details are as follows:

Field: Description

Output Mode: Output mode to be used while writing the data to the streaming emitter. Select the output mode from the following three options:

- Append: Only the new rows in the streaming data are written to the sink.

- Complete Mode: All the rows in the streaming data are written to the sink every time there is an update.

- Update Mode: Only the rows that were updated in the streaming data are written to the sink every time there is an update.

Checkpoint Storage Location: Select the checkpoint storage location. The available options are HDFS and S3.

Checkpoint Connections: Select the connection. The connections listed correspond to the selected storage location.

Override Credentials: Option to override credentials for user-specific actions.
If the Checkpoint Storage Location is S3, provide the AWS Key ID (S3 account access key) and the Secret Access Key.

KeyTab Select Option: Select how the keytab is provided. The available options are:

- Specify KeyTab File Path

- Upload KeyTab File

Check Point Directory: The HDFS path where the Spark application stores the checkpoint data.

Time-based Check Point: Select the checkbox to enable a time-based checkpoint on each pipeline run, i.e. on each run the current time in milliseconds is appended to the checkpoint location provided above.

Enable Trigger: A trigger defines how frequently a streaming query is executed.
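
The difference between the three output modes can be illustrated with a small toy model. This is only a sketch of the descriptions above, not the way Turnpike or Spark implements them: the function name `rows_for_sink` and the idea of comparing the result before and after a trigger are assumptions made for illustration.

```python
def rows_for_sink(mode, previous, current):
    """Return the rows one trigger would write to the sink (toy model).

    previous and current map a row key to its value before and after
    the trigger. Both names are hypothetical, used only for this sketch.
    """
    if mode == "complete":
        # Every row of the result, whether or not it changed.
        return dict(current)
    if mode == "update":
        # Rows that are new or whose value changed since the last trigger.
        return {
            k: v for k, v in current.items()
            if k not in previous or previous[k] != v
        }
    if mode == "append":
        # Only rows that did not exist before this trigger.
        # (Real engines may further restrict this to rows that never change.)
        return {k: v for k, v in current.items() if k not in previous}
    raise ValueError(f"unknown output mode: {mode}")


previous = {"a": 1, "b": 2}
current = {"a": 1, "b": 3, "c": 1}
for mode in ("append", "complete", "update"):
    print(mode, rows_for_sink(mode, previous, current))
```

With this input, append yields only the new row `c`, complete yields all three rows, and update yields the changed row `b` together with the new row `c`.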

Click the +ADD CONFIGURATION button to add further configurations as key-value pairs.
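
For readers familiar with Spark Structured Streaming, the settings on this page resemble the options of a streaming writer. The sketch below shows how they might fit together in PySpark. It assumes, without confirmation from this page, that Turnpike behaves like a `foreachBatch` sink; the helper names, the `/` separator in the time-based checkpoint path, and the default trigger interval are all hypothetical.

```python
import time


def timestamped_checkpoint(base_dir, now_millis=None):
    """Append the current time in milliseconds to a checkpoint location.

    The "/" separator is an assumption; the page only says the time is appended.
    """
    if now_millis is None:
        now_millis = int(time.time() * 1000)
    return f"{base_dir.rstrip('/')}/{now_millis}"


def start_batch_style_query(stream_df, process_batch, checkpoint_dir,
                            output_mode="append",
                            trigger_interval="10 seconds",
                            extra_conf=None):
    """Start a streaming query that hands each micro-batch to process_batch.

    stream_df is expected to be a streaming pyspark.sql.DataFrame, and
    process_batch a function taking (batch_df, batch_id).
    """
    writer = (
        stream_df.writeStream
        .outputMode(output_mode)                       # Output Mode
        .option("checkpointLocation", checkpoint_dir)  # Check Point Directory
        .options(**(extra_conf or {}))                 # extra key-value pairs
        .trigger(processingTime=trigger_interval)      # Enable Trigger
        .foreachBatch(process_batch)                   # batch logic per micro-batch
    )
    return writer.start()
```

A call such as `start_batch_style_query(df, handle_batch, timestamped_checkpoint("hdfs:///checkpoints/turnpike"))` would then start each run with a fresh checkpoint directory, which is the effect the time-based checkpoint option describes. The path shown is only an example value.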
