Kinesis ETL Source
The Kinesis data source lets you fetch data from an Amazon Kinesis stream.
Schema Type
See Provide Schema for ETL Source → to learn how to provide schema details for data sources.
After providing schema type details, the next step is to configure the data source.
Data Source Configuration
Connection Name: Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Kinesis, or create one as explained in Kinesis Connection →
Application Name: Name of the application, used for checkpointing.
Stream Name: Name of the Kinesis stream.
Shards Count: Number of shards to create the stream with, if the stream does not already exist.
EndPoint: The URL that serves as the entry point to Kinesis services.
Region: Name of the region, for example us-west-2.
Initial Position: In the absence of a Kinesis checkpoint, this is the worker's initial starting position in the stream.
The value is either the beginning of the stream (the oldest record still retained, per the Kinesis limit of 24 hours) or the tip of the stream.
TRIM_HORIZON: Use this option to read data from the beginning of the stream.
LATEST: Use this option to read only the latest, most recent records.
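The way a saved checkpoint takes precedence over Initial Position can be sketched as follows. This is a minimal illustration on an in-memory list, not the actual Kinesis or Gathr implementation; the function `start_index` and its parameters `retained` and `checkpoint` are hypothetical names introduced for the example.

```python
from typing import List, Optional


def start_index(retained: List[str], initial_position: str,
                checkpoint: Optional[int] = None) -> int:
    """Return the index of the first record a worker would read.

    retained: records still inside the retention window, oldest first.
    checkpoint: index of the last record already processed, if one was saved.
    """
    if checkpoint is not None:
        # A saved checkpoint wins; Initial Position is only a fallback.
        return checkpoint + 1
    if initial_position == "TRIM_HORIZON":
        return 0  # oldest record still retained
    if initial_position == "LATEST":
        return len(retained)  # tip of the stream: only records added later
    raise ValueError(f"unknown initial position: {initial_position}")


records = ["r0", "r1", "r2", "r3"]
print(records[start_index(records, "TRIM_HORIZON"):])          # ['r0', 'r1', 'r2', 'r3']
print(records[start_index(records, "LATEST"):])                # []
print(records[start_index(records, "LATEST", checkpoint=1):])  # ['r2', 'r3']
```

With LATEST and no checkpoint, nothing already in the stream is read; the worker only sees records that arrive after it starts.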
Checkpoint Interval: Interval at which Kinesis checkpoints are written.
Checkpointing allows the system to recover from failures and continue processing where the stream left off.
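The trade-off behind the checkpoint interval can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that checkpoints are taken at fixed multiples of the interval and that time is measured in seconds; this article does not state the unit of the field, and `last_checkpoint` and `replayed` are hypothetical helper names.

```python
def last_checkpoint(crash_time: int, interval: int) -> int:
    """Time of the most recent checkpoint at or before the crash.

    Assumes checkpoints happen at 0, interval, 2 * interval, and so on.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return (crash_time // interval) * interval


def replayed(crash_time: int, interval: int) -> int:
    """Span of stream time that must be read again after recovery."""
    return crash_time - last_checkpoint(crash_time, interval)


# A worker that fails 95 seconds in, with two different intervals.
for interval in (10, 60):
    print(interval, last_checkpoint(95, interval), replayed(95, interval))
# 10 90 5
# 60 60 35
```

A shorter interval means less data is reprocessed after a failure, at the cost of writing checkpoints more often.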
Storage Level: Flag that controls how data is stored. Select one of the following values:
MEMORY_ONLY
MEMORY_AND_DISK
MEMORY_ONLY_SER
MEMORY_AND_DISK_SER
MEMORY_ONLY_2
MEMORY_AND_DISK_2
DISK_ONLY
MEMORY_ONLY_SER_2
DISK_ONLY_2
MEMORY_AND_DISK_SER_2
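The storage level names appear to follow the Apache Spark StorageLevel naming convention, in which MEMORY and DISK name the storage media, a SER part means the in-memory copy is kept serialized, and a trailing 2 means the data is replicated on two nodes. Assuming that convention applies here, a name can be decoded as in the sketch below; `StorageLevelInfo` and `describe` are hypothetical names, not part of Gathr or Spark.

```python
from typing import NamedTuple


class StorageLevelInfo(NamedTuple):
    use_memory: bool
    use_disk: bool
    serialized_in_memory: bool
    replicas: int


def describe(level: str) -> StorageLevelInfo:
    """Decode a storage level name, assuming Spark's naming convention."""
    parts = level.split("_")
    use_memory = "MEMORY" in parts
    return StorageLevelInfo(
        use_memory=use_memory,
        use_disk="DISK" in parts,
        # SER only affects the copy held in memory.
        serialized_in_memory=use_memory and "SER" in parts,
        # A trailing "2" requests a second replica on another node.
        replicas=2 if parts[-1] == "2" else 1,
    )


print(describe("MEMORY_AND_DISK_SER_2"))
print(describe("DISK_ONLY"))
```

As a rough guide under the same assumption, serialized levels trade CPU time for a smaller memory footprint, and the replicated levels trade extra storage for faster recovery if a node is lost.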
Add Configuration: Additional properties can be added using the Add Configuration link.
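Taken together, the fields above can be pictured as a single configuration record. The sketch below is only an illustration of the kind of checks worth making before saving: the dictionary keys, the sample values, and the `validate` helper are hypothetical and not Gathr's API, and it treats every field as required, which is an assumption. The sample endpoint follows the usual AWS pattern `https://kinesis.<region>.amazonaws.com`.

```python
INITIAL_POSITIONS = {"TRIM_HORIZON", "LATEST"}
STORAGE_LEVELS = {
    "MEMORY_ONLY", "MEMORY_AND_DISK", "MEMORY_ONLY_SER", "MEMORY_AND_DISK_SER",
    "MEMORY_ONLY_2", "MEMORY_AND_DISK_2", "DISK_ONLY", "MEMORY_ONLY_SER_2",
    "DISK_ONLY_2", "MEMORY_AND_DISK_SER_2",
}


def validate(config: dict) -> list:
    """Return a list of problems; an empty list means the config looks sane."""
    problems = []
    for key in ("connection_name", "application_name", "stream_name",
                "endpoint", "region"):
        if not str(config.get(key, "")).strip():
            problems.append(f"{key} is required")
    if config.get("initial_position") not in INITIAL_POSITIONS:
        problems.append("initial_position must be TRIM_HORIZON or LATEST")
    if config.get("storage_level") not in STORAGE_LEVELS:
        problems.append("storage_level is not a listed value")
    for key in ("shards_count", "checkpoint_interval"):
        value = config.get(key)
        if not isinstance(value, int) or value < 1:
            problems.append(f"{key} must be a positive integer")
    return problems


# Sample values are made up for illustration.
example = {
    "connection_name": "my-kinesis-connection",
    "application_name": "orders-etl",
    "stream_name": "orders-stream",
    "shards_count": 2,
    "endpoint": "https://kinesis.us-west-2.amazonaws.com",
    "region": "us-west-2",
    "initial_position": "TRIM_HORIZON",
    "checkpoint_interval": 10,  # unit not stated in this article
    "storage_level": "MEMORY_AND_DISK_SER_2",
}
print(validate(example))  # []
```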
Detect Schema
Check the populated schema details. For more details, see Schema Preview →
Notes
Optionally, enter notes in the Notes → tab and save the configuration.