S3 Batch ETL Source
An S3 Batch ETL source reads data from a specified S3 bucket.
Schema Type
See the topic Provide Schema for ETL Source → to learn how schema details can be provided for data sources.
After providing schema type details, the next step is to configure the data source.
Data Source Configuration
Configure the data source parameters that are explained below.
Connection Name
Connections are service identifiers. Select a connection name from the list if you have already created and saved connection details for Amazon S3, or create a new connection as explained in the topic Amazon S3 Connection →
Bucket Name
Buckets are storage units used to store objects, which consist of data and metadata describing that data.
Path
File or directory path from which data is to be read. The path must end with * when a directory is specified.
Example: inputdir/*
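As an illustration of how a directory path ending in * selects objects, the trailing * can be read as "every object under this prefix". The sketch below is a minimal, hypothetical simulation of that selection over a list of object keys; it is not Gathr's internal implementation.

```python
# Hypothetical object keys in the bucket (illustration only).
keys = ["inputdir/orders.csv", "inputdir/sales.json", "report.csv"]

path = "inputdir/*"
prefix = path[:-1]  # strip the trailing * to get the directory prefix

# Select every object key under the directory prefix.
selected = [k for k in keys if k.startswith(prefix)]
print(selected)  # ['inputdir/orders.csv', 'inputdir/sales.json']
```

A path without a trailing * would instead point at a single file, e.g. inputdir/orders.csv.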
File Filter
Provide a file pattern (for example, *.csv or *.json) to retrieve the matching files.
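The file filter behaves like a glob pattern applied to each file name. A minimal sketch using Python's standard fnmatch module, with hypothetical file names, shows the idea (the actual matching engine used by Gathr is an assumption here):

```python
from fnmatch import fnmatch

# Hypothetical file names found under the configured path.
files = ["orders.csv", "events.json", "notes.txt"]

file_filter = "*.csv"  # e.g. use *.json to pick up JSON files instead

matched = [f for f in files if fnmatch(f, file_filter)]
print(matched)  # ['orders.csv']
```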
Recursive File Lookup
Check this option to retrieve files from the current folder and its sub-folders.
Add Configuration: Use this option to add custom S3 properties as key-value pairs.
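For example, assuming the underlying engine accepts Hadoop S3A options (an assumption; check which properties your Gathr deployment supports), a custom property entered as a key-value pair might look like:

```
Key:   fs.s3a.connection.ssl.enabled
Value: true
```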
Detect Schema
Check the populated schema details. For more details, see Schema Preview →
Incremental Read
Optionally, you can enable incremental read. For more details, see Amazon S3 Incremental Configuration →
Pre Action
To understand how to provide SQL queries or stored procedures that will be executed during the pipeline run, see Pre-Actions →
Notes
Optionally, enter notes in the Notes → tab and save the configuration.