S3 Batch ETL Source

An S3 Batch Channel lets you read data from a specified S3 bucket.

Schema Type

See Provide Schema for ETL Source → to learn how to provide schema details for data sources.

After providing schema type details, the next step is to configure the data source.

👉

When configuring S3 batch channels, you have the option to read Binary File data types in base64Encoded format. You can achieve this by either uploading a binary file or fetching it from the source.
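As an illustration of what base64-encoded binary data looks like, the following minimal Python sketch encodes a few raw bytes and decodes them back. The sample bytes and variable names are assumptions chosen for the example; this is not a description of how Gathr performs the encoding internally.

```python
import base64

# Hypothetical binary payload (the 8-byte PNG file signature), for illustration only.
raw_bytes = b"\x89PNG\r\n\x1a\n"

# Encode the binary content as a base64 ASCII string.
encoded = base64.b64encode(raw_bytes).decode("ascii")

# Decode the string to recover the original bytes.
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == raw_bytes)
```

A downstream consumer of base64-encoded records would apply the decode step to get the original file content back.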

Data Source Configuration

Configure the data source parameters described below.

Connection Name

Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Amazon S3. Otherwise, create one as explained in Amazon S3 Connection →

Bucket Name

Buckets are storage units used to store objects, which consist of data and the metadata that describes the data.

Path

File or directory path from which data is to be read. For a directory, the path must end with *.

Example: inputdir/*
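To show how a directory-style path with a trailing wildcard selects files, here is a minimal Python sketch using the standard library's `fnmatch` module. The object keys are hypothetical, and `fnmatch` is only a stand-in for illustration; the exact wildcard semantics applied by the S3 channel are not specified here and may differ (for example, in how nested folders are handled).

```python
from fnmatch import fnmatchcase

# Hypothetical object keys in a bucket, used only for illustration.
object_keys = [
    "inputdir/orders.csv",
    "inputdir/customers.csv",
    "archive/orders_2023.csv",
]

# Directory-style path ending with *, as in the example above.
path_pattern = "inputdir/*"

# Keep only the keys that the wildcard pattern selects.
matched = [key for key in object_keys if fnmatchcase(key, path_pattern)]
print(matched)
```

In this sketch only the two keys under `inputdir/` are selected, while the key under `archive/` is skipped.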

💡

For an incremental read, provide the exact directory path.

Add Configuration: Add custom S3 properties as key-value pairs.

Detect Schema

Check the populated schema details. For more details, see Schema Preview →

Incremental Read

Optionally, you can enable incremental read. For more details, see Amazon S3 Incremental Configuration →

Pre Action

To understand how to provide SQL queries or stored procedures that are executed during the pipeline run, see Pre-Actions →

Notes

Optionally, enter notes in the Notes → tab and save the configuration.
