Amazon S3 Ingestion Source

The Amazon S3 data source reads objects from Amazon S3 buckets.

Data Source Configuration

Configure the data source parameters as explained below.

Fetch From Source/Upload Data File

When designing the application, you can either fetch sample data from the Amazon S3 source by providing the data source connection details, or upload a sample data file in one of the supported formats. In both cases, the sample data is used to display the schema details during application design.

Upload Data File

If Upload Data File is selected to fetch sample data, provide the details below.

File Format

Select the sample file format (file type) depending on the data type.

The file formats that Gathr supports for the Amazon S3 data source are CSV, JSON, TEXT, Parquet, ORC, and AVRO.

For CSV file format, select its corresponding delimiter.

Header Included

Enable this option to read the first row as a header if your Amazon S3 data is in CSV format.
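
The effect of the Header Included option can be illustrated with a minimal Python sketch. It uses only the standard csv module and does not call any Gathr API. The sample content, the read_rows helper, and the col_0-style placeholder column names are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical sample content; the column names are illustrative only.
SAMPLE = "id,name\n1,alice\n2,bob\n"


def read_rows(text, header_included, delimiter=","):
    """Parse delimited text and return each data row as a dict."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if header_included:
        # The first row supplies the column names and is not data.
        columns, data = rows[0], rows[1:]
    else:
        # Assumption: without a header, placeholder names are generated
        # and every row, including the first, is treated as data.
        columns = [f"col_{i}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(columns, row)) for row in data]


with_header = read_rows(SAMPLE, header_included=True)
without_header = read_rows(SAMPLE, header_included=False)
```

With the option enabled, the sample yields two data rows keyed by id and name. With it disabled, the same file yields three rows, and the header text appears as ordinary data.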

Upload

Upload a sample file that matches the file format selected above.


Fetch From Source

If Fetch From Source is selected, continue configuring the data source.


Connection Name

Connections are service identifiers. Select a connection name from the list if you have already created and saved connection details for Amazon S3. Otherwise, create one as explained in the topic Amazon S3 Connection →

Use the Test Connection option to ensure that the connection with the Amazon S3 channel is established successfully.

A success message confirms that the connection is available. If the test connection fails, edit the connection to resolve the issue before proceeding.


Bucket Name

Buckets are storage units used to store objects, which consist of data and the metadata that describes the data.


Path

The file or directory path from which data is to be read. A directory path must end with *.

Example: outdir/*


File Filter

Provide a file pattern to retrieve the matching files. Example: *csv/*json
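
As a rough illustration of how a file pattern narrows down the files to read, the Python sketch below matches file names against a wildcard pattern. It assumes shell-style wildcard semantics (as in Python's fnmatch module); the exact matching rules that Gathr applies are not stated here. The object keys and the filter_keys helper are hypothetical.

```python
from fnmatch import fnmatchcase
import posixpath

# Hypothetical object keys; none of these come from a real bucket.
KEYS = ["outdir/a.csv", "outdir/b.json", "outdir/c.parquet"]


def filter_keys(keys, pattern):
    """Keep the keys whose file name matches a shell-style pattern."""
    return [k for k in keys if fnmatchcase(posixpath.basename(k), pattern)]


csv_keys = filter_keys(KEYS, "*csv")
json_keys = filter_keys(KEYS, "*json")
```

Under this assumption, the pattern *csv keeps only outdir/a.csv, and *json keeps only outdir/b.json.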


Recursive File Lookup

Select this option to retrieve files from the current folder and its sub-folders.
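
The difference a recursive lookup makes can be sketched in Python as follows. This is not Gathr's implementation: it assumes that, without the option, only files directly under the given folder are read, and that with it, files at any depth below the folder are included. The object keys and the select_keys helper are hypothetical.

```python
# Hypothetical object keys; none of these come from a real bucket.
KEYS = [
    "outdir/a.csv",
    "outdir/2024/b.csv",
    "outdir/2024/01/c.csv",
    "other/d.csv",
]


def select_keys(keys, directory, recursive):
    """Return the keys under a folder, optionally including sub-folders."""
    prefix = directory.rstrip("/") + "/"
    selected = []
    for key in keys:
        if not key.startswith(prefix):
            continue  # outside the requested folder
        remainder = key[len(prefix):]
        if not remainder:
            continue  # the folder placeholder itself, not a file
        # A "/" in the remainder means the file sits in a sub-folder.
        if recursive or "/" not in remainder:
            selected.append(key)
    return selected


top_level = select_keys(KEYS, "outdir", recursive=False)
all_levels = select_keys(KEYS, "outdir", recursive=True)
```

Here top_level contains only outdir/a.csv, while all_levels also includes the two files nested under outdir/2024.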


Add Configuration

Additional properties can be added as key-value pairs using this option.


Schema

Check the populated schema details. For more details, see Schema Preview →


Advanced Configuration

Optionally, you can enable incremental read. For more details, see Amazon S3 Incremental Configuration →
