GCS ETL Target

Gathr provides a GCS (Google Cloud Storage) emitter for writing pipeline output to a storage bucket.

Target Configuration

Save as Data Asset: Select the checkbox to save the schema as a data asset in Gathr.

Data Asset Name: Provide a name for the data asset to be saved.

Connection Name: Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for GCS. Otherwise, create one as explained in the GCS Connection topic.

Bucket Name: Provide the name of the Google Cloud Storage bucket to write to.

Path: Provide the sub-directory path within the bucket specified above to which the data is to be written.
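As an illustration of how Bucket Name and Path relate, the sketch below joins the two values into a single gs:// location. The function name gcs_uri and the sample values are hypothetical, and how Gathr combines the fields internally is not stated here, so treat this as an assumption rather than the product's behavior.

```python
def gcs_uri(bucket_name: str, path: str = "") -> str:
    """Join a bucket name and an optional sub-directory path into a gs:// URI.

    Hypothetical helper for illustration only; not part of Gathr.
    """
    bucket = bucket_name.strip()
    # Tolerate a bucket name that was pasted with its scheme.
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    bucket = bucket.strip("/")
    if not bucket or "/" in bucket:
        raise ValueError("bucket name must be a single non-empty name, not a path")
    # Leading and trailing slashes in the path are dropped before joining.
    sub_path = path.strip().strip("/")
    return f"gs://{bucket}/{sub_path}" if sub_path else f"gs://{bucket}"


print(gcs_uri("my-bucket", "/landing/orders/"))
```

The bucket name identifies the bucket only; any directory structure belongs in Path.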

Output Type: Select the format in which the output data will be written.

Delimiter: Select the message field separator if Output Type is set to Delimited.

Output Fields: Select the fields in the message that need to be a part of the output data.
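To show how Delimiter and Output Fields affect the written data, here is a minimal sketch using Python's standard csv module. The function to_delimited and the sample records are hypothetical; the sketch only mimics the idea of keeping the selected fields and separating them with the chosen delimiter, and does not reproduce Gathr's writer.

```python
import csv
import io


def to_delimited(records, output_fields, delimiter=","):
    """Render records as delimited text, keeping only the selected fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=output_fields,
        delimiter=delimiter,
        extrasaction="ignore",  # silently drop fields that were not selected
        lineterminator="\n",
    )
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


records = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Lin", "city": "Taipei"},
]
print(to_delimited(records, ["id", "city"], delimiter="|"))
```

Here the name field is omitted from the output because it is not among the selected output fields.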

Partitioning Required: To partition the data, select the checkbox and then, in the Partition Columns parameter, select the fields on which the data will be partitioned.
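The sketch below illustrates one common way partitioned output is laid out: a directory per partition column value, in the column=value style used by Hive and Spark. Whether the GCS emitter uses exactly this layout is an assumption, and partition_path and the sample record are hypothetical names chosen for illustration.

```python
def partition_path(base_path, record, partition_columns):
    """Build a column=value directory path for one record.

    Assumes a Hive-style layout; the emitter's actual layout may differ.
    """
    parts = [f"{column}={record[column]}" for column in partition_columns]
    return "/".join([base_path.rstrip("/")] + parts)


record = {"country": "IN", "year": 2024, "amount": 10}
print(partition_path("gs://my-bucket/orders", record, ["country", "year"]))
```

The order of the partition columns determines the nesting order of the directories.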

Save Mode: Specifies the expected behavior when saving data to the target. The available options are:

ErrorifExist: When persisting data, if the data already exists, an exception is expected to be thrown.

Append: When persisting data, if the data/table already exists, the new data is expected to be appended to the existing data.

Overwrite: When persisting data, if the data/table already exists, the existing data is expected to be overwritten by the new data.

Ignore: When persisting data, if the data/table already exists, the save operation is expected to not save the new data and to leave the existing data unchanged. This is similar to CREATE TABLE IF NOT EXISTS in SQL.
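The four save modes can be sketched against an in-memory store to make the differences concrete. This is a simplified model and not the emitter's implementation: the save function, the dictionary standing in for the storage target, and the sample rows are all hypothetical.

```python
def save(store, target, rows, mode):
    """Apply one of the four save modes to an in-memory stand-in for a target."""
    exists = target in store
    if mode == "ErrorifExist":
        if exists:
            raise FileExistsError(f"{target} already exists")
        store[target] = list(rows)
    elif mode == "Append":
        # Create the target if needed, then add the new rows to it.
        store.setdefault(target, []).extend(rows)
    elif mode == "Overwrite":
        # Replace whatever was there with the new rows.
        store[target] = list(rows)
    elif mode == "Ignore":
        # Write only when nothing exists yet; otherwise do nothing.
        if not exists:
            store[target] = list(rows)
    else:
        raise ValueError(f"unknown save mode: {mode}")
    return store[target]


store = {}
save(store, "orders", [1], "ErrorifExist")  # target is new, so it is created
save(store, "orders", [2], "Append")        # rows are added to existing data
save(store, "orders", [3], "Ignore")        # target exists, so nothing changes
print(store["orders"])
save(store, "orders", [4], "Overwrite")     # existing data is replaced
print(store["orders"])
```

Running ErrorifExist a second time against the same target raises an exception in this model, matching the description above.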

Post Action

To understand how to provide SQL queries or stored procedures that will be executed during the pipeline run, see Post-Actions.

Notes

Optionally, enter notes in the Notes tab and save the configuration.
