GCS Streaming ETL Source
Gathr provides a GCS (Google Cloud Storage) Streaming channel.
Permissions Required in GCS
The user must have Cloud Storage permissions to access and read data from the buckets used by the GCS data source.
Schema Type
See the topic Provide Schema for ETL Source to learn how schema details can be provided for data sources.
Data Source Configuration
After providing schema type details, the next step is to configure the data source.
Connection Name
Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for GCS. Otherwise, create one as explained in the topic GCS Connection.
Bucket Name
Provide the name of the Google Cloud Storage bucket to read files from.
Path
Provide the path within the bucket. If the path is a directory, end it with *. Example: outdir*
File Filter
Provide a file pattern to retrieve the matching files. Example: *csv or *json
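The effect of a file pattern such as *csv can be illustrated with a minimal Python sketch. It assumes glob-style matching on the file name, which may differ from how Gathr evaluates the filter internally. The function filter_files and the sample object names are hypothetical and used only for illustration.

```python
from fnmatch import fnmatchcase


def filter_files(names, pattern):
    """Return the names that match a glob-style pattern such as '*csv'.

    Assumption: case-sensitive glob matching; Gathr's own filter
    semantics may differ.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical file names found under the configured path.
names = ["orders.csv", "orders.json", "readme.txt", "events.json"]

csv_files = filter_files(names, "*csv")    # names ending in "csv"
json_files = filter_files(names, "*json")  # names ending in "json"

print(csv_files)
print(json_files)
```

Under these assumptions, a pattern without a leading * would only match a file whose name equals the pattern exactly, which is why the examples start with a wildcard.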
Recursive File Lookup
Select this option to retrieve files from the current folder and its subfolders.
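The difference between reading only the current folder and reading its subfolders as well can be sketched in Python as follows. This is an illustration under stated assumptions, not Gathr's implementation: object names are treated as flat keys with / as the folder separator, and select_files, its parameters, and the sample object names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(object_names, directory, pattern, recursive):
    """Pick objects under `directory` whose file name matches `pattern`.

    Assumption: with recursive=False only direct children of the
    directory are returned; with recursive=True nested folders are
    included as well.
    """
    prefix = directory.rstrip("/") + "/"
    selected = []
    for name in object_names:
        if not name.startswith(prefix):
            continue  # object lives outside the configured directory
        relative = name[len(prefix):]
        if not relative:
            continue  # skip the directory placeholder itself
        if not recursive and "/" in relative:
            continue  # object is in a subfolder; ignored when not recursive
        file_name = relative.rsplit("/", 1)[-1]
        if fnmatchcase(file_name, pattern):
            selected.append(name)
    return selected


# Hypothetical objects in a bucket.
objects = [
    "outdir/a.csv",
    "outdir/b.json",
    "outdir/2024/c.csv",
    "outdir/2024/01/d.csv",
    "other/e.csv",
]

flat = select_files(objects, "outdir", "*csv", recursive=False)
deep = select_files(objects, "outdir", "*csv", recursive=True)

print(flat)
print(deep)
```

In this sketch the non-recursive call returns only the CSV file directly inside outdir, while the recursive call also returns the CSV files in the nested folders.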
Detect Schema
Review the populated schema details. For more details, see Schema Preview.
Incremental Read
Optionally, you can enable incremental read. For more details, see GCS Incremental Configuration.
Pre Action
To understand how to provide SQL queries or stored procedures that are executed during the pipeline run, see Pre-Actions.
Notes
Optionally, enter notes in the Notes tab and save the configuration.