GCS (Batch and Streaming) Data Source

Gathr provides batch and streaming GCS (Google Cloud Storage) channels.

The GCS channel can read data in formats including JSON, CSV, TEXT, XML, Fixed Length, Binary, Parquet, and ORC.

The configuration fields for the GCS data source are described below:

Field: Description
Connection Name: Select the GCP connection name to use for establishing the connection.
Override Credentials: Check this option for user-specific actions.
Service Account Key File: Provide the GCP service account key file used to create the connection.
Bucket Name: Provide the name of the Google Cloud Storage bucket that contains the file(s).
Path: Provide the end path. For a directory, end the path with *. For example, outdir.*
File Filter: Provide a file pattern. Only files whose names match the pattern are included. For example, *.pdf or *emp *.csv
Recursive File Lookup: Check this option to retrieve files from the current folder and its sub-folder(s).
  • To add further configuration, click the ADD CONFIGURATION button.

  • Next, in the Detect Schema window, select the Save As Dataset checkbox to save the schema as a dataset.

  • The Incremental Read option is available in the GCS batch data source but not in the GCS streaming channel.
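
The way the Path, File Filter, and Recursive File Lookup fields narrow down which objects are read can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the file filter behaves like a shell-style glob applied to the file name alone, and that recursive lookup simply widens the search to sub-folders. The function `select_objects` and the sample object names are hypothetical and are not part of Gathr; the product's actual matching rules may differ.

```python
from fnmatch import fnmatchcase
from posixpath import basename, dirname


def select_objects(object_names, directory, file_filter="*", recursive=False):
    """Return object names under `directory` whose file name matches `file_filter`.

    Hypothetical helper: assumes glob-style matching on the file name only.
    """
    directory = directory.strip("/")
    selected = []
    for name in object_names:
        parent = dirname(name)
        # Objects directly inside the directory are always in scope.
        in_scope = parent == directory
        if recursive and not in_scope:
            # With recursive lookup, objects in sub-folders are in scope too.
            in_scope = directory == "" or parent.startswith(directory + "/")
        if in_scope and fnmatchcase(basename(name), file_filter):
            selected.append(name)
    return selected


# Made-up object names standing in for the contents of a bucket.
objects = [
    "outdir/emp_2023.csv",
    "outdir/report.pdf",
    "outdir/archive/emp_2022.csv",
    "other/emp_2021.csv",
]

# Current folder only.
print(select_objects(objects, "outdir", "*.csv"))
# Current folder plus sub-folders.
print(select_objects(objects, "outdir", "*.csv", recursive=True))
```

With the sample names above, the first call selects only `outdir/emp_2023.csv`, while the recursive call also picks up `outdir/archive/emp_2022.csv`. Objects outside `outdir` are never selected.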
