Snowflake Emitter

Snowflake, the cloud-based data warehousing system, can be used as an emitter in ETL pipelines. Configure the emitter as described below:

Connection Name: Select a connection name from the list of saved connections.
Override Credentials: Unchecked by default. Check this option to override the connection credentials for user-specific actions.
Username: Once the Override Credentials option is checked, provide the username through which the Delta SQL service is running.
Password: Provide the password for the override credentials.
Warehouse Name: Select the required warehouse from the list of warehouses available for the selected Snowflake connection.
Schema Name: Select the required schema from the Snowflake database schema list that appears for the selected Snowflake connection.
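
For context, these fields correspond to standard Snowflake connection parameters. The following is a minimal sketch using the snowflake-connector-python package; the account, user, and object names are placeholder assumptions, not values taken from this product.

```python
# Minimal sketch: how the emitter's connection fields map to Snowflake's
# Python connector. Every value below is a placeholder assumption.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.us-east-1",  # from the saved connection
    user="etl_user",              # Username (Override Credentials)
    password="etl_password",      # Password (Override Credentials)
    warehouse="ETL_WH",           # Warehouse Name
    database="ANALYTICS",         # database from the saved connection
    schema="PUBLIC",              # Schema Name
)
cursor = conn.cursor()
```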

Two table settings options are available: Use Existing Table or Create New Table.

If Use Existing Table is selected, then provide the below fields:

Table Name: Target tables contained in the previously selected schema are listed in a drop-down. Select the existing table into which the source data should be emitted.
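
For reference, the tables offered in this drop-down are those Snowflake reports for the selected schema. A sketch of the equivalent query, reusing the placeholder cursor from the first example:

```python
# Sketch: list the base tables in the selected schema; this mirrors what
# the drop-down shows. The schema name is the placeholder used earlier.
for (table_name,) in cursor.execute(
    "SELECT table_name FROM information_schema.tables "
    "WHERE table_schema = 'PUBLIC' AND table_type = 'BASE TABLE'"
):
    print(table_name)
```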

Next, the schema results will be displayed for mapping.

Schema Mapping when an Existing Table is used

There are two ways in which schema mapping can be provided for an existing table: Fetch From Target and Upload Schema File.

Fetch From Target

With Fetch From Target, you can directly fetch the schema to be applied from the Snowflake target.

Table column names will be populated, and the Auto Map option can be used to fill in the mapping values.

To make bulk changes to the mapping values, use the Download Mapping option to download the schema file, update the mapping values in the downloaded file, and then use the Upload Mapping option to provide the updated values.

Columns marked as ignored are not considered for mapping.
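
Conceptually, Fetch From Target reads the column metadata that Snowflake already holds for the table. A hedged sketch of an equivalent query, where the table name is a placeholder:

```python
# Sketch: fetch the target table's column metadata, which is essentially
# what Fetch From Target retrieves for mapping.
for row in cursor.execute("DESCRIBE TABLE PUBLIC.ORDERS"):
    column_name, column_type = row[0], row[1]
    print(column_name, column_type)
```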

Upload Schema File

The Upload Schema File option is preferable when the application is to be submitted on a registered cluster. Here, you can upload a sample schema to be applied to the Snowflake target table.

Use the Upload Schema option to provide a sample schema file, if you have one.

Then, the Auto Map option can be used to fill in the mapping values.

Otherwise, use the Download Sample Schema option to download a sample schema file. Update the mapping values in the downloaded file and then upload it to provide the schema mapping.

Columns marked as ignored are not considered for mapping.

If Create New Table is selected, configure the below fields:
Table Name: Provide the name of the table to be created in the previously selected schema. The source data will be emitted to this table.

Next, the Columns section will be displayed for mapping.

Use the Upload Schema option to provide a sample schema file.

Then, the Auto Map option can be used to fill in the mapping values.

Otherwise, use the Download Sample Schema option to download a sample schema file. Update the mapping values in the downloaded file and then upload it to provide the schema mapping.

Columns marked as ignored are not considered for mapping.

Each column can then be defined with its data type and, optionally, whether the column:

  • Requires a value (NOT NULL).

  • Has a default value.

  • Has an integrity constraint (primary key or unique key).

Additionally, you can provide foreign key constraints, specifying the reference table and reference columns, using the Add Foreign Key option, as illustrated in the sketch below.
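
To make these column options concrete, here is a minimal sketch of the kind of CREATE TABLE statement they translate to in Snowflake; all table, column, and type choices are illustrative assumptions, not output of the product.

```python
# Sketch: Snowflake DDL corresponding to the column options above.
# Names and types are illustrative, reusing the placeholder cursor.
cursor.execute("""
    CREATE TABLE PUBLIC.ORDERS (
        order_id    NUMBER      NOT NULL PRIMARY KEY,  -- requires a value; primary key
        order_code  VARCHAR(32) UNIQUE,                -- unique key
        status      VARCHAR(16) DEFAULT 'NEW',         -- default value
        customer_id NUMBER,
        FOREIGN KEY (customer_id)                      -- Add Foreign Key option
            REFERENCES PUBLIC.CUSTOMERS (customer_id)
    )
""")
```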

Continue to configure the new table options as follows:

Comments: Optional comments can be added for the table.
Cluster By: Specify the column or column expressions to be referenced as the clustering key of the table. To know more about data clustering in Snowflake, refer to the topic Snowflake Table Structures.
Data Retention Time: Specify the number of days for which the table data needs to be retained.
Change Tracking: If set to true, an individual table stream tracks the changes made to rows in the source table.
Copy Grants: Check-marking this box copies permissions from the table being replaced with CREATE OR REPLACE (if it already exists), rather than from the source table(s) being queried in the SELECT statement.
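
These settings map to standard Snowflake table parameters. A hedged sketch, with placeholder table and values:

```python
# Sketch: the table-level options expressed as Snowflake DDL parameters.
cursor.execute("""
    CREATE OR REPLACE TABLE PUBLIC.ORDERS (
        order_id NUMBER,
        created  TIMESTAMP_NTZ
    )
    CLUSTER BY (TO_DATE(created))                  -- Cluster By
    DATA_RETENTION_TIME_IN_DAYS = 1                -- Data Retention Time
    CHANGE_TRACKING = TRUE                         -- Change Tracking
    COPY GRANTS                                    -- Copy Grants
    COMMENT = 'orders emitted by the pipeline'     -- Comments
""")
```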

Table Options: Select the table creation method from the following:

Create table if not exists: A new table is created as per the provided configuration if it does not already exist in the target.
Truncate & create table: If the provided table name already exists in the Snowflake database, the table is truncated and then recreated, and the source data is written to it.
Truncate table: If the provided table name already exists in the Snowflake database, the table is truncated and the source data is written to it.
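
A sketch of the SQL these methods roughly correspond to (the table name is a placeholder):

```python
# Sketch: statements roughly equivalent to each table-creation method.
cursor.execute("CREATE TABLE IF NOT EXISTS PUBLIC.ORDERS (order_id NUMBER)")  # Create table if not exists
cursor.execute("CREATE OR REPLACE TABLE PUBLIC.ORDERS (order_id NUMBER)")     # Truncate & create table
cursor.execute("TRUNCATE TABLE IF EXISTS PUBLIC.ORDERS")                      # Truncate table
```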

Warehouse Settings: Cluster and Scaling

Configure Warehouse details to specify the compute resources. The available options are Use Existing Warehouse or Create New Warehouse.

If Use Existing Warehouse is selected, configure the below fields:

Warehouse Name: The warehouses available for the selected connection are listed here; select a warehouse from the list.

If Create New Warehouse is selected, configure the below fields:

Warehouse Name: Provide a name for the new warehouse to be created.

Warehouse Configuration

Warehouse Name: Provide the new warehouse name.
Warehouse Size: Select the preferred warehouse size.
Maximum Cluster Count: Specify the maximum number of clusters required for the warehouse.
Scaling Policy: Select the scaling policy, either Standard or Economy.
Auto Suspend: Provide the auto-suspend value for the warehouse, in seconds.
Auto Resume: Specifies whether the warehouse resumes automatically when a SQL statement (e.g. a query) is submitted to it.
Comments: Optional comments can be added for the warehouse.
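
These fields correspond to Snowflake's warehouse parameters. A hedged sketch of the DDL a new warehouse amounts to, with an assumed name and assumed values:

```python
# Sketch: Snowflake DDL corresponding to the warehouse configuration fields.
cursor.execute("""
    CREATE WAREHOUSE IF NOT EXISTS ETL_WH
        WAREHOUSE_SIZE = 'MEDIUM'     -- Warehouse Size
        MAX_CLUSTER_COUNT = 3         -- Maximum Cluster Count
        SCALING_POLICY = 'ECONOMY'    -- Scaling Policy
        AUTO_SUSPEND = 300            -- Auto Suspend (seconds)
        AUTO_RESUME = TRUE            -- Auto Resume
        COMMENT = 'created by the Snowflake emitter'  -- Comments
""")
```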

Advanced Settings

Advanced settings vary depending on whether the pipeline has a batch or a streaming data source.

Advanced settings options with a Batch Data Source in the pipeline:

Save Mode: Specifies how to handle the existing data in the target. The options are Append and Truncate and Overwrite.
Append: Contents of the schema are appended to the existing data/table.
Use External Storage: Select this option if you are doing an external data transfer and need to store the temporary data in S3.
External Storage: The temporary data will be stored in S3.
Connection Name: Connections are the service identifiers. Select a connection name from the list if you have previously created and saved connection details for Amazon S3.
Path: Optional. The S3 file or directory path where the temporary data will be stored.
Purge: Optional. The purge option can be used to delete the intermediate data from S3.
Add Configuration: Additional properties can be added using the Add Configuration link.
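
As an illustration of the batch write path, here is a hedged sketch using the open-source Spark Snowflake connector; the connector options, bucket, and table name are placeholder assumptions, and credentials for the S3 bucket would be configured separately.

```python
# Sketch: a batch write through the Spark Snowflake connector with an S3
# tempdir for external data transfer. All values are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("snowflake-emitter-sketch").getOrCreate()
df = spark.createDataFrame([(1, "NEW")], ["order_id", "status"])

sf_options = {
    "sfURL": "xy12345.us-east-1.snowflakecomputing.com",
    "sfUser": "etl_user",
    "sfPassword": "etl_password",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ETL_WH",
    "tempdir": "s3a://my-bucket/tmp/snowflake",  # Path (External Storage)
    "purge": "on",                               # Purge intermediate data
}

(df.write
   .format("net.snowflake.spark.snowflake")
   .options(**sf_options)
   .option("dbtable", "ORDERS")
   .mode("append")                               # Save Mode
   .save())
```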

Advanced settings options with a Streaming Data Source in the pipeline:

Streaming Stage Name: The stage name to be used for creating the stage in Snowflake.
Output Mode: The output mode to be used while writing the data to the streaming sink.
Append: Output mode in which only the new rows in the streaming data are written to the target.
Enable Trigger: A trigger defines how frequently the streaming query should be executed.
Processing Time: The trigger time interval, in minutes or seconds. This property appears only when the Enable Trigger checkbox is selected.
Add Configuration: Additional properties can be added using the Add Configuration link.
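
To illustrate the streaming semantics (output mode and trigger), here is a hedged Structured Streaming sketch that funnels each micro-batch through the batch connector via foreachBatch; the rate source, target table, and one-minute interval are placeholder assumptions.

```python
# Sketch: streaming write with Append output mode and a processing-time
# trigger, reusing spark and sf_options from the batch example above.
streaming_df = spark.readStream.format("rate").option("rowsPerSecond", 1).load()

def write_batch(batch_df, batch_id):
    # Each micro-batch is appended to the placeholder target table.
    (batch_df.write
        .format("net.snowflake.spark.snowflake")
        .options(**sf_options)
        .option("dbtable", "EVENTS")
        .mode("append")
        .save())

query = (streaming_df.writeStream
    .outputMode("append")                 # Output Mode: Append
    .trigger(processingTime="1 minute")   # Enable Trigger + Processing Time
    .foreachBatch(write_batch)
    .start())
```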