Vertica Emitter

The Vertica emitter supports Oracle, Postgres, MySQL, MSSQL, and DB2 connections.

You can configure and connect to the above-mentioned database engines over JDBC. This allows your data pipeline to emit data in batches to DB2 and the other supported databases once a JDBC channel is configured.

To use DB2, first create a working DB2 connection.
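Under the hood, such a connection resolves to standard JDBC settings. Here is a minimal Python sketch, where the host, port, database, and credentials are placeholder assumptions rather than product defaults:

jdbc_url = "jdbc:vertica://vertica-host:5433/analytics"  # Vertica's JDBC URL format
connection_properties = {
    "user": "dbadmin",                    # placeholder credentials
    "password": "secret",
    "driver": "com.vertica.jdbc.Driver",  # Vertica's JDBC driver class
}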

Vertica Emitter Configuration

To add a Vertica emitter to your pipeline, drag it onto the canvas and connect it to a Data Source or processor.

The configuration settings of the Vertica emitter are as follows:

Connection Name: All Vertica connections are listed here. Select a connection for connecting to Vertica.

Message Name: The name of the message configuration that will act as metadata for the actual data.

Table Name: The name of an existing table in the specified database.

Is Batch Enabled: Enable this parameter to batch multiple messages and improve write performance.

Batch Size: The number of rows to insert per round trip. This can improve performance on JDBC drivers, applies only to writing, and defaults to 1000 (illustrated in the sketch after this list).

Connection Retries: The number of retries for the component connection. Possible values are -1, 0, or a positive number, where -1 denotes infinite retries.
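As an illustration of how Is Batch Enabled and Batch Size map onto a plain Spark JDBC write, here is a generic PySpark sketch (not the emitter's internal code; the input path, table name, and batch size are assumed values, and jdbc_url and connection_properties come from the earlier sketch):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("vertica-emitter-sketch").getOrCreate()
df = spark.read.parquet("/tmp/events")   # assumed input data

(df.write
   .format("jdbc")
   .option("url", jdbc_url)              # connection settings from the earlier sketch
   .option("dbtable", "public.events")   # existing table in the target database
   .option("batchsize", 5000)            # rows inserted per round trip (default 1000)
   .options(**connection_properties)
   .mode("append")
   .save())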

If Routing Required is set to true, then:

Routing Policy: A JSON object defining the custom routing policy. Example: {"1":{"company":{"Google":20.0,"Apple":80.0}}}

Here 1 is the timestamp after which the custom routing policy becomes active, company is the field name, and the value Google takes 20% of the shards while the value Apple takes 80%.
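To make the policy structure concrete, here is a small illustrative Python sketch that parses the example policy above and checks that the shard percentages for each field total 100 (the parsing logic is an assumption for illustration, not the emitter's implementation):

import json

policy = json.loads('{"1": {"company": {"Google": 20.0, "Apple": 80.0}}}')

for activation_ts, fields in policy.items():
    for field, weights in fields.items():
        assert sum(weights.values()) == 100.0, "shard percentages should total 100"
        for value, pct in weights.items():
            print(f"after ts={activation_ts}, {field}={value} takes {pct}% of shards")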
Save Mode: Specifies the expected behavior of saving data to a data sink.

ErrorIfExists: When persisting data, if the data already exists, an exception is expected to be thrown.

Append: When persisting data, if the data/table already exists, the contents of the schema are expected to be appended to the existing data.

Overwrite: When persisting data, if the data/table already exists, the existing data is expected to be overwritten by the contents of the data.

Ignore: When persisting data, if the data/table already exists, the save operation is expected to neither save the contents of the data nor change the existing data. This is similar to CREATE TABLE IF NOT EXISTS in SQL.
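These four options correspond to Spark's standard save modes. A generic Spark sketch follows (the emitter applies the selected mode internally, so this is illustrative only; df, jdbc_url, and connection_properties come from the earlier sketches and the table name is assumed):

# "append" | "overwrite" | "ignore" | "errorifexists" map to the options above.
(df.write
   .format("jdbc")
   .option("url", jdbc_url)             # settings from the earlier sketch
   .option("dbtable", "public.events")
   .options(**connection_properties)
   .mode("ignore")                      # behaves like CREATE TABLE IF NOT EXISTS
   .save())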
Ignore Missing Values: Ignore or persist empty or null values of message fields in the sink.

Delay Between Connection Retries: Defines the retry delay interval for the component connection, in milliseconds.

Enable TTL: When selected, data will be discarded to the specified TTL exchange.

Checkpoint Storage Location: Select the checkpointing storage location. Available options are HDFS, S3, and EFS.

Checkpoint Connections: Select the connection. Connections are listed corresponding to the selected storage location.

Checkpoint Directory: The path where the Spark application stores the checkpointing data.



For HDFS and EFS, enter a relative path like /user/hadoop/checkpointingDir; the system will add a suitable prefix by itself.

For S3, enter an absolute path like: S3://BucketName/checkpointingDir
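In Structured Streaming terms, the checkpoint directory is passed as the checkpointLocation option. A generic sketch, reusing spark from the earlier sketch, with an assumed bucket name and Spark's built-in rate source and file sink standing in for the real pipeline:

stream_df = spark.readStream.format("rate").load()   # built-in test source

query = (stream_df.writeStream
    .format("parquet")                               # stand-in sink for illustration
    .option("path", "s3a://BucketName/output")
    .option("checkpointLocation", "s3a://BucketName/checkpointingDir")
    .start())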
Output Mode: The output mode to be used while writing the data to the streaming sink.

Append: Output mode in which only the new rows in the streaming data will be written to the sink.

Complete Mode: Output mode in which all the rows in the streaming data will be written to the sink every time there are some updates.

Update Mode: Output mode in which only the rows that were updated in the streaming data will be written to the sink every time there are some updates.
Enable Trigger: A trigger defines how frequently a streaming query should be executed.
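The three output modes and the trigger map directly onto Spark's DataStreamWriter API. A generic sketch, reusing stream_df from above and a console sink for illustration:

query = (stream_df.writeStream
    .outputMode("update")                    # "append" | "complete" | "update"
    .trigger(processingTime="30 seconds")    # how frequently the query executes
    .format("console")
    .start())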
Schema Results

Table Column Name: The name of the column populated from the selected table.

Mapping Value: Map a corresponding value to the column.

Database Data Type: The data type of the mapped value.

Ignore All: Select the Ignore All checkbox to ignore all the Schema Results, or select the checkbox adjacent to a column to ignore that column from the Schema Results.



Use Ignore All or the selected fields while pushing data to the emitter.



This will add that field as part of the partition fields while creating the table.
Auto Fill: Auto Fill automatically populates and maps all incoming schema fields to the fetched table columns. The left side shows the table columns and the right side shows the incoming schema fields.

If a field matching a table column is not found in the incoming schema, the first field is selected by default.

Download Mapping: Downloads the mappings of schema fields and table columns to a file.

Upload Mapping: Uploading the mapping file automatically populates the table columns and schema fields.