Vertica Emitter

The Vertica emitter supports Oracle, Postgres, MySQL, MSSQL, and DB2 connections.

You can configure and connect to the above-mentioned database engines over JDBC. This allows your data pipeline to emit data in batches to DB2 and the other supported databases once a JDBC channel is configured.

To use DB2, first create a working DB2 connection.
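Under the hood, such a connection resolves to standard JDBC settings. Here is a minimal Python sketch, where the host, port, database, and credentials are placeholder assumptions rather than product defaults:

jdbc_url = "jdbc:vertica://vertica-host:5433/analytics"  # Vertica's JDBC URL format
connection_properties = {
    "user": "dbadmin",                    # placeholder credentials
    "password": "secret",
    "driver": "com.vertica.jdbc.Driver",  # Vertica's JDBC driver class
}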

Vertica Emitter Configuration

To add a Vertica emitter to your pipeline, drag it onto the canvas and connect it to a Data Source or processor.

The configuration settings of the Vertica emitter are as follows:

Connection Name: All Vertica connections are listed here. Select a connection for connecting to Vertica.

Message Name: The name of the message configuration that will act as metadata for the actual data.

Table Name: The name of an existing table in the specified database.

Is Batch Enabled: Enable this parameter to batch multiple messages and improve write performance.

Batch Size: The number of rows to insert per round trip. This can improve performance on JDBC drivers, applies only to writing, and defaults to 1000 (illustrated in the sketch after this list).

Connection Retries: The number of retries for the component connection. Possible values are -1, 0, or a positive number, where -1 denotes infinite retries.
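As an illustration of how Is Batch Enabled and Batch Size map onto a plain Spark JDBC write, here is a generic PySpark sketch (not the emitter's internal code; the input path, table name, and batch size are assumed values, and jdbc_url and connection_properties come from the earlier sketch):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("vertica-emitter-sketch").getOrCreate()
df = spark.read.parquet("/tmp/events")   # assumed input data

(df.write
   .format("jdbc")
   .option("url", jdbc_url)              # connection settings from the earlier sketch
   .option("dbtable", "public.events")   # existing table in the target database
   .option("batchsize", 5000)            # rows inserted per round trip (default 1000)
   .options(**connection_properties)
   .mode("append")
   .save())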

If Routing Required is set to true, then:

Routing Policy: A JSON object defining the custom routing policy. Example: {"1":{"company":{"Google":20.0,"Apple":80.0}}}

Here 1 is the timestamp after which the custom routing policy becomes active, company is the field name, and the value Google takes 20% of the shards while the value Apple takes 80%.
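To make the policy structure concrete, here is a small illustrative Python sketch that parses the example policy above and checks that the shard percentages for each field total 100 (the parsing logic is an assumption for illustration, not the emitter's implementation):

import json

policy = json.loads('{"1": {"company": {"Google": 20.0, "Apple": 80.0}}}')

for activation_ts, fields in policy.items():
    for field, weights in fields.items():
        assert sum(weights.values()) == 100.0, "shard percentages should total 100"
        for value, pct in weights.items():
            print(f"after ts={activation_ts}, {field}={value} takes {pct}% of shards")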
Save Mode: Specifies the expected behavior of saving data to a data sink.

ErrorIfExists: When persisting data, if the data already exists, an exception is expected to be thrown.

Append: When persisting data, if the data/table already exists, the contents of the schema are expected to be appended to the existing data.

Overwrite: When persisting data, if the data/table already exists, the existing data is expected to be overwritten by the contents of the data.

Ignore: When persisting data, if the data/table already exists, the save operation is expected to neither save the contents of the data nor change the existing data. This is similar to CREATE TABLE IF NOT EXISTS in SQL.
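These four options correspond to Spark's standard save modes. A generic Spark sketch follows (the emitter applies the selected mode internally, so this is illustrative only; df, jdbc_url, and connection_properties come from the earlier sketches and the table name is assumed):

# "append" | "overwrite" | "ignore" | "errorifexists" map to the options above.
(df.write
   .format("jdbc")
   .option("url", jdbc_url)             # settings from the earlier sketch
   .option("dbtable", "public.events")
   .options(**connection_properties)
   .mode("ignore")                      # behaves like CREATE TABLE IF NOT EXISTS
   .save())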
Ignore Missing Values: Ignore or persist empty or null values of message fields in the sink.

Delay Between Connection Retries: Defines the retry delay interval for the component connection, in milliseconds.

Enable TTL: When selected, data will be discarded to the specified TTL exchange.

Checkpoint Storage Location: Select the checkpointing storage location. Available options are HDFS, S3, and EFS.

Checkpoint Connections: Select the connection. Connections are listed corresponding to the selected storage location.

Checkpoint Directory: The path where the Spark application stores the checkpointing data.



For HDFS and EFS, enter a relative path like /user/hadoop/checkpointingDir; the system will add a suitable prefix by itself.

For S3, enter an absolute path like: S3://BucketName/checkpointingDir
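In Structured Streaming terms, the checkpoint directory is passed as the checkpointLocation option. A generic sketch, reusing spark from the earlier sketch, with an assumed bucket name and Spark's built-in rate source and file sink standing in for the real pipeline:

stream_df = spark.readStream.format("rate").load()   # built-in test source

query = (stream_df.writeStream
    .format("parquet")                               # stand-in sink for illustration
    .option("path", "s3a://BucketName/output")
    .option("checkpointLocation", "s3a://BucketName/checkpointingDir")
    .start())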
Output Mode: The output mode to be used while writing the data to the streaming sink.

Append: Output mode in which only the new rows in the streaming data will be written to the sink.

Complete Mode: Output mode in which all the rows in the streaming data will be written to the sink every time there are some updates.

Update Mode: Output mode in which only the rows that were updated in the streaming data will be written to the sink every time there are some updates.
Enable Trigger: A trigger defines how frequently a streaming query should be executed.
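The three output modes and the trigger map directly onto Spark's DataStreamWriter API. A generic sketch, reusing stream_df from above and a console sink for illustration:

query = (stream_df.writeStream
    .outputMode("update")                    # "append" | "complete" | "update"
    .trigger(processingTime="30 seconds")    # how frequently the query executes
    .format("console")
    .start())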
Schema Results

Table Column Name: The name of the column populated from the selected table.

Mapping Value: Map a corresponding value to the column.

Database Data Type: The data type of the mapped value.

Ignore All: Select the Ignore All checkbox to ignore all the Schema Results, or select the checkbox adjacent to a column to ignore that column from the Schema Results.



Use Ignore All or the selected fields while pushing data to the emitter.



This will add that field as part of the partition fields while creating the table.
Auto Fill: Auto Fill automatically populates and maps all incoming schema fields to the fetched table columns. The left side shows the table columns and the right side shows the incoming schema fields.

If a field matching a table column is not found in the incoming schema, the first field is selected by default.

Download Mapping: Downloads the mappings of schema fields and table columns to a file.

Upload Mapping: Uploading the mapping file automatically populates the table columns and schema fields.