Cassandra ETL Target
The Cassandra ETL Target lets you emit transformed data into your Cassandra accounts.
Target Configuration
Configure the data emitter parameters as explained below.
Connection Name
Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Cassandra, or create one as explained in the topic Cassandra Connection →
Use the Test Connection option to ensure that the connection with the Cassandra channel is established successfully.
A success message confirms that the connection is available. If the test fails, edit the connection to resolve the issue before proceeding.
KeySpace
Define a new or existing KeySpace and its replication strategy.
Output Fields
Fields in the message that should be a part of the output data.
Ignore Missing Values
Choose whether empty or null values of message fields are ignored or persisted by the emitter.
Key Columns
The columns that form the primary key. A single primary key consists of the partition key only; a compound primary key consists of the partition key and one or more additional columns that determine clustering.
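To make the difference between partition and clustering columns concrete, here is a minimal sketch that builds the CQL PRIMARY KEY clause for a set of key columns. The function name and the column names (device_id, event_time, tenant_id) are hypothetical and used for illustration only; this is not how Gathr generates its statements internally.

```python
def primary_key_clause(partition_keys, clustering_columns=()):
    """Build a CQL PRIMARY KEY clause from partition and clustering columns."""
    if not partition_keys:
        raise ValueError("at least one partition key column is required")
    partition = ", ".join(partition_keys)
    # A partition key made of several columns needs its own parentheses in CQL.
    if len(partition_keys) > 1:
        partition = f"({partition})"
    columns = [partition, *clustering_columns]
    return f"PRIMARY KEY ({', '.join(columns)})"


# Hypothetical column names, for illustration only.
single = primary_key_clause(["device_id"])
compound = primary_key_clause(["device_id"], ["event_time"])
composite = primary_key_clause(["tenant_id", "device_id"], ["event_time"])
```

Here single has only a partition key, compound adds one clustering column, and composite uses two columns together as the partition key.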
Table Name Expression
JavaScript expression used to evaluate the table name. The KeySpace is formed as ns_ + {tenantId}.
Example: ns_1
Use field aliases instead of field names in the expression when you want to perform field-based partitioning.
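The keyspace naming rule is plain string concatenation. The sketch below shows it in Python purely for illustration; the expression field itself expects its own expression syntax, and the function name here is hypothetical.

```python
def keyspace_for_tenant(tenant_id):
    """Form the keyspace name as 'ns_' followed by the tenant ID."""
    return f"ns_{tenant_id}"


example = keyspace_for_tenant(1)  # "ns_1", as in the example above
```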
Consistency Level
Consistency Level refers to how up-to-date and synchronized a row of Cassandra data is on all of its replicas.
ALL: Requires all replicas to respond, ensuring the highest consistency but lowest availability.
EACH_QUORUM: Requires a quorum of replicas in each datacenter to respond, used for strong consistency across multiple datacenters.
QUORUM: Requires a majority of replicas across all datacenters to respond, balancing consistency and availability.
LOCAL_QUORUM: Requires a majority of replicas in the local datacenter to respond, reducing inter-datacenter latency.
ONE: Requires one replica to respond, offering the highest availability but lowest consistency.
TWO: Requires two replicas to respond, providing a middle ground between ONE and THREE.
THREE: Requires three replicas to respond, offering higher consistency than ONE and TWO.
LOCAL_ONE: Requires one replica in the local datacenter to respond, avoiding cross-datacenter latency.
SERIAL: Ensures linearizable consistency for lightweight transactions, requiring a quorum for reads.
LOCAL_SERIAL: Similar to SERIAL but confined to the local datacenter, ensuring linearizable consistency within a single datacenter.
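The trade-off between these levels can be worked out from the replication factor. The sketch below assumes a single-datacenter cluster, where QUORUM, LOCAL_QUORUM and EACH_QUORUM all resolve to the same majority, and leaves out SERIAL and LOCAL_SERIAL because they apply to lightweight transactions. The function names are hypothetical. A quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap with the latest write when the read and write replica counts add up to more than RF.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replicas that must respond, assuming a single-datacenter cluster."""
    fixed = {"ONE": 1, "LOCAL_ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "ALL":
        return replication_factor
    if level in ("QUORUM", "LOCAL_QUORUM", "EACH_QUORUM"):
        return quorum(replication_factor)
    raise ValueError(f"unsupported level: {level}")


def is_strongly_consistent(read_level, write_level, replication_factor):
    """True when read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, for example, writing and reading at QUORUM needs 2 + 2 responses, which exceeds 3, whereas writing and reading at ONE does not.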
Replication Strategy
A replication strategy specifies the implementation class for determining the nodes where replicas are placed. Possible strategies are ‘SimpleStrategy’ and ‘NetworkTopologyStrategy’.
Replication Factor
Defines how many copies of the data will be present in the cluster.
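For reference, the replication strategy and factor correspond to the replication options of a CQL CREATE KEYSPACE statement. The following is a minimal sketch that assembles such a statement as a string; the function name and the datacenter names (dc1, dc2) are hypothetical, and the statement Gathr actually issues may differ.

```python
def create_keyspace_cql(keyspace, strategy, replication):
    """Build a CREATE KEYSPACE statement.

    replication is an int for SimpleStrategy, or a dict mapping
    datacenter name to replica count for NetworkTopologyStrategy.
    """
    if strategy == "SimpleStrategy":
        options = {"class": strategy, "replication_factor": int(replication)}
    elif strategy == "NetworkTopologyStrategy":
        options = {"class": strategy, **replication}
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Strings are quoted, replica counts are written as plain integers.
    body = ", ".join(
        f"'{key}': '{value}'" if isinstance(value, str) else f"'{key}': {value}"
        for key, value in options.items()
    )
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{body}}}"


simple = create_keyspace_cql("ns_1", "SimpleStrategy", 3)
# Hypothetical datacenter names, for illustration only.
multi_dc = create_keyspace_cql("ns_1", "NetworkTopologyStrategy", {"dc1": 3, "dc2": 2})
```

SimpleStrategy takes one replication factor for the whole cluster, while NetworkTopologyStrategy takes a replica count per datacenter.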
Enable TTL
Select this option to enable TTL (time to live), which limits the lifetime of the data.
TTL Value
Provide the TTL value in seconds.
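In CQL, a TTL is attached to a write with the USING TTL clause. The sketch below builds a parameterised INSERT with an optional TTL in seconds; the function, table and column names are hypothetical and only illustrate the idea.

```python
def insert_cql(table, columns, ttl_seconds=None):
    """Build a parameterised INSERT, optionally expiring the row after ttl_seconds."""
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    stmt = f"INSERT INTO {table} ({cols}) VALUES ({marks})"
    if ttl_seconds is not None:
        if ttl_seconds < 0:
            raise ValueError("TTL cannot be negative")
        stmt += f" USING TTL {int(ttl_seconds)}"
    return stmt


# Hypothetical table and columns; 86400 seconds is one day.
with_ttl = insert_cql("ns_1.events", ["id", "payload"], ttl_seconds=86400)
without_ttl = insert_cql("ns_1.events", ["id", "payload"])
```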
Batch Size
Number of records to pick per batch for inserting into Cassandra.
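As an illustration of what a batch size means, the sketch below splits a stream of records into groups of at most batch_size items. It is a generic chunking helper with a hypothetical name, not the emitter's actual implementation.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


groups = list(batches(range(7), 3))  # [[0, 1, 2], [3, 4, 5], [6]]
```

The last batch may be smaller than the configured size when the number of records is not an exact multiple of it.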
Save Mode
Save Mode is used to specify the expected behavior of saving data to Cassandra.
ErrorifExists: When persisting data, if the data already exists, an exception is expected to be thrown.
Append: When persisting data, if data/table already exists, contents of the Data are expected to be appended to existing data.
Overwrite: When persisting data, if data/table already exists, existing data is expected to be overwritten by the contents of the Data.
Ignore: When persisting data, if data/table already exists, the save operation is expected to not save the contents of the Data and to not change the existing data.
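The four modes can be summarised with a small in-memory model. This is a simplified sketch under stated assumptions: the table is a plain list of rows, None stands for a table that does not exist yet, and the function name is hypothetical. It ignores primary keys, whereas in Cassandra a row written with an existing primary key replaces the earlier row.

```python
def save(existing_rows, new_rows, mode):
    """Return the table contents after saving new_rows under the given mode.

    existing_rows is None when the table does not exist yet.
    """
    if existing_rows is None:
        return list(new_rows)  # nothing to conflict with, in any mode
    if mode == "ErrorifExists":
        raise RuntimeError("table already exists")
    if mode == "Append":
        return list(existing_rows) + list(new_rows)
    if mode == "Overwrite":
        return list(new_rows)
    if mode == "Ignore":
        return list(existing_rows)
    raise ValueError(f"unknown save mode: {mode}")


appended = save(["a"], ["b"], "Append")        # ["a", "b"]
overwritten = save(["a"], ["b"], "Overwrite")  # ["b"]
ignored = save(["a"], ["b"], "Ignore")         # ["a"]
```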
The schema is listed according to the configured connection.
Select the schema to which the data should be emitted.
Add Configuration: Additional properties can be added using this option as key-value pairs.
Post Action
To understand how to provide SQL queries or stored procedures that are executed during a pipeline run, see Post-Actions →
Notes
Optionally, enter notes in the Notes → tab and save the configuration.