Neo4j ETL Target
The Neo4j ETL Target allows you to emit the transformed data into your Neo4j database.
Target Configuration
Configure the data emitter parameters as explained below.
Connection Name
Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Neo4j, or create one as explained in the topic Neo4j Connection →
Use the Test Connection option to ensure that the connection with Neo4j is established successfully.
A success message confirms that the connection is available. If the test connection fails, edit the connection to resolve the issue before proceeding.
Database Name
The Neo4j database to which the data will be written by the emitter.
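For orientation, the sketch below shows how the connection, database, and a write target come together in code. It is a minimal PySpark sketch that assumes the open-source Neo4j Spark Connector (org.neo4j.spark.DataSource); the URL, credentials, database name, and label are placeholders, and the mapping of the UI fields to these connector options is an assumption, not a statement about Gathr's internals.

```python
# Minimal sketch, assuming the Neo4j Spark Connector is on the Spark classpath.
# All connection details and names below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("neo4j-target-sketch").getOrCreate()

# Small example DataFrame standing in for the transformed pipeline data.
df = spark.createDataFrame(
    [("Jane", "Doe"), ("John", "Smith")],
    ["name", "surname"],
)

(df.write
    .format("org.neo4j.spark.DataSource")
    .option("url", "neo4j://localhost:7687")            # placeholder connection endpoint
    .option("authentication.basic.username", "neo4j")   # placeholder credentials
    .option("authentication.basic.password", "secret")
    .option("database", "neo4j")                        # the Database Name field
    .option("labels", ":Person")                        # one of the write modes described below
    .mode("append")
    .save())
```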
Write Mode
Specifies how the data is written to Neo4j. The available options are:
Cypher Query
Nodes
Relationships
Cypher Query
Persists the entire dataset using the provided Cypher query. Example: CREATE (n:Person {fullName: event.name + event.surname}).
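As a rough equivalent in code, the sketch below uses the Neo4j Spark Connector's query option, where each DataFrame row is bound to the Cypher parameter event; the connection details and batch size are placeholders, and the correspondence to Gathr's Cypher Query mode is an assumption.

```python
# Sketch of the Cypher Query write mode, assuming the Neo4j Spark Connector.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("Jane", "Doe")], ["name", "surname"])

(df.write
    .format("org.neo4j.spark.DataSource")
    .option("url", "neo4j://localhost:7687")   # placeholder connection details
    .option("query",
            "CREATE (n:Person {fullName: event.name + event.surname})")
    .option("batch.size", "5000")              # rows sent to Neo4j per batch
    .mode("append")
    .save())
```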
Node
Persists the entire dataset as nodes. The nodes are sent to Neo4j in batches of the size defined in the Batch Size field.
Node Keys
The key:value pairs, where the key is the DataFrame column name and the value is the node property name.
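The sketch below illustrates node keys using the Neo4j Spark Connector's labels and node.keys options, assumed here to mirror the Nodes write mode and the Node Keys field; the column:property mapping and connection details are placeholders.

```python
# Sketch of the Nodes write mode with node keys, assuming the Neo4j Spark Connector.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "Jane Doe")], ["person_id", "full_name"])

(df.write
    .format("org.neo4j.spark.DataSource")
    .option("url", "neo4j://localhost:7687")   # placeholder connection details
    .option("labels", ":Person")
    .option("node.keys", "person_id:id")       # DataFrame column : node property
    .option("batch.size", "5000")
    .mode("overwrite")                         # merge on the node keys instead of creating duplicates
    .save())
```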
Relationship
Gathr provides an option to define the type of relationship. For example, in a given dataset you can map columns to source and target nodes and create the specified relationship between them. The fields below configure the relationship, its source node, and its target node; a combined sketch follows the last of these fields.
Relationship Save Strategy
There are two strategies you can use to write relationships: Native (default strategy) and Keys.
Relationship Properties
Map of key:value pairs specifying the relationship properties. Used only if the Relationship Save Strategy is set to Keys.
Note: Relationship Properties field will appear if Relationship Save Strategy is set to ‘Keys’.
Source Label
Colon-separated list of labels to attach to the source node.
Source Save Mode
Save mode for the source node, selected from the options below:
Append: When persisting data, if the data/table already exists, the incoming records are appended to the existing data.
Overwrite: When persisting data, if the data/table already exists, the existing data is overwritten by the incoming records.
Match: Performs a match against existing nodes; the source node must already exist.
Source Node Keys
Map used as keys for matching the source node.
Source Node Properties
Map of key:value pairs specifying the source node properties. Used only if the Relationship Save Strategy is set to Keys.
Note: Source Node Properties field will appear if Relationship Save Strategy is set to ‘Keys’.
Target Labels
Colon-separated list of labels that identify the target node.
Target Save Mode
Save mode for the target node, selected from the options below:
Append: When persisting data, if the data/table already exists, the incoming records are appended to the existing data.
Overwrite: When persisting data, if the data/table already exists, the existing data is overwritten by the incoming records.
Target Node Keys
Map used as keys for matching the target node.
Target Node Properties
Map of key:value pairs specifying the target node properties. Used only if the Relationship Save Strategy is set to Keys.
Note: Target Node Properties field will appear if Relationship Save Strategy is set to ‘Keys’.
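Bringing the relationship fields together, the sketch below maps each field to the corresponding Neo4j Spark Connector option as a working assumption; the relationship type, labels, keys, properties, and connection details are all illustrative placeholders.

```python
# Combined sketch of the Relationships write mode with the Keys save strategy,
# assuming the Neo4j Spark Connector's relationship.* options.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(1, "Jane Doe", 10, "Laptop", 2)],
    ["customer_id", "customer_name", "product_id", "product_name", "quantity"],
)

(df.write
    .format("org.neo4j.spark.DataSource")
    .option("url", "neo4j://localhost:7687")                             # placeholder
    .option("relationship", "BOUGHT")                                    # relationship type
    .option("relationship.save.strategy", "keys")                        # Relationship Save Strategy
    .option("relationship.properties", "quantity")                       # Relationship Properties
    .option("relationship.source.labels", ":Customer")                   # Source Label
    .option("relationship.source.save.mode", "Overwrite")                # Source Save Mode
    .option("relationship.source.node.keys", "customer_id:id")           # Source Node Keys
    .option("relationship.source.node.properties", "customer_name:name") # Source Node Properties
    .option("relationship.target.labels", ":Product")                    # Target Labels
    .option("relationship.target.save.mode", "Overwrite")                # Target Save Mode
    .option("relationship.target.node.keys", "product_id:id")            # Target Node Keys
    .option("relationship.target.node.properties", "product_name:name")  # Target Node Properties
    .mode("append")
    .save())
```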
Parameters common to Cypher Query, Nodes, and Relationships.
Batch Size
The number of rows sent to Neo4j in each batch. The default is 5000.
Save Mode
Save Mode specifies the expected behavior when saving data to the data sink.
Append: When persisting data, if the data/table already exists, the incoming records are appended to the existing data.
Overwrite: When persisting data, if the data/table already exists, the existing data is overwritten by the incoming records.
Output Fields
Fields in the message that need to be part of the output data can be selected from the drop-down list.
Priority
Option to define the execution order for the emitter.
Parameters that appear only for a streaming data source.
Enable Trigger
Trigger defines how frequently a streaming query should be executed.
Processing Time
This field appears only when the Enable Trigger checkbox is selected.
Processing Time is the trigger time interval, specified in minutes or seconds.
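For a streaming source, the trigger corresponds to Spark Structured Streaming's processing-time trigger. The sketch below assumes a streaming DataFrame (the built-in rate test source) written through the Neo4j Spark Connector; the checkpoint path, the save.mode option, the label, and the 30-second interval are illustrative placeholders.

```python
# Sketch of a streaming write with a processing-time trigger, assuming the
# Neo4j Spark Connector as the sink.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The built-in "rate" source emits (timestamp, value) rows continuously for testing.
stream_df = spark.readStream.format("rate").option("rowsPerSecond", 1).load()

query = (stream_df.writeStream
         .format("org.neo4j.spark.DataSource")
         .option("url", "neo4j://localhost:7687")               # placeholder connection details
         .option("checkpointLocation", "/tmp/neo4j-checkpoint") # placeholder checkpoint path
         .option("save.mode", "Append")                         # assumed streaming counterpart of Save Mode
         .option("labels", ":Event")                            # placeholder label
         .trigger(processingTime="30 seconds")                  # the Processing Time interval
         .start())

query.awaitTermination()
```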
Add Configuration: Use this option to add additional properties as key-value pairs.
Post Action
To understand how to provide SQL queries or Stored Procedures that will be executed during pipeline run, see Post-Actions →
Notes
Optionally, enter notes in the Notes → tab and save the configuration.