Custom Channel Data Source
The Custom Data Source lets you read data from any source by plugging in your own code.
You write the code that ingests the data, package it as a custom Data Source, and then use it in your own pipelines or share it with other workspace users.
How to Create a Custom Code Jar
Create a jar file of your custom code and upload it in a pipeline or as a registered component utility.
To write custom code for your custom Data Source, follow these steps:
Download the Sample Project (available on the home page of Data Pipeline).
Import the downloaded Sample Project into Eclipse as a Maven project. Ensure that Apache Maven is installed on your machine and that its bin directory is on the PATH.
Implement your custom code and build the project. To create a jar file of your code, use the following command:
mvn clean install -DskipTests
For a Custom Data Source, add your custom logic to the methods of one of the classes described below:
High-level abstraction
If you want a high-level abstraction that uses only Java code, extend BaseSource as shown in the SampleCustomDataSource class:
com.yourcompany.component.ss.DataSource.SampleCustomDataSource, which extends BaseSource
Methods to implement:
public void init(Map<String, Object> conf)
public List<String> receive()
public void cleanup()
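The three methods above can be illustrated with a minimal sketch. It is not taken from the Sample Project: the BaseSource declared here is a hypothetical stand-in so that the example compiles on its own, and the class name CounterSource and the configuration keys "prefix" and "batchSize" are invented for illustration. The sketch also assumes that init is called once, receive is called repeatedly, and cleanup is called at shutdown; check the Sample Project for the actual contract.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the BaseSource class in the Sample Project,
// declared here only so that this sketch compiles without the product jars.
abstract class BaseSource {
    public abstract void init(Map<String, Object> conf);
    public abstract List<String> receive();
    public abstract void cleanup();
}

// Illustrative source that returns a small batch of generated records per call.
class CounterSource extends BaseSource {
    private String prefix;
    private int batchSize;
    private long nextId;
    private boolean open;

    @Override
    public void init(Map<String, Object> conf) {
        // "prefix" and "batchSize" are made-up keys; fall back to defaults if absent.
        Object p = conf.get("prefix");
        prefix = (p == null) ? "record" : p.toString();
        Object b = conf.get("batchSize");
        batchSize = (b == null) ? 3 : Integer.parseInt(b.toString());
        nextId = 0;
        open = true;
    }

    @Override
    public List<String> receive() {
        List<String> batch = new ArrayList<>();
        if (!open) {
            return batch; // nothing to emit before init or after cleanup
        }
        for (int i = 0; i < batchSize; i++) {
            batch.add(prefix + "-" + nextId++);
        }
        return batch;
    }

    @Override
    public void cleanup() {
        // A real source would close connections or file handles here.
        open = false;
    }
}

class CustomSourceDemo {
    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("prefix", "evt");
        conf.put("batchSize", 2);

        BaseSource source = new CounterSource();
        source.init(conf);
        System.out.println(source.receive()); // [evt-0, evt-1]
        System.out.println(source.receive()); // [evt-2, evt-3]
        source.cleanup();
        System.out.println(source.receive()); // []
    }
}
```

In a real implementation, init would typically open the connection to the external system, receive would return the next batch of records as strings, and cleanup would release the connection.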
Low-level abstraction
If you want a low-level implementation that uses the Spark API, extend AbstractDataSource as shown in the SampleSparkSourceDataSource class:
com.yourcompany.component.ss.DataSource.SampleSparkSourceDataSource, which extends AbstractDataSource
Methods to implement:
public void init(Map<String, Object> conf)
public Dataset<Row> getDataset(SparkSession spark)
public void cleanup()
Configuring Custom Data Source
While uploading data, you can also upload a Dataset in the Custom Data Source.
Field | Description |
---|---|
Data Source Plugin | Fully qualified name of a custom code class. |
Upload the custom code jar using the Upload jar button from the pipeline designer page.
You can use this Custom Data Source in any pipeline.
- Click Done to save the configuration.