Data Sources Introduction

A data source is the location from which large volumes of batch or streaming data originate. This source data is transformed using processors and ML algorithms and is ultimately emitted to a wide range of data warehouses. Gathr provides a range of data sources for its ETL and analytics functions.

Typically, you start creating a pipeline by selecting a Data Source for reading data. A Data Source can also help you infer the schema based on the Schema Type, either by fetching it directly from the selected source or by uploading a sample data file.
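To make the idea of inferring a schema from a sample data file more concrete, here is a minimal sketch in plain Python. It is not Gathr's implementation and does not use any Gathr API; the function names `infer_type` and `infer_schema`, the CSV sample, and the three type names are assumptions chosen purely for illustration.

```python
import csv
import io

def infer_type(values):
    """Return the narrowest type name (hypothetical labels) that fits every non-empty value."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return "string"
    # Try the stricter type first, then fall back to looser ones.
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"

def infer_schema(sample_text):
    """Map each column of a small CSV sample to an inferred type name."""
    rows = list(csv.DictReader(io.StringIO(sample_text)))
    fields = list(rows[0].keys()) if rows else []
    return {f: infer_type([r[f] for r in rows]) for f in fields}

# A made-up sample file with three columns.
sample = "id,price,city\n1,9.5,Pune\n2,12,Delhi\n"
schema = infer_schema(sample)
print(schema)
```

The sketch only looks at the rows in the sample, so a column that appears numeric in the sample could still contain text in the full data set. A sample file should therefore be representative of the data the pipeline will read.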

Gathr Data Sources are built-in drag-and-drop operators. The incoming data can come from many kinds of systems, such as message queues, transactional databases, log files, and many more.

[Image: Data_Sources.png]

Gathr runs on the Spark computation system and supports two types of Data Source behavior:

  • Streaming Data Sources

  • Batch Data Sources
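The difference between the two behaviors can be sketched in a few lines of plain Python. This is only a conceptual illustration, not Gathr or Spark code: `read_batch`, `read_stream`, and the `events` list are hypothetical names, and the fixed chunk size is an assumption standing in for however a real streaming engine groups incoming records.

```python
from itertools import islice

def read_batch(records):
    """Batch behavior: a bounded data set is read as a whole."""
    return list(records)

def read_stream(records, batch_size):
    """Streaming behavior: records are consumed incrementally, in small chunks."""
    it = iter(records)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk

# Hypothetical input; a real streaming source would typically be unbounded.
events = ["e1", "e2", "e3", "e4", "e5"]

batch = read_batch(events)
micro_batches = list(read_stream(events, 2))

print(batch)
print(micro_batches)
```

A batch source hands the pipeline one complete result, while a streaming source keeps yielding new chunks for as long as data arrives, so downstream steps must be able to process each chunk on its own.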

The data sources supported by Gathr are listed as follows:
