Azure Blob Batch Source

The Azure Blob Batch channel reads data in batch from an Azure Blob container. Supported formats are JSON, CSV, ORC, and Parquet. It can emit data into any emitter.

Configuring an Azure Blob Data Source

To add an Azure Blob Data Source to your pipeline, drag the Data Source onto the canvas and right-click it to configure.

Schema Type

See the topic Provide Schema for ETL Source to learn how schema details can be provided for data sources.

After providing schema type details, the next step is to configure the data source.

Data Source Configuration

Configure the data source parameters as explained below.

Connection Name: Select the connection from which you want to read the data, from the list of available connections.

Container: The name of the container in Azure Blob.

Path: The path to read from. End the path with * when it points to a directory.

For example: outdir.* For an absolute path: outdir/filename

File Filter: Provide a file pattern, for example *csv or *json, to retrieve only the matching files.

Recursive File Lookup: Select this option to retrieve files from the current folder and its sub-folders.

Add Configuration: Add additional custom properties as key-value pairs.
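The way Path, File Filter, and Recursive File Lookup combine can be illustrated with a small Python sketch. It is only a model of the idea, not the product's implementation: the function select_blobs, the blob names, and the assumption that the filter is a glob pattern applied to the file name are all hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def select_blobs(blob_names, directory, file_filter, recursive):
    """Return the blob names under `directory` whose file name matches `file_filter`.

    Hypothetical helper: the actual matching rules of the data source are
    assumed here, not taken from the product.
    """
    prefix = directory.rstrip("/") + "/"
    selected = []
    for name in blob_names:
        if not name.startswith(prefix):
            continue  # blob lies outside the configured directory
        relative = name[len(prefix):]
        if "/" in relative and not recursive:
            continue  # blob is in a sub-folder and recursive lookup is off
        if fnmatchcase(basename(name), file_filter):
            selected.append(name)
    return selected


# Made-up blob names, for illustration only.
blobs = [
    "outdir/a.csv",
    "outdir/b.json",
    "outdir/2024/c.csv",
    "other/d.csv",
]

print(select_blobs(blobs, "outdir", "*csv", recursive=False))
print(select_blobs(blobs, "outdir", "*csv", recursive=True))
```

In this model, turning recursive lookup off restricts the result to files directly inside the directory, while turning it on also includes files in nested folders that match the same pattern.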

Pre Action

To understand how to provide SQL queries or stored procedures that are executed during the pipeline run, see Pre-Actions.
