Azure SQL ETL Source

The Azure SQL data source reads objects from Azure SQL and SQL Server databases.

Schema Type

To learn how to provide schema details for data sources, see Provide Schema for ETL Source →

After providing schema type details, the next step is to configure the data source.

Data Source Configuration

Connection Name: Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Azure SQL. Otherwise, create one as explained in Azure SQL Connection →

Schema Name: Name of the source schema whose tables will be listed.

Table Name: Name of the source table whose metadata you want to view.

Query: Hive-compatible SQL query to be executed in the component.

Design Time Query: Query used to fetch a limited number of records during application design. It is used only during schema detection and inspection.

Enable Query Partitioning: Select this check-box to partition the table so that its data is read in parallel. It is disabled by default.

If Enable Query Partitioning is selected, the following additional fields are displayed:

No. of Partitions: Specifies the number of parallel threads to be invoked to partition the table while reading the data.

Partition on Column: Column used to partition the data. It must be a numeric column, on which Spark performs partitioning to read the data in parallel.

Lower Bound: Lower bound value for the partitioning column. Together with the upper bound, it is used to decide the partition boundaries, so that the entire dataset is distributed into multiple chunks based on the column values.

Upper Bound: Upper bound value for the partitioning column. Together with the lower bound, it is used to decide the partition boundaries, so that the entire dataset is distributed into multiple chunks based on the column values.
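
The way the four partitioning fields interact can be illustrated with a minimal sketch. It assumes the component passes these values to Spark's JDBC reader, which splits the numeric range into equal strides and issues one range query per partition; the function name partition_predicates and the column order_id are hypothetical and used only for illustration. In Spark's JDBC reader the bounds only set the stride and do not filter rows: values below the lower bound (and NULLs) fall into the first partition, and values above the upper bound fall into the last one.

```python
from typing import List


def partition_predicates(column: str, lower_bound: int, upper_bound: int,
                         num_partitions: int) -> List[str]:
    """Return one WHERE-clause predicate per partition (simplified sketch)."""
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")

    # Never create more partitions than there are integer values in the range.
    num_partitions = min(num_partitions, upper_bound - lower_bound)
    if num_partitions == 1:
        return ["1 = 1"]  # single partition: no range filter is needed

    # Width of each chunk of the partitioning column (approximation).
    stride = (upper_bound - lower_bound) // num_partitions

    predicates = []
    boundary = lower_bound
    for i in range(num_partitions):
        lower = f"{column} >= {boundary}" if i > 0 else None
        boundary += stride
        upper = f"{column} < {boundary}" if i < num_partitions - 1 else None
        if lower is None:
            # First partition is open-ended below and also collects NULLs.
            predicates.append(f"{upper} OR {column} IS NULL")
        elif upper is None:
            # Last partition is open-ended above.
            predicates.append(lower)
        else:
            predicates.append(f"{lower} AND {upper}")
    return predicates


if __name__ == "__main__":
    for predicate in partition_predicates("order_id", 0, 100, 4):
        print(predicate)
```

Because the chunks have equal width, the partitions carry similar row counts only when the column values are spread fairly evenly between the lower and upper bound. Choosing bounds close to the actual minimum and maximum of the column keeps the first and last partitions from becoming much larger than the others.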

If Enable Query Partitioning is disabled, configure the following field:

Fetch Size: The fetch size determines the number of rows to be fetched per round trip. The default value is 1000.
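
As a rough illustration of what the fetch size means for a read, the sketch below estimates the number of round trips needed for a given row count. The helper name estimated_round_trips is hypothetical, and the figure is only an estimate: it assumes every round trip returns exactly the fetch size and ignores any driver-specific buffering.

```python
def estimated_round_trips(total_rows: int, fetch_size: int = 1000) -> int:
    """Estimate round trips needed to read total_rows (ceiling division)."""
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    if fetch_size <= 0:
        raise ValueError("fetch_size must be a positive integer")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (total_rows + fetch_size - 1) // fetch_size


if __name__ == "__main__":
    print(estimated_round_trips(250_000))          # default fetch size of 1000
    print(estimated_round_trips(250_000, 10_000))  # larger fetch size
```

A larger fetch size reduces the number of round trips but requires more memory to hold each batch of rows, so the value is a trade-off rather than a setting to maximize.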

Detect Schema

Check the populated schema details. For more details, see Schema Preview →

Incremental Read

Optionally, you can enable incremental read. For more details, see Azure SQL Incremental Configuration →

Pre Action

To learn how to provide SQL queries or stored procedures that are executed during the pipeline run, see Pre-Actions →

Notes

Optionally, enter notes in the Notes → tab and save the configuration.
