ADLS Data Asset Source
Create a Data Asset Through ADLS
To create a data asset through ADLS Source, configure the following parameters:
Connection Name
Connections are the service identifiers.
Select a connection name from the list if you have previously created and saved connection details for ADLS.
Otherwise, create one as explained in the ADLS Connection topic.
Container
Select the container from the provided list. The list shows all containers available in the selected connection.
ADLS Directory Path
Provide the directory path of the ADLS file system.
File Filter
Provide a file pattern, for example *csv or *json, to retrieve the matching files.
Recursive File Lookup
Check this option to retrieve files from the current folder and its sub-folders.
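To illustrate how the directory path, file filter, and recursive lookup could interact, here is a minimal Python sketch. It assumes the filter behaves like a shell-style wildcard applied to the file name, which this page does not confirm. The select_files function and the sample listing are hypothetical and not part of Gathr.

```python
from fnmatch import fnmatchcase
from posixpath import basename, dirname


def select_files(paths, directory, pattern, recursive=False):
    """Return the paths under `directory` whose file name matches `pattern`.

    Assumption: the pattern is a shell-style wildcard such as *csv.
    """
    directory = directory.rstrip("/")
    selected = []
    for path in paths:
        parent = dirname(path)
        in_current = parent == directory
        in_subfolder = parent.startswith(directory + "/")
        # Without recursive lookup, only files directly in the folder qualify.
        if not (in_current or (recursive and in_subfolder)):
            continue
        if fnmatchcase(basename(path), pattern):
            selected.append(path)
    return selected


# Hypothetical listing of files in a container.
listing = [
    "sales/2024/jan.csv",
    "sales/2024/feb.json",
    "sales/2024/archive/dec.csv",
    "sales/readme.txt",
]

flat = select_files(listing, "sales/2024", "*csv")
deep = select_files(listing, "sales/2024", "*csv", recursive=True)
```

In this sketch, flat contains only the CSV file directly in sales/2024, while deep also picks up the CSV file in the archive sub-folder.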
Data Format
Choose the format of your source data. If the CSV data format is selected, a delimiter must also be specified.
Source File Contains Headers
Enable or disable reading the first row of a CSV file as the header. Disabled by default.
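The effect of the delimiter and header settings can be sketched with Python's standard csv module. This is only an illustration of the idea, not Gathr's implementation. The read_csv_sample function and the generated col_0, col_1 names are assumptions, since this page does not say how columns are named when the header option is disabled.

```python
import csv
import io


def read_csv_sample(text, delimiter=",", has_header=False):
    """Parse CSV text and return (column_names, data_rows)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return [], []
    if has_header:
        # The first row supplies the column names and is not treated as data.
        return rows[0], rows[1:]
    # Hypothetical placeholder names; every row is treated as data.
    columns = [f"col_{i}" for i in range(len(rows[0]))]
    return columns, rows


sample = "id|name\n1|Ann\n2|Bob\n"

with_header = read_csv_sample(sample, delimiter="|", has_header=True)
without_header = read_csv_sample(sample, delimiter="|")
```

With the header option enabled, the sketch returns two data rows under the names id and name. With it disabled, the same first line is kept as a third data row.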
Maximum Records to Fetch
Specify the maximum number of sample records you wish to keep in the data asset.
A manageable subset of data is useful for testing and design, and it keeps resource usage low during application development.
Sampling Method
This option offers flexibility in how you retrieve sample data.
The following methods are available:
Top N: Retrieve the first N records from the data source, where N is the specified maximum number of records. This is particularly useful when you want to analyze or design with a specific set of initial records.
Random Sample: Fetch a random subset of records from your sample data, ensuring a diverse representation. This approach is valuable when you require a more comprehensive assessment of your data’s characteristics.
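The difference between the two sampling methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the top_n and random_sample functions are hypothetical, and sampling without replacement is an assumption rather than documented Gathr behavior.

```python
import random
from itertools import islice


def top_n(records, n):
    """Return the first n records in source order."""
    return list(islice(records, n))


def random_sample(records, n, seed=None):
    """Return up to n records chosen at random, without replacement."""
    rng = random.Random(seed)
    pool = list(records)
    # Never ask for more records than the source holds.
    return rng.sample(pool, min(n, len(pool)))


records = range(100)

first_five = top_n(records, 5)
random_five = random_sample(records, 5, seed=1)
```

Top N always returns the same leading records, while the random sample draws from across the whole source, so it is more likely to surface values that only appear later in the data.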