Elasticsearch Data Asset Source

Create a Data Asset Through Elasticsearch

To create a data asset from an Elasticsearch source, configure the following parameters:

Connection Name

Connections are the service identifiers.

Select a connection name from the list if you have previously created and saved connection details for Elasticsearch.

Otherwise, create one as explained in the Elasticsearch Connection topic.


Entity

Select the entity (object or model) from the provided list. The query loads the dataset from the source using the selected entity.


Fields

Select columns or provide a custom SQL query.

Select Fields: Select the names of the columns to be queried.

Custom Query: Provide a SQL query specifying the read conditions.
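
As a rough illustration of the two modes, the sketch below contrasts a query built from a list of selected columns with a hand-written custom query. The entity name `orders`, the column names, and the `build_select` helper are hypothetical and exist only for this example; the query the platform actually generates for selected fields may look different.

```python
def build_select(entity, fields):
    """Build a simple SELECT over the chosen columns (hypothetical helper)."""
    if not fields:
        raise ValueError("select at least one field")
    return f"SELECT {', '.join(fields)} FROM {entity}"


# Select Fields mode: only the column names are chosen.
selected_query = build_select("orders", ["order_id", "status", "total"])

# Custom Query mode: the read conditions are written by hand.
custom_query = "SELECT order_id, total FROM orders WHERE status = 'shipped'"

print(selected_query)
print(custom_query)
```

Selecting fields only narrows the columns that are read, while a custom query can also restrict which rows are read.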


Maximum Records to Fetch

Specify the maximum number of sample records you wish to keep in the data asset.

This gives you a manageable subset of data for testing and design, which keeps application development efficient and limits resource usage.


Sampling Method

This option offers flexibility in how you retrieve sample data.

The following methods are available:

  • Top N: Retrieve the first N records from the data source, where N is the specified maximum number of records. This is useful when you want to analyze or design with a specific set of initial records.

  • Random Sample: Fetch a random subset of records from your sample data, ensuring a diverse representation. This approach is valuable when you require a more comprehensive assessment of your data’s characteristics.
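
The difference between the two sampling methods can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the function names are hypothetical, the `source` range stands in for records read from Elasticsearch, and reservoir sampling is used here only as one common way to draw a uniform random sample from a stream.

```python
import random
from itertools import islice


def top_n(records, n):
    """Return the first n records in source order."""
    return list(islice(records, n))


def random_sample(records, n, seed=None):
    """Return up to n records chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < n:
            # Fill the reservoir with the first n records.
            reservoir.append(record)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir


source = range(1, 1001)  # stand-in for records read from the source
first_five = top_n(source, 5)
sampled_five = random_sample(source, 5, seed=42)

print(first_five)    # [1, 2, 3, 4, 5]
print(sampled_five)  # five distinct values drawn from 1..1000
```

Top N always returns the same leading records, whereas the random sample draws from the whole range and so is less affected by how the source happens to be ordered.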

Advanced Configuration

Fetch Size

The number of rows retrieved from the source and kept in memory in a single batch.

The fetch size can significantly affect pipeline performance and memory usage: a larger value generally means fewer round trips to the source but more rows held in memory at once.
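
The trade-off behind the fetch size can be sketched as follows. This is a simplified model under stated assumptions: `fetch_in_batches` is a hypothetical helper, the `range` stands in for rows returned by the source, and each yielded batch represents one round trip.

```python
from itertools import islice


def fetch_in_batches(rows, fetch_size):
    """Yield lists of at most fetch_size rows, one batch at a time."""
    if fetch_size < 1:
        raise ValueError("fetch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, fetch_size))
        if not batch:
            break
        yield batch


total_rows = 10_000
for fetch_size in (100, 1_000, 5_000):
    batch_count = 0
    largest_batch = 0
    for batch in fetch_in_batches(range(total_rows), fetch_size):
        batch_count += 1
        largest_batch = max(largest_batch, len(batch))
    # More rows per batch means fewer batches but more rows in memory at once.
    print(fetch_size, batch_count, largest_batch)
```

In this model, a fetch size of 100 needs 100 batches for 10,000 rows, while a fetch size of 5,000 needs only 2 but holds 5,000 rows in memory at a time.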
