Data Validation Introduction

When data is moved from one source to another, it is often unclear whether all of it has arrived. During data movement we might introduce unintended business logic or encounter errors that result in only part of the data being moved. In these kinds of scenarios, even though the process completes, there might still be discrepancies between the source and target stores.

Data Validation helps users compare two data stores and understand whether there is any mismatch between the expected and actual records in the target destination.

Thus, as part of an ETL solution, Data Validation solves the problem of comparing two data sources. In our application, these data sources are termed entities. The user can configure validations and execute them to see the comparative results.
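
The kind of comparison described above can be illustrated with a minimal sketch. This is not Gathr's implementation; the function name `compare_entities`, the record layout, and the `id` key column are hypothetical and chosen only for illustration. It assumes both entities fit in memory as lists of records that share a unique key.

```python
def compare_entities(source_rows, target_rows, key):
    """Compare two lists of dict records using a shared unique key column.

    Hypothetical helper for illustration only, not a Gathr API.
    """
    source_by_key = {row[key]: row for row in source_rows}
    target_by_key = {row[key]: row for row in target_rows}

    # Keys present in the source but never written to the target.
    missing_in_target = sorted(source_by_key.keys() - target_by_key.keys())
    # Keys present in the target that the source does not contain.
    unexpected_in_target = sorted(target_by_key.keys() - source_by_key.keys())
    # Keys present on both sides whose records differ.
    mismatched = sorted(
        k for k in source_by_key.keys() & target_by_key.keys()
        if source_by_key[k] != target_by_key[k]
    )

    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "missing_in_target": missing_in_target,
        "unexpected_in_target": unexpected_in_target,
        "mismatched": mismatched,
        "passed": not (missing_in_target or unexpected_in_target or mismatched),
    }


# Example data (made up): record 3 was not moved, record 2 was altered.
source = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
]
target = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 25},
]

report = compare_entities(source, target, key="id")
print(report)
```

In this made-up example the report flags record 3 as missing from the target and record 2 as mismatched, so the validation does not pass even though the data movement itself completed.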

Note:

The user can create multiple validations and group them into a single job.

As part of its ETL solutions, Gathr now lets the user create Data Validation jobs. The user can view a job on the job listing page and run it to get a comprehensive comparative report.

The functionality is described in detail below:

Data Validation page

The user can create data validations and view existing ones on the Data Validation page of Gathr. This page provides the following options:

Field: Description

Name: Shows the data validation job names. The user can expand the collapsible icon to view the number of validations created, their individual status, and the entity name(s), type, and entity source configuration of each validation.

Modified Date: The last date and time of job modification.

Run Status (Running NA): The run status of the job: NA, starting, running, completed, partially completed, stopped, or error.

Note: If at least one validation in a job completes successfully but the remaining validations end in error, the run status is shown as partially completed.

Validation Status (NA): Shows the passed and failed validation status of the job. Green symbolizes pass and orange symbolizes fail.

Description: The details of the job validations.

Action: Within Action, the user can play or stop the job, refresh the run status, view the cluster configuration, view results, edit the job, view the run history, configure the job (Configure Job), and delete a validation job.

To know more about the Configure Job option, see the Configure Job field in the table Actions on Pipeline.

Note: History shows all the run history records, including the status of every validation.
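
The partially completed rule in the Run Status note can be sketched as a small function. This is only an illustration, not Gathr's code: the function name `job_run_status` and the status strings are hypothetical, and the sketch assumes every validation in the job has already finished as either "completed" or "error". Treating an empty job as "NA" and an all-error job as "error" are assumptions that the note does not state.

```python
def job_run_status(validation_statuses):
    """Derive a job-level run status from finished validation statuses.

    Hypothetical helper for illustration only, not a Gathr API.
    """
    statuses = list(validation_statuses)

    # This sketch handles finished validations only.
    unknown = set(statuses) - {"completed", "error"}
    if unknown:
        raise ValueError(f"unsupported statuses: {sorted(unknown)}")

    if not statuses:
        return "NA"  # assumption: a job with no validations has no run status

    completed = statuses.count("completed")
    if completed == len(statuses):
        return "completed"
    if completed >= 1:
        # At least one validation succeeded and the rest ended in error.
        return "partially completed"
    return "error"  # assumption: every validation ended in error


print(job_run_status(["completed", "error", "error"]))
print(job_run_status(["completed", "completed"]))
```

The first call corresponds to the case in the note: one validation completed and the others ended in error, so the job is reported as partially completed.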
