Continuous Integration Continuous Delivery

Gathr supports continuous integration and continuous delivery of pipelines: integration through test suites, and delivery by deploying pipelines to a destination environment.

Continuous Integration

Continuous integration is done by executing test cases (as part of a test suite) on the incremental state of a pipeline, which ensures that the pipeline is not impacted by new changes.

A Test Suite is a collection of Test Cases associated with a pipeline.

You may want to create test cases to verify multiple permutations of continuous data, so that the pipeline can be deployed as soon as it passes the tests.

How to create a test case?

A test case comprises a set of source data files (one data file for each Data Source) and the inspection outputs of each component in the pipeline. The inspection output of each component can be thought of as the expected data for that component.

Every test case has its source data associated with each Data Source. When a test suite is executed, all its associated test cases are executed too. A test suite run is uniquely identified by a run ID. When a test case is executed as part of a run, each component is inspected again, and the resulting output data is referred to as the actual data.

How is a test case evaluated to pass or fail?

A test case passes if both of the conditions below are fulfilled:

  1. The number of records is the same for the actual and expected data.

  2. Every field value of each record in the actual data matches the expected data.
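The two conditions above can be sketched in a few lines of Python. This is only an illustration, not Gathr's implementation: the function name `evaluate_test_case`, the representation of records as dictionaries, and the assumption that records are compared position by position are all hypothetical.

```python
from typing import Any

Record = dict[str, Any]


def evaluate_test_case(expected: list[Record], actual: list[Record]) -> bool:
    """Return True when the actual data satisfies both pass conditions."""
    # Condition 1: the number of records must be the same.
    if len(expected) != len(actual):
        return False
    # Condition 2: every field value of each record must match.
    # Assumption: records are compared in order, position by position.
    return all(exp == act for exp, act in zip(expected, actual))


expected_data = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
same_data = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
short_data = [{"id": 1, "name": "Ann"}]

print(evaluate_test_case(expected_data, same_data))   # True
print(evaluate_test_case(expected_data, short_data))  # False
```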

Let’s take an example to understand this.

Your source data may contain integers and some blank values, and you may want to run tests to make sure that invalid or null values are omitted. Test cases can be created with different source data files to cover each of these situations.

To create a test case, you need to configure, save and inspect the pipeline.

Create Test Case

Once the pipeline is saved and inspected, a test case can be created. Go to the menu on the data pipeline canvas, as shown below:

testsuite

When you click on Test Suite, it opens a New Test Case window. Click the CREATE button.

createtestcase

To create a new test case, enter details in the New Test Case window fields, as shown below:

Property: Description

Details

Details about the test case:

Name: Name of the test case. Test case names must be unique within a pipeline.

Description: A description of the test case (optional).

Do Not Compare

Select the component fields (columns), or the components themselves, that should not be compared and tested during inspection. The test case execution skips data comparison for these fields and components. For each component that the test case runs on, the output file (schema) of the component is saved, and each column is either compared or not compared according to this selection.

For example: the test case checks employee entries, where a Random User ID field is generated every time an action happens. This random UID is different on every run, so comparing it would fail the test.

In such cases, use Do Not Compare to exclude either the output field or the component itself.

deatilstest

donotcompare
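As a rough illustration of what Do Not Compare achieves, the sketch below compares two record sets while skipping a set of excluded fields. The function names and the dictionary representation of records are assumptions made for this example; they do not describe how Gathr stores or compares inspection data.

```python
from typing import Any

Record = dict[str, Any]


def strip_fields(record: Record, ignored: set[str]) -> Record:
    """Return a copy of the record without the fields excluded from comparison."""
    return {key: value for key, value in record.items() if key not in ignored}


def matches_ignoring(expected: list[Record], actual: list[Record], ignored: set[str]) -> bool:
    """Compare record counts and field values, skipping the ignored fields."""
    if len(expected) != len(actual):
        return False
    return all(
        strip_fields(exp, ignored) == strip_fields(act, ignored)
        for exp, act in zip(expected, actual)
    )


# The uid field is regenerated on every run, so it never matches.
expected_rows = [{"employee": "Ann", "uid": "a1"}]
actual_rows = [{"employee": "Ann", "uid": "z9"}]

print(matches_ignoring(expected_rows, actual_rows, ignored=set()))    # False
print(matches_ignoring(expected_rows, actual_rows, ignored={"uid"}))  # True
```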

Click on Create, and the success notification message will be shown, with a new entry in the table, as shown below:

savedtestcase

Test Suite Properties

Every test case is listed in tabular form under the components of the Data pipeline canvas.

This table has two tabs (shown above), Definition and Run History, with the properties defined below.

Definition

The Definition tab shows all the details and enables you to create a test case, execute it, run multiple test cases at the same time, and perform the other actions defined below:

Property: Description

Status

Status signifies the last execution status of a test case. To view the last run status, hover the mouse over the status icon.

This shows the status of the test case with a RUN ID and the duration of execution.

Possible values are ‘never executed’, ‘In progress’, ‘Success’, ‘Failure’, and ‘Error’.

While test cases are running, In Progress is denoted by a yellow status icon. On completion, Success is indicated by a green status icon, and Failure/Error by a red exclamation mark.

Test Case Name: Name of the test case.
Description: A description of the test case. This is an optional property.
Source File(s): The source file used for test creation.
Coverage: The percentage of components covered under the test case configuration. (Shown below the table with an image and more detail.)
Actions: The actions that can be performed on the test case. (Detailed below.)
Refresh Table: Refreshes the test case list.
Run selected test cases: Once all the test cases are created, you can run the selected ones, or all of them, with just one click.
Create new: Creates a new test case.
Update existing: Edit the selected test case and reconfigure its fields. The same can be done by using Configure under Actions.
Search: Search for any test case using the search column.

In the Coverage column, hover to see the complete configuration of the test case.

Coverage
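The source describes Coverage only as the percentage of covered components, so the exact formula Gathr uses is not stated. One plausible reading, sketched below with a hypothetical `coverage_percent` function and made-up component names, is the share of pipeline components that the test case compares.

```python
def coverage_percent(pipeline_components: list[str], covered_components: list[str]) -> float:
    """Percentage of pipeline components covered by a test case (assumed formula)."""
    all_components = set(pipeline_components)
    if not all_components:
        return 0.0
    # Only count covered components that actually exist in the pipeline.
    covered = all_components & set(covered_components)
    return 100.0 * len(covered) / len(all_components)


# Hypothetical pipeline with four components, one of which is excluded
# from comparison (for example through Do Not Compare).
pipeline = ["source", "filter", "enrich", "emitter"]
covered = ["source", "filter", "emitter"]

print(coverage_percent(pipeline, covered))  # 75.0
```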

Actions on test case

actions

Property: Description
Configure: Edit the configuration by clicking the Configure button of the test case.
Load Data: Loads the data of the pipeline’s Data Source(s) and runs inspection. Any inconsistency caused by changes in the existing pipeline design is shown by an alert icon in the Status column. You need to Load Data again and update the existing test case to sync it with the latest configuration of the pipeline.
Download

Download the uploaded sample data file(s) by clicking the Download icon.

Click the test case Download button, and a new window opens with every component’s output data, which can be downloaded.

Delete: Delete any test case using the Delete button.

Run History

You can view the Run History of the test cases. The table shows the run statistics along with the test case outputs, and lets you view the reason for any failed test cases.

runhistory

Stale Test Case

Multiple test cases can be configured and re-configured from the same window. There may be cases where you have re-configured the pipeline and a few components have been removed.

Removing a component from an existing pipeline and then creating a new test case may lead to inconsistencies in the test cases created earlier.

For such Stale Test Cases, the inconsistencies are shown as a warning or error under the Status of the test case.

Click Load Data again and update the existing test case to sync it with the latest configuration of the pipeline.
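Conceptually, a test case becomes stale when it still holds expected data for components that are no longer in the pipeline. The sketch below illustrates that idea with a simple set difference; the function names and component names are hypothetical, and Gathr's actual staleness check is not documented here.

```python
def removed_components(test_case_components: set[str], pipeline_components: set[str]) -> set[str]:
    """Components the test case still references but the pipeline no longer has."""
    return test_case_components - pipeline_components


def is_stale(test_case_components: set[str], pipeline_components: set[str]) -> bool:
    """Assumed rule: a test case is stale if it references any removed component."""
    return bool(removed_components(test_case_components, pipeline_components))


# Hypothetical example: the "enrich" component was removed after the
# test case was created.
captured = {"source", "filter", "enrich", "emitter"}
current = {"source", "filter", "emitter"}

print(sorted(removed_components(captured, current)))  # ['enrich']
print(is_stale(captured, current))                    # True
```

In this reading, loading data again and updating the test case corresponds to recapturing the expected data so that it references only the current components.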

Each entry corresponds to a single Test Suite run. A single test suite run may contain one or more test cases.

Download Test Results

Complete test suite results can be downloaded via Download Test Results, under Results < Run History. The Test Report is an HTML file with the run ID and test run details (shown below):

results

testsuiterunreport1

An individual test case result can be downloaded via Download Output Results, corresponding to Duplicate Ids. The output result is in JSON file format.

Updated Test Cases

To update the source file or the expected result of a test case, use Update existing. Updating existing test case data overwrites the previously captured inspection data with new inspection data. You can also update the Do Not Compare section while editing the existing pipeline.

updateexisting

Version Control

Version control allows you to create a version of the Test Suite. Gathr supports two types of version control systems:

  • GIT

  • Gathr Metastore

Before understanding how version control works, we need to understand what a Working Copy is.

A Working Copy is the current version of the pipeline, that is, the first (0) version.

Pipeline editing is always performed on the Working Copy.

To create a version:

The first way is to re-save an edited pipeline; you then get an option to Create Version.

createdefinition

The second way is to edit and save the pipeline; the options for saving the pipeline include the Create Version and Switch Version buttons.

switchandcreteversion

The third way to create a version is from the pipeline tile.

committogit-ActionsonPipeline

For the second and third options, the window shown below opens.

createversion

Enter a description or a comment for the version and click Create. Once you save the version, the last version of the pipeline changes from ‘Working Copy’ to Version 1.

A pipeline remains a working copy as long as no other version is created. Once you save a version, a notification of the next version that will be created appears next to the title ‘Create Subsystem version’.

When a new superuser logs in, they must configure the Version Control properties under Version Control Setup.

Switch Version

To switch the version of a pipeline, click Switch Version on the pipeline editor page and choose a version. The pipeline changes to match the selected version.

It is the Working Copy that is loaded to a newer version. Editing is always performed on the Working Copy of the pipeline.

Download Version

Download Version, available from the pipeline tile, allows you to download a version of the pipeline.

versionList

Click on the Download arrow to download the pipeline.

Property: Description
Version: Version number of the pipeline.
Commit Message: Commit message description.
Artifacts: The artifact uploaded.
Commit Time: The time at which the pipeline was committed as a version.
Download: Download the pipeline.

Continuous Delivery

While you continuously integrate data in the pipelines, you can also continuously deliver those pipelines.

Continuous delivery requires configuration of three environments:

  • Source Environment

  • Test Environment

  • Destination Environment

The Source environment is where the pipeline is incrementally developed.

The CD scripts download the pipeline from the Source environment and upload it to the Test environment.

On the Test environment, an inspect session is created based on user selection, and the test suite for the pipeline is then executed.

If the test suite passes, the pipeline is promoted to the Destination environment.
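The flow across the three environments can be summarized in a short Python sketch. It is not the StartCD.sh script: the function `run_continuous_delivery` and the four callables it takes are hypothetical stand-ins for the download, upload, test, and promote steps, and the behaviour when test execution is switched off (promote without testing) is an assumption.

```python
from typing import Callable


def run_continuous_delivery(
    download_from_source: Callable[[], bytes],
    upload_to_test: Callable[[bytes], None],
    run_test_suite: Callable[[], bool],
    promote_to_destination: Callable[[bytes], None],
    execute_test_suite: bool = True,
) -> bool:
    """Return True if the pipeline was promoted to the destination."""
    # Step 1: download the pipeline from the Source environment.
    pipeline = download_from_source()
    # Step 2: upload it to the Test environment.
    upload_to_test(pipeline)
    # Step 3: run the test suite there; stop if it fails.
    if execute_test_suite and not run_test_suite():
        return False
    # Step 4: promote the pipeline to the Destination environment.
    promote_to_destination(pipeline)
    return True


# Stub environment calls, used only to exercise the flow.
promoted: list[bytes] = []
result = run_continuous_delivery(
    download_from_source=lambda: b"pipeline-export",
    upload_to_test=lambda pipeline: None,
    run_test_suite=lambda: True,
    promote_to_destination=promoted.append,
)
print(result, len(promoted))  # True 1
```

Passing the steps in as callables keeps the ordering logic separate from any environment-specific details, which the source does not describe.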

Prerequisites

Prerequisites for CD script

  1. Installation of jq

  2. SMTP server configured

    Open the folder <gathr_home>/bin/CD_script and run StartCD.sh to run the pipeline on the destination environment. Integration is done by creating test suites, and delivery is performed by running StartCD.sh on the Gathr metastore.

    Configure the following properties to deliver the test cases on the environment:

    CD1

    Property: Description
    sax.testserver.user.token: The token of the workspace where the pipeline is uploaded.
    sax.testserver.url: Gathr URL for the test environment.
    sax.testserver.testsuite.execute: Whether test suite execution is required. Possible values are 1 or 0.
    sax.testserver.inspect.type: Inspect session to be used for test suite execution. Possible value: local.
    sax.destination.user.token: Gathr user token for the destination. It can be obtained by logging in as a workspace user and going to Manage user > Edit User.
    sax.destination.url: Gathr URL for the destination.
    sax.source.pipeline.name: Name of the pipeline to promote.
    sax.source.pipelline.version: Version of the pipeline to promote.
    sax.promotion.skiponfailure: Specify if pipeline promotion should be stopped on test suite failure.
    sax.promotion.emailadress.list: Space-separated list of email IDs to which the report will be sent.
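Before running the CD script, it can help to confirm that every property listed above is set. The sketch below assumes the properties are kept as key=value lines, which the source does not confirm; the sample values are placeholders, and the helper names `parse_properties` and `missing_keys` are hypothetical. The property keys are copied verbatim from the table, including their spelling.

```python
# Property keys exactly as listed in the table above.
REQUIRED_KEYS = [
    "sax.testserver.user.token",
    "sax.testserver.url",
    "sax.testserver.testsuite.execute",
    "sax.testserver.inspect.type",
    "sax.destination.user.token",
    "sax.destination.url",
    "sax.source.pipeline.name",
    "sax.source.pipelline.version",
    "sax.promotion.skiponfailure",
    "sax.promotion.emailadress.list",
]

# Placeholder values only; replace them with values for your environments.
SAMPLE = """
sax.testserver.user.token=<test-workspace-token>
sax.testserver.url=<test-environment-url>
sax.testserver.testsuite.execute=1
sax.testserver.inspect.type=local
sax.destination.user.token=<destination-user-token>
sax.destination.url=<destination-url>
sax.source.pipeline.name=<pipeline-name>
sax.source.pipelline.version=<pipeline-version>
sax.promotion.skiponfailure=<value>
sax.promotion.emailadress.list=first@example.com second@example.com
"""


def parse_properties(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blank lines and # comments."""
    props: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


props = parse_properties(SAMPLE)
print(missing_keys(props))                              # []
print(props["sax.promotion.emailadress.list"].split())  # two addresses
```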