Workflow Configuration

Workflow Definition

Once a workflow is defined, provide a workflow name and click the CREATE button to create the workflow.

Once a workflow is created, it is visible on the Workflows listing page.

A Workflow tile has the following fields:

Subworkflow: If a workflow contains subworkflows, the icon shows the subworkflow count.

This icon is also called the parent-child marker: a workflow that has zero subworkflows is a child workflow.

Pipeline: If a pipeline is used in the workflow, the icon shows the number of pipelines used.

Last Modified: The duration since the workflow was last modified.

Next Schedule: The next trigger time for the workflow, displayed only if the workflow is scheduled.

Trigger: Once a workflow is created, deploy the workflow to Airflow by clicking on the Trigger button.

View Summary: Option to view the workflow’s run history.

Schedule: Option to schedule the workflow.

Refresh: Option to refresh the workflow.

Edit: Option to edit the workflow.

Delete: Option to delete the workflow and send it to Gathr’s Recycle Bin.

Once you have defined and created a workflow, you can perform the following operations on it:

Edit Workflow

You can edit a Workflow from the Workflows listing page.

A workflow cannot be edited when it is active.

Make sure to update and save the workflow from the canvas once editing is complete.

Subworkflow

A workflow can contain another workflow. This enables the parent-child marker on the workflows.

A parent workflow is shown with a Parent Workflow marker icon, and a child workflow is shown with a Child Workflow marker icon.

Every workflow on the listing page shows a count of its pipelines and subworkflows.

Schedule Workflow

Once a workflow is created, deploy it by clicking the SCHEDULE button. Set your workflow scheduling configuration as defined below:

Field: Description
Start Date: Date from which the workflow starts.
End Date: Date on which the workflow ends (optional).
Time Zone: Time zone in which the workflow should be scheduled.
Frequency: Runs the workflow after this interval. You can select a predefined interval. If you select the None option for the schedule interval, you can trigger the workflow manually.
  Every: A value in hour(s), week(s), or month(s) can be provided, according to the selected frequency.
  Start Time: For the Hourly frequency, an additional Start Time option is available to provide the start time in hours and minutes.
Cron Expression: The cron expression generated from the selected frequency.
Number of Retries: Number of times Airflow restarts the workflow in case of failure.
  Retry Delay: If the number of retries is 1 or above, provide a value in Retry Delay (in seconds). The workflow waits for this delay before it tries to restart.
  Email on Retry: Set to True to send an email when a workflow retry is attempted, or to False to send none.
Depends on Past: Set to True if the current run of the workflow depends on its last run. Default is False.
Wait For Downstream: Controls whether a scheduled run waits for downstream tasks.
  True: A scheduled run is executed only if the previously scheduled downstream task is complete. If the downstream task has failed, the scheduled task waits until the downstream task is complete.
  False: The scheduled task runs irrespective of the downstream task.
Max Active Runs: The number of instances of this workflow that can run concurrently.
RFC: You can use this property as an environment variable in a bash operator, i.e., $RFC.
CheckpointID: You can use this property as an environment variable in a bash operator, i.e., $CheckpointID.
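
Because the workflow is deployed to Airflow, it may help to see how the scheduling fields could line up with Airflow-style DAG arguments. The following is a minimal sketch, not Gathr's actual implementation: the `WorkflowSchedule` class, the `to_airflow_kwargs` function, and the default value for Max Active Runs are hypothetical, and the Time Zone, RFC, and CheckpointID fields are left out. Only the dictionary keys (`schedule_interval`, `max_active_runs`, `retries`, `retry_delay`, `email_on_retry`, `depends_on_past`, `wait_for_downstream`) are standard Airflow argument names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class WorkflowSchedule:
    """Hypothetical container for the scheduling form fields."""
    start_date: datetime
    end_date: Optional[datetime] = None     # End Date is optional
    cron_expression: Optional[str] = None   # None: no schedule, trigger manually
    retries: int = 0
    retry_delay_seconds: int = 0
    email_on_retry: bool = False
    depends_on_past: bool = False           # documented default is False
    wait_for_downstream: bool = False
    max_active_runs: int = 1                # placeholder default, an assumption


def to_airflow_kwargs(schedule: WorkflowSchedule) -> dict:
    """Translate the form fields into Airflow-style DAG keyword arguments."""
    if schedule.end_date is not None and schedule.end_date < schedule.start_date:
        raise ValueError("End Date must not precede Start Date")

    default_args = {
        "retries": schedule.retries,
        "email_on_retry": schedule.email_on_retry,
        "depends_on_past": schedule.depends_on_past,
        "wait_for_downstream": schedule.wait_for_downstream,
    }
    # Retry Delay only matters when at least one retry is configured.
    if schedule.retries >= 1:
        default_args["retry_delay"] = timedelta(seconds=schedule.retry_delay_seconds)

    return {
        "start_date": schedule.start_date,
        "end_date": schedule.end_date,
        "schedule_interval": schedule.cron_expression,
        "max_active_runs": schedule.max_active_runs,
        "default_args": default_args,
    }


# Example: an hourly workflow that retries twice, 30 seconds apart.
hourly = WorkflowSchedule(
    start_date=datetime(2024, 1, 1),
    cron_expression="0 * * * *",
    retries=2,
    retry_delay_seconds=30,
)
airflow_kwargs = to_airflow_kwargs(hourly)
```

Keeping the retry settings in `default_args` mirrors how Airflow applies them to every task in a DAG, while the schedule and concurrency settings apply to the DAG as a whole.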

Un-schedule Workflow

To remove a workflow from a schedule, click on the SCHEDULING option. This option is available on the Workflow tile under the menu button.

A new window opens, in which you can un-schedule the workflow.

You can reschedule the workflow at any time using the same steps.

Start Workflow

After you schedule the workflow, its status changes to SCHEDULED. You can then start the workflow by clicking the Trigger button. If the workflow starts successfully, the status changes to ACTIVE.

View Summary

The workflow monitoring feature allows you to:

  • View the workflow run history.

  • View the component-level status.

  • Re-trigger a failed task.

  • Re-trigger all the failed tasks with a single click.

The View Summary page takes you to the workflow monitoring dashboard, where you can see the last 5 runs and their details.

Field: Description
Run ID: The run ID defines the current workflow.
Status: The current position of the workflow.
Start Time: The planned time for starting the workflow.
End Time: The planned time for ending the workflow.
Duration: The time duration of the workflow run.
Task Status: The status of the workflow run.

Click a Run ID to view the workflow summary. If the workflow fails, the monitor window helps you resume and run the workflow from the failed stage.

Within the workflow summary DAG window, you can perform the following functions:

Field: Description
Reload: Refreshes and reloads the status so that you can verify the stage the running workflow has reached.
Re-Trigger: Re-triggers a single failed task when you click it.
Re-Trigger Failed Task: In case of task failures, re-triggers all failed tasks when you click it.
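
The difference between re-triggering one failed task and re-triggering all failed tasks can be sketched in a few lines. This is an illustrative model only: the function names, the `run_task` callback, and the use of the state strings `"success"` and `"failed"` are assumptions for the example, not Gathr's API.

```python
from typing import Callable, Dict, List


def failed_tasks(task_states: Dict[str, str]) -> List[str]:
    """Return the IDs of all tasks whose state is 'failed'."""
    return [task_id for task_id, state in task_states.items() if state == "failed"]


def retrigger(task_states: Dict[str, str], task_id: str,
              run_task: Callable[[str], bool]) -> Dict[str, str]:
    """Re-run a single failed task and return the updated states."""
    if task_states.get(task_id) != "failed":
        raise ValueError(f"task {task_id!r} is not in a failed state")
    updated = dict(task_states)
    updated[task_id] = "success" if run_task(task_id) else "failed"
    return updated


def retrigger_all_failed(task_states: Dict[str, str],
                         run_task: Callable[[str], bool]) -> Dict[str, str]:
    """Re-run every failed task in one pass and return the updated states."""
    updated = dict(task_states)
    for task_id in failed_tasks(task_states):
        updated[task_id] = "success" if run_task(task_id) else "failed"
    return updated


# Example run with two failed tasks; the callback pretends every re-run succeeds.
run = {"extract": "success", "transform": "failed", "load": "failed"}
after_retry = retrigger_all_failed(run, run_task=lambda task_id: True)
```

Tasks that already succeeded are left untouched, which matches the idea of resuming a run from the failed stage rather than restarting it.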

Pause Workflow

Once the workflow is in the ACTIVE state, you can PAUSE the workflow from the Workflows listing page.

Resume Workflow

You can RESUME a paused workflow.
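
The status changes described for scheduling, starting, pausing, and resuming a workflow can be pictured as a small state machine. The sketch below is a simplified model under stated assumptions: SCHEDULED and ACTIVE are the statuses named in this document, while `CREATED` and `PAUSED` are hypothetical names for the not-yet-scheduled and paused states, and un-scheduling is not modeled.

```python
from enum import Enum


class Status(Enum):
    CREATED = "CREATED"      # hypothetical name: created but not yet scheduled
    SCHEDULED = "SCHEDULED"
    ACTIVE = "ACTIVE"
    PAUSED = "PAUSED"        # hypothetical name for a paused workflow


# (current status, action) -> next status
TRANSITIONS = {
    (Status.CREATED, "schedule"): Status.SCHEDULED,
    (Status.SCHEDULED, "trigger"): Status.ACTIVE,
    (Status.ACTIVE, "pause"): Status.PAUSED,
    (Status.PAUSED, "resume"): Status.ACTIVE,
}


def apply_action(status: Status, action: str) -> Status:
    """Return the next status, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action} a workflow in the {status.value} state"
        ) from None


def can_edit(status: Status) -> bool:
    """A workflow cannot be edited while it is active."""
    return status is not Status.ACTIVE


# Example: walk a workflow through its lifecycle.
status = Status.CREATED
for action in ("schedule", "trigger", "pause", "resume"):
    status = apply_action(status, action)
```

A lookup table keeps the allowed transitions in one place, so an action that does not apply to the current status (for example, pausing a workflow that was never started) fails with a clear error.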

Delete Workflow

Click the Delete button to delete a workflow and send it to Gathr’s Recycle Bin.
