Processor Group

A Processor Group, or pipeline sub-flow, is a chain of transformations packaged as a single unit that can be reused as a component in different pipelines.

Example: In some business scenarios, a user may want to reuse a combination of processors, such as the Union and Limit processors (as shown in the image below), in several pipelines. A processor group removes the need to select each processor individually every time, which reduces pipeline creation time.

[Image: Processor Group on canvas]
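The reuse idea in the example above can be sketched in plain Python. This is only an illustration of the concept, not Gathr's implementation or API: the helper names `union_with`, `limit`, and `processor_group` are hypothetical, and records are modeled as simple dictionaries.

```python
from functools import reduce
from typing import Callable, Dict, Iterable, List

Record = Dict[str, int]
Step = Callable[[List[Record]], List[Record]]


def union_with(extra: Iterable[Record]) -> Step:
    """Hypothetical 'union' step: append records from a second source."""
    extra_rows = list(extra)

    def step(rows: List[Record]) -> List[Record]:
        return rows + extra_rows

    return step


def limit(n: int) -> Step:
    """Hypothetical 'limit' step: keep only the first n records."""

    def step(rows: List[Record]) -> List[Record]:
        return rows[:n]

    return step


def processor_group(*steps: Step) -> Step:
    """Bundle a chain of steps into one reusable unit."""

    def run(rows: List[Record]) -> List[Record]:
        # Apply each step in order, feeding the output of one into the next.
        return reduce(lambda acc, s: s(acc), steps, list(rows))

    return run


# Define the chain once ...
union_then_limit = processor_group(union_with([{"id": 3}]), limit(2))

# ... and reuse it in two different "pipelines".
pipeline_a = union_then_limit([{"id": 1}, {"id": 2}])
pipeline_b = union_then_limit([{"id": 9}])
```

The chain is defined once and applied to two different inputs, which mirrors how a processor group saves re-selecting the same processors in each pipeline.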

Steps to Create a Processor Group

To open the Processor Group listing page, select Processor Group from the Project home page.

To create a new processor group, click the + icon at the top right of the screen.

The processor group canvas page opens.

On the canvas, select the input channel(s), choose the transformation(s), and select the output emitter.

Steps to create a processor group:

1. Select an input channel and configure it:

If your data contains a header, select the header included option and upload the data file. Click Next to detect the schema. Save notes, if required, and click Next.

2. Select processor(s) from the Processors/Analytics tab under the components panel and configure them.

3. Select an output emitter.

Save your processor group and provide the following details:

| Field | Description |
| --- | --- |
| Name | Enter a name for the processor group. |
| Tag | Optionally, add tags. |
| Scope | Select either Project or Workspace as the scope. Note: With Workspace scope, the Processor Group is visible across all Projects of that Workspace. With Project scope, the Processor Group is visible only within that Project. |
| Comment | Optionally, add comments. Click Save. |
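The Scope rule can be expressed as a small visibility check. The sketch below is a hypothetical model written for illustration: the `ProcessorGroup` class, its fields, and `is_visible` are assumptions, not part of Gathr, and the workspace and project names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessorGroup:
    """Hypothetical record of a saved processor group."""
    name: str
    scope: str       # "Project" or "Workspace"
    workspace: str   # workspace in which the group was created
    project: str     # project in which the group was created


def is_visible(group: ProcessorGroup, workspace: str, project: str) -> bool:
    """Return True if the group is visible from the given project."""
    if group.workspace != workspace:
        # Neither scope reaches beyond the owning workspace.
        return False
    if group.scope == "Workspace":
        # Visible across all projects of that workspace.
        return True
    # Project scope: visible only within the owning project.
    return group.project == project


shared = ProcessorGroup("dedupe", "Workspace", "analytics", "sales")
private = ProcessorGroup("cleanup", "Project", "analytics", "sales")
```

With these sample values, `shared` is visible from any project in the `analytics` workspace, while `private` is visible only from the `sales` project.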


Note: A processor group requires at least one input channel, one processor, and one emitter.

You can also edit or delete a processor group by clicking the edit or delete icon next to it on the processor group listing page.

Note: When a pipeline that contains a processor group is imported, the default scope of that processor group is Project.

List of unsupported processors:

- Custom
- Scala
- Python
- ML Models (Spark ML)
- H2O Models
- Python Model
- PMML Model
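Taken together, the minimum-component requirement and the unsupported list amount to a simple pre-save check. The following is a minimal sketch under assumptions: `validate_group` is a hypothetical helper, components are identified by plain name strings, and the unsupported names are copied from the list above as a flat set.

```python
from typing import List

# Names taken from the unsupported list above, treated as a flat set.
UNSUPPORTED_PROCESSORS = {
    "Custom",
    "Scala",
    "Python",
    "ML Models (Spark ML)",
    "H2O Models",
    "Python Model",
    "PMML Model",
}


def validate_group(
    channels: List[str], processors: List[str], emitters: List[str]
) -> List[str]:
    """Return a list of problems; an empty list means the group looks valid."""
    problems = []
    if not channels:
        problems.append("at least one input channel is required")
    if not processors:
        problems.append("at least one processor is required")
    if not emitters:
        problems.append("at least one emitter is required")
    for name in processors:
        if name in UNSUPPORTED_PROCESSORS:
            problems.append(f"unsupported processor: {name}")
    return problems


# Component names here are placeholders chosen for illustration.
ok = validate_group(["File"], ["Union", "Limit"], ["Console"])
bad = validate_group([], ["Scala"], ["Console"])
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.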

Note: Processor Group Import/Export is currently not supported.

Steps to Use a Processor Group

To use a Processor Group created in Gathr, navigate to the Pipeline tab and do the following:

- Create or edit a pipeline on the pipeline canvas, and add the Processor Group processor after each component to which you want to apply the Processor Group.

- On the Configuration tab of the Processor Group processor, select the name of the Processor Group that you want to apply.

- Configure additional schema details, if required, and click Next until you reach the Notes tab.

- Add notes, if any, and click Done.

For more details about the Processor Group processor, see Processor Group.