Register H2O Models

Gathr lets users register H2O POJO (Plain Old Java Object) and MOJO (Model Object, Optimized) models that were trained outside of Gathr. After successful registration, the user can use the model for predictions.

In the left navigation pane, click Register Entities. Click the Models tab.

To register a new model, click the + icon on the right side of the screen.

In the Register Model window, provide the following properties:

Field: Description
Name: Name of the model.
Model API: Choose H2O as the API.
Model Format: Select the model format: MOJO (Model Object, Optimized) or POJO (Plain Old Java Object).
H2O URL: Provide the H2O URL, i.e. the URL of the running H2O instance.
Model Type: MOJO or POJO models can be registered with the H2O algorithms listed below. Choose the one that fits your use case.

1. Distributed Random Forest

2. Gradient Boosting Machine

3. Generalized Linear Modeling

4. Isolation Forest

Note: If you select POJO as the model format, a different set of model types is offered: KMeans, NaiveBayes, Deep Learning, Distributed Random Forest, Gradient Boosting Machine, and Generalized Linear Modeling.

Scope: Select either Project or Workspace as the scope of the model to be registered.

Note: If Workspace is selected, the model can be used across the workspace. If Project is selected, the model is visible only in that specific project.

Model Class (POJO models only): Provide the POJO class name.

Model Source: Register the H2O model either by uploading the zip file of a MOJO model or the Java files of a POJO model. If the files are stored on HDFS, ADLS, DBFS, or S3, provide the corresponding connection and location.

If HDFS is selected as the Model Source, the following additional parameters are displayed:

Connection Name: Choose the HDFS connection name.
Override Credential: Option to override the credentials of the HDFS connection.
HDFS Path: Provide the path where the model is located on HDFS.
Validate: Validates the model that was uploaded or that is located at the given HDFS location.

If DBFS is selected as the Model Source, the following additional parameter is displayed:

Path: Browse to the path where the model is stored.

Note: The DBFS option is available in Azure environments.

If ADLS is selected as the Model Source, the following additional parameters are displayed:

Connection Name: Select the ADLS connection name for creating the connection.
Container Name: Provide the ADLS container name.
Path: Browse to the path where the model is stored.

If S3 is selected as the Model Source, the following additional parameters are displayed:

Connection: Select the S3 connection name for creating the connection.
Bucket Name: Select the S3 bucket name for creating the connection.
Path: Browse to the path where the model is stored.
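
Before a file is supplied as the Model Source, a quick local check can catch an obviously wrong file. The sketch below is illustrative and not part of Gathr: the helper name guess_h2o_model_format is hypothetical, and it assumes that a MOJO is a zip archive containing a top-level model.ini entry and that a POJO is a .java source file. It does not replace the Validate step.

```python
import zipfile
from pathlib import Path


def guess_h2o_model_format(path):
    """Return "MOJO", "POJO", or None for a local model file.

    Assumptions (not taken from the Gathr documentation): a MOJO is a zip
    archive with a top-level model.ini entry, and a POJO is a .java file.
    """
    p = Path(path)
    if not p.is_file():
        return None
    if p.suffix.lower() == ".java":
        return "POJO"
    if zipfile.is_zipfile(p):
        with zipfile.ZipFile(p) as archive:
            if "model.ini" in archive.namelist():
                return "MOJO"
    return None
```

A result of None suggests that the file is missing or is not in either expected format, so it is worth rechecking before choosing the Model Format in the Register Model window.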

Once the model is validated successfully, the Register button next to Validate is enabled.

Click Register. After the model is registered successfully, you can view the model on the models listing page.

Note: The models listing page shows the details of the created model, including Name, Type, Parent Project (the project in which the model was created), Scope (Workspace/Project), Owner, etc.

Once the model is registered, it can be used for scoring with the H2O processor in Gathr pipelines.

The user can create a version of the registered model by clicking the CREATE VERSION button.

Note: To validate a model or to create a version, provide the path in the following format: <foldername>/<model file name>
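
The path format in the note above can be sketched as a small helper. This is a hypothetical illustration, not a Gathr API: the function name model_version_path and its checks are assumptions, and this page does not state whether nested folders or a leading slash are accepted, so the helper only trims a trailing slash from the folder name.

```python
import posixpath


def model_version_path(folder_name, model_file_name):
    """Build a path in the <foldername>/<model file name> format."""
    # Trim whitespace and a trailing slash so the join does not produce "//".
    folder = folder_name.strip().rstrip("/")
    file_name = model_file_name.strip()
    if not folder or not file_name:
        raise ValueError("both folder name and model file name are required")
    if "/" in file_name:
        raise ValueError("model file name must not contain a folder separator")
    return posixpath.join(folder, file_name)
```

For example, a folder named models and a file named gbm_model.zip (both made-up names) produce models/gbm_model.zip.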

The user can Open, Enable Drift Detection, Deploy as Service, Export Model Version or Delete the created Model from the Actions column.

Under Actions, the user can export a model version by providing the following inputs:

Field: Description
Export Source: Select the destination to which the version of the registered model is exported.

The available options are:

- Desktop

- HDFS

- ADLS

- DBFS

- S3

Upon selecting HDFS as the Export Source, provide the following fields:

Connection: Provide the HDFS connection name.
HDFS Path: Provide the HDFS path to which the version of the registered model is uploaded.

Upon selecting ADLS as the Export Source, provide the following fields:

Connection: Provide the ADLS connection name.
Container Name: Provide the ADLS container name.
ADLS Path: Provide the ADLS path for the registered model.

Upon selecting DBFS as the Export Source, provide the following field:

Path: Browse to the path where the selected model should be stored.

Note: The DBFS option is available in Azure environments.

Upon selecting S3 as the Export Source, provide the following fields:

Connection: Select the S3 connection name for creating the connection.
Bucket Name: Select the S3 bucket name for creating the connection.
Path: Browse to the path where the selected model should be stored.
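
The export fields listed above can be summarized as a simple checklist. The sketch below is a hypothetical illustration and not a Gathr API: the names EXPORT_SOURCE_FIELDS and missing_export_fields are made up, and treating Desktop as needing no extra fields is an assumption, since this page lists none for it.

```python
# Fields listed on this page for each Export Source option.
# Assumption: "Desktop" needs no extra fields, since none are listed for it.
EXPORT_SOURCE_FIELDS = {
    "Desktop": (),
    "HDFS": ("Connection", "HDFS Path"),
    "ADLS": ("Connection", "Container Name", "ADLS Path"),
    "DBFS": ("Path",),
    "S3": ("Connection", "Bucket Name", "Path"),
}


def missing_export_fields(export_source, values):
    """Return the listed fields that are absent or blank in `values`."""
    try:
        required = EXPORT_SOURCE_FIELDS[export_source]
    except KeyError:
        raise ValueError(f"unknown export source: {export_source!r}") from None
    # A field counts as missing when it is absent, None, or only whitespace.
    return [name for name in required if not str(values.get(name) or "").strip()]
```

An empty list means every field listed for the chosen Export Source has a value.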

Note:

  • For Export Model Version, use the format <foldername> to provide the path.

  • To register an H2O model, make sure the H2O server is running. To start the embedded H2O server, refer to the Installation guide.
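
Because registration requires a running H2O server, a quick reachability check against the H2O URL can be useful before opening the Register Model window. The sketch below is an assumption-laden illustration, not a Gathr feature: the function names are hypothetical, and it assumes the H2O-3 REST endpoint /3/Cloud with its cloud_healthy response field. The URL in the usage note is only an example.

```python
import http.client
import json
import urllib.error
import urllib.request


def cloud_status_url(h2o_url):
    """Return the cluster-status endpoint for an H2O base URL."""
    return h2o_url.rstrip("/") + "/3/Cloud"


def is_h2o_running(h2o_url, timeout=5.0):
    """Return True if the H2O instance reports a healthy cluster."""
    try:
        with urllib.request.urlopen(cloud_status_url(h2o_url), timeout=timeout) as resp:
            status = json.load(resp)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Unreachable host, malformed URL, or a non-JSON response.
        return False
    # Assumption: H2O-3 reports overall health in the "cloud_healthy" field.
    return isinstance(status, dict) and bool(status.get("cloud_healthy", False))
```

For instance, is_h2o_running("http://localhost:54321") checks a local instance, assuming it listens on port 54321. Use the same URL that is entered in the H2O URL field.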
