Register H2O Models
Gathr lets users register H2O POJO (Plain Old Java Object) and MOJO (Model Object, Optimized) models that were trained outside of Gathr. After successful registration, the model can be used for predictions.
In the left navigation pane, click Register Entities. Click the Models tab.
To register a new model, click the + icon on the right side of the screen.
In the Register Model window, specify the following properties:
Field | Description |
---|---|
Name | Name of the Model. |
Model API | Choose H2O as the API. |
Model Format | Select the model format: MOJO (Model Object, Optimized) or POJO (Plain Old Java Object). |
H2O URL | Provide the H2O URL, i.e. the URL of the running H2O instance. |
Model Type | MOJO and POJO models can be registered with the following H2O algorithms. Choose the one that fits your use case. For MOJO: 1. Distributed Random Forest 2. Gradient Boosting Machine 3. Generalized Linear Modeling 4. Isolation Forest. Note: If you select POJO, the available model types are different: KMeans, NaiveBayes, Deep Learning, Distributed Random Forest, Gradient Boosting Machine, Generalized Linear Modeling. |
Scope | Select either Project or Workspace as the scope of the model to be registered. If Workspace is selected, the model can be used across the workspace. If Project is selected, the model is visible only in that specific project. |
Model Class | Applicable when the Model Format is POJO. Enter the POJO class name. |
Model Source | Register the H2O model either by uploading the zip file of MOJO models or the Java files of POJO models. If the files are placed on HDFS/ADLS/DBFS/S3, mention the HDFS/ADLS/DBFS/S3 connection and location. |
If HDFS is selected as the Model Source, the following additional parameters are displayed:
Connection Name | Choose the HDFS connection name. |
Override Credential | Option to override credentials for HDFS connection. |
HDFS Path | Provide the path where the model is located on HDFS. |
Validate | Validates the uploaded model or the model located at the given HDFS path. |
If DBFS is selected as the Model Source, the following additional parameters are displayed:
Path | Browse to the path where the model is stored. Note: The DBFS option is available only in Azure environments. |
If ADLS is selected as the Model Source, the following additional parameters are displayed:
Connection Name | Select the ADLS connection name for creating the connection. |
Container Name | Provide the ADLS container name. |
Path | Browse the path where the model is stored. |
If S3 is selected as the Model Source, the following additional parameters are displayed:
Connection | Select S3 connection name for creating the connection. |
Bucket Name | Select the S3 bucket name for creating the connection. |
Path | Browse the path where the model is stored. |
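Before uploading a file as the Model Source, it can help to sanity-check it locally. The sketch below is not part of Gathr. It assumes that an H2O MOJO is a zip archive with a `model.ini` entry at its root, and that a POJO Java file declares a public class extending `GenModel` whose name is the value to enter in Model Class. The function names `looks_like_mojo` and `pojo_class_name` are hypothetical helpers introduced for illustration.

```python
import io
import re
import zipfile

# Assumption: a POJO source file declares "public class <Name> extends GenModel".
_POJO_CLASS = re.compile(r"public\s+class\s+(\w+)\s+extends\s+GenModel\b")


def looks_like_mojo(data: bytes) -> bool:
    """Return True if data is a zip archive with a root-level model.ini entry."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return "model.ini" in archive.namelist()


def pojo_class_name(java_source: str):
    """Return the POJO class name found in the Java source, or None."""
    match = _POJO_CLASS.search(java_source)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical POJO source line, used only to demonstrate the helper.
    sample = "public class gbm_demo extends GenModel {"
    print(pojo_class_name(sample))
```

A check like this only catches obviously wrong files early; the Validate button in Gathr remains the authoritative check.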
Once the model is validated successfully, the Register button next to Validate is enabled.
Click Register. After the model is registered successfully, you can view the model on the models listing page.
Note: The models listing page shows details of the created model, such as Name, Type, Parent Project (the project in which the model is created), Scope (Workspace/Project), and Owner.
Once the model is registered, it can be used for scoring with the H2O processor in Gathr pipelines.
The user can create a version of the registered model by clicking the CREATE VERSION button.
Note: For both Validate and Create Version, provide the path in the following format: <foldername>/<model file name>
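The path format from the note above can be sketched as a small helper. This is an illustration only: the function name `model_path` and the sample folder and file names are hypothetical, and the handling of a trailing slash on the folder is an assumption rather than documented Gathr behaviour.

```python
def model_path(folder: str, file_name: str) -> str:
    """Build a path of the form <foldername>/<model file name>."""
    # Assumption: a trailing slash on the folder is dropped before joining.
    folder = folder.rstrip("/")
    if not folder or not file_name:
        raise ValueError("both a folder name and a model file name are required")
    if "/" in file_name:
        raise ValueError("the model file name must not contain '/'")
    return f"{folder}/{file_name}"


if __name__ == "__main__":
    # Hypothetical names, chosen only for the example.
    print(model_path("h2o_models", "gbm_model.zip"))  # h2o_models/gbm_model.zip
```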
The user can Open, Enable Drift Detection, Deploy as Service, Export Model Version or Delete the created Model from the Actions column.
To export a model version, select Export Model Version from the Actions column and provide the following inputs:
Field | Description |
---|---|
Export Source | The destination to which the version of the registered model is exported. The available options are: - Desktop - HDFS - ADLS - DBFS - S3 |
Upon selecting HDFS as Export Source option, provide the below fields:
Connection | Provide the HDFS connection name. |
HDFS Path | Provide the HDFS path to upload the version of the registered model. |
Upon selecting ADLS as Export Source option, provide the below fields:
Connection | Provide the ADLS connection name. |
Container Name | Provide the ADLS container name. |
ADLS Path | Provide the ADLS Path for the registered model. |
Upon selecting DBFS as Export Source option, provide the below fields:
Path | Browse to the path where the selected model should be stored. Note: The DBFS option is available only in Azure environments. |
Upon selecting S3 as Export Source option, provide the below fields:
Connection | Select S3 connection name for creating the connection. |
Bucket Name | Select the S3 bucket name for creating the connection. |
Path | Browse the path where the selected model should be stored. |
Note:
For Export Model Version use the format <foldername> to provide the path.
To register an H2O model, make sure the H2O server is running. To start the embedded H2O server, refer to the Installation guide.
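One way to confirm that the H2O server is reachable before registering a model is to query its REST API. The sketch below is not a Gathr feature. It assumes the H2O instance exposes the `/3/Cloud` status endpoint and reports a `cloud_healthy` flag in its JSON response; the function names and the URL `http://localhost:54321` are placeholders for illustration.

```python
import json
import urllib.error
import urllib.request


def cloud_status_url(h2o_url: str) -> str:
    """Build the status endpoint URL from the H2O base URL."""
    return h2o_url.rstrip("/") + "/3/Cloud"


def is_healthy(payload) -> bool:
    """Interpret the status response; assumes a 'cloud_healthy' boolean field."""
    return isinstance(payload, dict) and bool(payload.get("cloud_healthy", False))


def h2o_is_running(h2o_url: str, timeout: float = 5.0) -> bool:
    """Return True if the H2O instance answers and reports itself healthy."""
    try:
        with urllib.request.urlopen(cloud_status_url(h2o_url), timeout=timeout) as response:
            return is_healthy(json.load(response))
    except (urllib.error.URLError, ValueError, OSError):
        # Unreachable host, malformed URL, or a non-JSON response.
        return False


if __name__ == "__main__":
    # Placeholder URL; use the value entered in the H2O URL field.
    print(h2o_is_running("http://localhost:54321"))
```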