Register Scikit Model

Gathr lets you register a Scikit model that was trained outside of Gathr. After successful registration, you can use the model to make predictions.
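
As a rough illustration of what "trained outside of Gathr" might look like, the sketch below trains a small scikit-learn classifier, pickles it, and packs it into a zip archive. The file names (`model.pkl`, `scikit_model.zip`), the use of `pickle`, and the single-file archive layout are assumptions made for this example; confirm the packaging Gathr expects before uploading.

```python
import pickle
import zipfile
from pathlib import Path

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


def train_and_package(out_dir="."):
    """Train a small classifier and package it as a zip archive."""
    # Toy dataset: four numeric features, three classes.
    data = load_iris()
    model = LogisticRegression(max_iter=1000)
    model.fit(data.data, data.target)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    model_path = out / "model.pkl"       # hypothetical file name
    zip_path = out / "scikit_model.zip"  # hypothetical archive name

    # Serialize the fitted estimator.
    with open(model_path, "wb") as fh:
        pickle.dump(model, fh)

    # Pack the pickle into a zip (assumed layout: one file at the root).
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(model_path, arcname=model_path.name)

    return zip_path
```

A pickled model generally needs to be loaded with a scikit-learn version compatible with the one that produced it, so it is worth noting the version used for training.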

To register a new model, click the + icon on the right side of the screen.

In the Register Model window, specify the following properties:

Field: Description
Name: Provide a unique name for the model.
Model API: Choose Scikit as the API.
Model Category: Scikit provides a range of ML algorithms. Select one of the following options:

- Classification

- Clustering

- Pipeline

- Regression

Feature List: The list of features used to train the model. Specify the features either by entering the names manually or by uploading a .csv file.
Scope: Select the scope (visibility) of the model, at either the project or the workspace level.
Model Source: Register the Scikit model either by uploading a zip file or by selecting a storage location: HDFS, DBFS, ADLS, or S3.
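
For the Feature List upload, a .csv file can be generated with Python's standard `csv` module. The sketch below is only illustrative: the single-row layout, the file name `features.csv`, and the helper name `write_feature_list` are assumptions, since the exact layout Gathr expects is not described here.

```python
import csv
from pathlib import Path


def write_feature_list(features, path="features.csv"):
    """Write feature names as a single comma-separated row (assumed layout)."""
    names = [name.strip() for name in features]
    if not names or any(not n for n in names):
        raise ValueError("feature names must be non-empty")
    if len(set(names)) != len(names):
        raise ValueError("feature names must be unique")

    target = Path(path)
    # newline="" lets the csv module control line endings itself.
    with target.open("w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerow(names)
    return target
```

Listing the names in the same order as the columns the model was trained on avoids a mismatch between the feature list and the model's inputs.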

If HDFS is selected as the Model Source, the following additional parameters are displayed:

Connection Name: Choose the HDFS connection name.
Override Credential: Option to override the credentials for the HDFS connection.
HDFS Path: Provide the path where the model is located on HDFS.
Validate: Validates the model, whether it was uploaded or is located at the given HDFS path.

If DBFS is selected as the Model Source, the following additional parameter is displayed:

Path: Browse to the path where the model is stored.

Note: The DBFS option is available in Azure environments.

If ADLS is selected as the Model Source, the following additional parameters are displayed:

Connection Name: Select the ADLS connection name for creating the connection.
Container Name: Provide the ADLS container name.
Path: Browse to the path where the model is stored.

If S3 is selected as the Model Source, the following additional parameters are displayed:

Connection: Select the S3 connection name for creating the connection.
Bucket Name: Select the S3 bucket that contains the model.
Path: Browse to the path where the model is stored.

Click the Validate button. Once the model is validated successfully, the Register button next to Validate is enabled. Click Register. After the model is registered, it appears on the Models page and can be used in a data pipeline to make predictions.
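
Before clicking Validate, it can help to sanity-check the artifact locally. The sketch below is not Gathr's validation logic; it is a hypothetical pre-check that assumes the archive holds a pickled model with a `.pkl` extension. It confirms that the object exposes `predict` and, where scikit-learn recorded `n_features_in_`, that the count matches the length of the feature list.

```python
import pickle
import zipfile


def precheck_model_zip(zip_path, feature_names):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    with zipfile.ZipFile(zip_path) as zf:
        pickles = [n for n in zf.namelist() if n.endswith(".pkl")]
        if not pickles:
            return ["no .pkl file found in the archive"]
        # Unpickling runs arbitrary code: only open archives you trust.
        model = pickle.loads(zf.read(pickles[0]))

    if not callable(getattr(model, "predict", None)):
        problems.append("object has no predict() method")

    # Fitted scikit-learn estimators usually record the training feature count.
    expected = getattr(model, "n_features_in_", None)
    if expected is not None and expected != len(feature_names):
        problems.append(
            f"model expects {expected} features, got {len(feature_names)}"
        )
    return problems
```

An empty result only means these local checks passed; the Validate button in Gathr remains the authoritative check.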

Note: To make predictions with the registered Scikit model, select the Scikit processor when creating a data pipeline.
