Python Configuration

Gathr supports Python 3. To enable it, configure the cluster as follows:

Python 3 (preferred 3.8.8) should be installed on all nodes of the cluster.

Python 3 should be the default Python version on all nodes.

Install Python 3 as the root user and create a soft link to the Python 3 binaries in /usr/bin so that they are available to all users.

Python 3 should be accessible with the command ‘python3’.
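
The requirements above can be spot-checked on each node with a short script. The following is a minimal sketch, not part of Gathr: the helper names (`is_python3`, `is_preferred`, `report`) are hypothetical, and it assumes that any Python 3 release is acceptable, with 3.8.8 only preferred.

```python
import shutil
import sys

# Assumption: 3.8.8 is the preferred release named above; any Python 3 is accepted.
PREFERRED = (3, 8, 8)


def is_python3(version):
    """Return True if the version tuple belongs to the Python 3 series."""
    return version[0] == 3


def is_preferred(version, preferred=PREFERRED):
    """Return True if the first three components match the preferred release."""
    return tuple(version[:3]) == preferred


def report():
    """Print what this node exposes as `python3` and which interpreter runs this script."""
    path = shutil.which("python3")  # None if python3 is not on PATH
    running = tuple(sys.version_info[:3])
    print("python3 on PATH:", path or "not found")
    print("running interpreter:", ".".join(map(str, running)))
    print("is Python 3:", is_python3(running))
    print("is preferred release:", is_preferred(running))
    return path is not None and is_python3(running)


if __name__ == "__main__":
    report()
```

Running the script as a non-root user with `python3` also shows whether the link in /usr/bin is visible to that user.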


Installing all the required libraries for both Python versions is advisable but not mandatory.

Python Libraries for Registering Models

To register H2O, Scikit, or Spark ML models and use them in scoring pipelines, follow the steps below:

For Python 3

  1. Install Python 3.8.8

  2. Install pip3

  3. Install H2O using the command below:

    pip3 install h2o==3.28.1.3
    
  4. Install the scikit dependencies using the commands below:

    root> pip3 install scikit-learn==0.24.1
    root> pip3 install numpy==1.19.2
    root> pip3 install pandas==1.2.4
    root> pip3 install matplotlib==3.4.2
    root> pip3 install mlflow==1.0.0
    root> pip3 install hdfs==2.5.8
    root> pip3 install scipy==1.6.2
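
After the installs complete, the pinned versions can be compared with what is actually present on a node. The sketch below is an illustration rather than a Gathr utility: `PINS` is copied from the commands above, while `installed_version` and `find_mismatches` are hypothetical helper names. It relies on `importlib.metadata`, which is in the standard library from Python 3.8 onward.

```python
from importlib import metadata  # standard library since Python 3.8

# Pins copied from the pip3 install commands above.
PINS = {
    "h2o": "3.28.1.3",
    "scikit-learn": "0.24.1",
    "numpy": "1.19.2",
    "pandas": "1.2.4",
    "matplotlib": "3.4.2",
    "mlflow": "1.0.0",
    "hdfs": "2.5.8",
    "scipy": "1.6.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Prints one line per package that is missing or at a different version.
    for name, (expected, found) in sorted(find_mismatches(PINS).items()):
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

No output means every pinned package is installed at the listed version for the interpreter that ran the script.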
    