Naive Bayes Algorithm

Naive Bayes is a family of simple probabilistic classifiers that apply Bayes’ theorem with strong (naive) independence assumptions between the features. The processor currently supports both multinomial Naive Bayes and Bernoulli Naive Bayes.
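
To make the two supported variants concrete, here is a minimal sketch in plain Python of how each one scores a record and how Bayes’ theorem turns those scores into class probabilities. It is an illustration only and not the processor's own implementation; the function names, priors, and probability values are hypothetical.

```python
import math

def multinomial_log_likelihood(counts, theta):
    """Multinomial model: counts[i] is how often feature i occurs in the record;
    theta[i] = P(feature i | class), with theta summing to 1.
    The multinomial coefficient is the same for every class, so it is omitted."""
    return sum(x * math.log(t) for x, t in zip(counts, theta))

def bernoulli_log_likelihood(present, p):
    """Bernoulli model: present[i] is 0 or 1;
    p[i] = P(feature i is present | class)."""
    return sum(math.log(pi if x else 1.0 - pi) for x, pi in zip(present, p))

def posterior(priors, log_likelihoods):
    """Bayes' theorem: P(class | x) is proportional to P(class) * P(x | class)."""
    log_scores = [math.log(pr) + ll for pr, ll in zip(priors, log_likelihoods)]
    top = max(log_scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical two-class example with three features.
priors = [0.4, 0.6]
theta = [[0.5, 0.3, 0.2], [0.1, 0.3, 0.6]]  # per-class feature probabilities
counts = [2, 0, 1]                          # feature counts for one record

probs = posterior(priors, [multinomial_log_likelihood(counts, t) for t in theta])
predicted_class = probs.index(max(probs))
```

The independence assumption is what allows each per-feature term to be summed separately in log space. The multinomial variant suits count-valued features, while the Bernoulli variant expects binary (present or absent) features.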

The Naive Bayes Analytics processor analyzes data using ML’s NaiveBayesModel.

To use a Naive Bayes model in a Data Pipeline, drag and drop the model component onto the pipeline canvas and right-click it to configure.

The Configuration Section of every ML model is identical.

After the Configuration tab comes the Feature Selection tab, which is identical for all models except K Means.

Once Feature Selection is done, perform Pre-Processing on the data before feeding it to the model. The configuration settings are identical for all the ML models.

Then configure the Model using Model Configuration.

Model Configuration

Label Column: Name of the column to be treated as the label column when training the model.

Probability Column: Name of the column that holds the probability value of the predicted output.

Feature Column: Name of the column to be treated as the feature column when training the model.

Prediction Column: Name of the column that holds the predicted output. The value of Prediction Column must be set to “prediction” in order to deploy the model as a REST service.

Model Type: Model type for the Naive Bayes classifier. The default is multinomial; the other supported model type is Bernoulli.

Thresholds: Specify the threshold value for each output class. The number of thresholds must equal the number of output classes.

Smoothing: Smoothing parameter used when training the Naive Bayes classifier. The default value is 1.0.
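
As an illustration of the Smoothing and Thresholds settings, the following plain-Python sketch shows additive smoothing of feature counts and a per-class threshold rule. Two assumptions are made here: that smoothing is additive (Laplace-style), and that thresholds are applied with the p/t rule documented for Spark ML classifiers, where the class with the largest probability-to-threshold ratio is predicted. The function names and sample values are hypothetical, and the processor itself may behave differently.

```python
def smoothed_probabilities(counts, smoothing=1.0):
    """Additive smoothing: (count + smoothing) / (total + smoothing * n).
    With smoothing > 0, a feature never seen in training still gets
    a non-zero probability."""
    total = sum(counts)
    n = len(counts)
    return [(c + smoothing) / (total + smoothing * n) for c in counts]

def predict_with_thresholds(probabilities, thresholds):
    """Return the index of the class with the largest probability / threshold.
    ASSUMPTION: this mirrors the p/t rule described for Spark ML classifiers."""
    if len(probabilities) != len(thresholds):
        raise ValueError("need exactly one threshold per output class")
    if any(t <= 0 for t in thresholds):
        raise ValueError("this sketch expects strictly positive thresholds")
    scaled = [p / t for p, t in zip(probabilities, thresholds)]
    return scaled.index(max(scaled))

# Hypothetical per-class feature counts; the second feature was never observed.
feature_counts = [3, 0, 1]
smoothed = smoothed_probabilities(feature_counts)           # smoothing = 1.0
unsmoothed = smoothed_probabilities(feature_counts, 0.0)    # zero stays zero

# Hypothetical three-class probabilities with one threshold per class.
class_index = predict_with_thresholds([0.6, 0.3, 0.1], [0.9, 0.3, 0.5])
```

In this example a high threshold on the first class shifts the prediction to the second class even though the first class has the highest raw probability. A lower threshold makes a class easier to predict; a higher one makes it harder.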

After Model Configuration and Post-Processing are done, Model Evaluation can be performed.

Then apply the Hyper Parameters to the model to tune your configuration, after which you can add notes and save the configuration.
