
    Types of ML Classification Algorithms:

    Classification algorithms can be broadly divided into two main categories, linear models and non-linear models, as shown in the figure below.

    Figure: Types of ML classification algorithms
    Linear regression model:

    Linear regression is a supervised learning algorithm that predicts a continuous outcome with a constant slope. It predicts values within a continuous range, such as an amount, rather than classifying them into discrete categories. The linear family covered here consists of two common algorithms, as shown in the figure.

    Figure: Common algorithms in the linear model family

    Let's understand these two common linear model algorithms in detail:

    Logistic Regression:

    It is one of the most common supervised learning algorithms and is used for classification problems. Its parameters are fitted by maximum likelihood estimation. It predicts a categorical dependent variable from independent variables by passing the weighted sum of the inputs through an activation function, producing an outcome between 0 and 1. This activation function is called the sigmoid function, and the resulting curve is known as the sigmoid curve. Logistic regression can also estimate the probability of each of two classes, for example the probability that it will rain today; for such estimates to be reliable, the dataset should be as error-free as possible.

    Logistic regression equation:

    Linear equation: a = b0 + b1x1 + b2x2 + ... + bnxn

    Sigmoid function: S(a) = 1 / (1 + e^(-a))

    Where, S(a) = sigmoid function

    e = Euler's number

    Replacing a in the sigmoid function with the linear equation gives the logit form:

    logit(S) = ln(S / (1 - S)) = b0 + b1x1 + b2x2 + ... + bnxn
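
    To make the mapping concrete, here is a minimal sketch of the sigmoid transform in Python (the coefficient and input values below are illustrative assumptions, not taken from a fitted model):

    import numpy as np

    def sigmoid(a):
        # S(a) = 1 / (1 + e^(-a))
        return 1 / (1 + np.exp(-a))

    # Illustrative coefficients b0, b1 and inputs x1 (assumed values)
    b0, b1 = -1.5, 0.8
    x1 = np.array([0.0, 1.0, 2.0, 5.0])
    a = b0 + b1 * x1       # the linear equation
    print(sigmoid(a))      # every output lies strictly between 0 and 1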

    Now, let's understand, with the help of an example, how to implement logistic regression in Python using a Jupyter notebook:

    Step 1: Import Python modules

    from matplotlib import pyplot as plt

    from sklearn.datasets import make_classification

    from sklearn.linear_model import LogisticRegression

    from sklearn.model_selection import train_test_split

    from sklearn.metrics import confusion_matrix

    import pandas as pd

    Firstly, the required libraries must be imported: the matplotlib library helps with visualising the data during analysis, and make_classification, found in sklearn.datasets, is used to generate the dataset. LogisticRegression is imported from sklearn.linear_model to build the model, and train_test_split is imported from sklearn.model_selection to split the dataset into training and test sets. confusion_matrix is imported from sklearn.metrics to produce the confusion matrix of the classifier, and pandas is used for managing the datasets.

    Step 2: Input dataset

    x, y = make_classification(

    n_samples=100,

    n_features=1,

    n_classes=2,

    n_clusters_per_class=1,

    flip_y=0.03,

    n_informative=1,

    n_redundant=0,

    )

    In the above program, two variables are created: the independent variable (x) and the dependent variable (y). The make_classification function produces the dataset, with parameters specifying the number of samples, the number of features, the number of classes, and other properties.

    Step 3: Visualise the dataset

    plt.scatter(x, y, c=y, cmap='rainbow')

    plt.title('Scatter Plot of Logistic Regression')

    plt.show()

    The scatter plot displays the existing data, one feature value per sample plotted against its class label; these are the points the logistic regression curve will later be fitted across once the model is created with the fit function.

    Step 4: Train the model

    x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1)

    log_reg = LogisticRegression()

    log_reg.fit(x_train, y_train)

    Now, the training dataset is used to train the model, while the test dataset is used to test the model's performance on new data.

    Figure: Output of logistic regression

    Output:

    LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                       intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
                       penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
                       verbose=0, warm_start=False)
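
    The confusion_matrix imported in Step 1 has not been used yet. As a minimal sketch (continuing with the variables defined above), the trained model can be evaluated on the test set like this:

    y_pred = log_reg.predict(x_test)          # predict class labels for the held-out data
    print(confusion_matrix(y_test, y_pred))   # rows: actual classes, columns: predicted classes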

  • Support Vector Machines (SVM):

    It is a common supervised learning algorithm used for both classification and regression problems, though primarily for classification. The aim of the algorithm is to find the best line or decision boundary; the best decision boundary is known as a hyperplane, and the extreme vectors or points closest to it are used to construct it. SVM aims to find the hyperplane that separates the two classes, as shown in the figure.

    Figure: An SVM hyperplane separating two classes

    Suppose we consider two candidate hyperplanes and compare their margins, shown as X1 and X2. Since X1 > X2, the hyperplane with the larger margin X1 better divides the green and blue classes, so it is chosen as the new boundary.

    Figure: Comparing the margins of two candidate hyperplanes

    The extreme vectors closest to the hyperplane are known as support vectors, and hence the algorithm is termed a support vector machine. The hyperplane divides the n-dimensional space into classes so that, in future, a new data point can be placed in the accurate category.

    Now, let's understand, with the help of an example, how to implement a support vector machine (SVM) in Python using a Jupyter notebook:

    Step 1: Import Python modules

    import numpy as np

    import matplotlib.pyplot as plt

    from sklearn.datasets import make_blobs

    Firstly, import all the Python modules. The make_blobs function is part of sklearn.datasets, a package whose methods generate sample data. scikit-learn creates these datasets, which are then used to measure the efficiency of the models.

    Step 2: Input dataset

    X, Y = make_blobs(n_samples=350, centers=2, random_state=0, cluster_std=0.20)

    The make_blobs function generates the dataset: 350 samples grouped around two centers.

    Step 3: Visualise the dataset and candidate boundaries

    plt.scatter(X[:, 0], X[:, 1], c=Y, s=50, cmap='winter')
    plt.show()

    xfit = np.linspace(-1, 3.5)

    for m, b, d in [(1, 0.55, 0.30), (0.2, 1.8, 0.25), (-0.1, 2.6, 0.1)]:
        yfit = m * xfit + b
        plt.plot(xfit, yfit, '-k')
        plt.fill_between(xfit, yfit - d, yfit + d, edgecolor='none',
                         color='#AAAAAA', alpha=0.4)

    plt.xlim(-1, 3.5)
    plt.scatter(X[:, 0], X[:, 1], c=Y, s=50, cmap='winter')
    plt.show()

    The scatter plot shows the two classes, and the three shaded bands are candidate separating lines with their margins drawn around them; the best separator is the one with the widest margin.

    Figure: Candidate separating lines and their margins

    Now, to check how well a model fits this data, the following code is given below; the output score is approximately 0.9875.

    from sklearn.linear_model import LinearRegression

    regressor = LinearRegression().fit(X, Y)

    from sklearn.metrics import r2_score

    # r2_score expects the true values first, then the predictions
    print(r2_score(Y, regressor.predict(X)))
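
    The snippet above scores a plain linear regression on the class labels. Since the subject of this section is the SVM itself, here is a minimal sketch of fitting an actual SVM classifier on the same blobs; this is an illustrative addition, not code from the original steps:

    from sklearn.svm import SVC

    # A linear-kernel support vector classifier on the blob data
    svm_clf = SVC(kernel='linear', C=1.0)
    svm_clf.fit(X, Y)

    print(svm_clf.support_vectors_)   # the points that define the margin
    print(svm_clf.score(X, Y))        # mean accuracy on the training data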

  • Non-linear regression models:

    Simply put, a function that is not linear is called non-linear. Higher-degree polynomials are non-linear, trigonometric functions such as sin and cos are non-linear, and square roots are non-linear as well. Now, let's understand with the help of an example how to implement non-linear regression in Python using a Jupyter notebook; the workflow is similar to that of the linear models.

    Step 1: Import Python modules

    To import the Python modules, use:

    import numpy as np

    import matplotlib.pyplot as plt

    from sklearn.linear_model import LinearRegression

    Step 2: Input dataset

    X = np.random.randn(120,1)

    Y = np.random.uniform(-10,10,(120,))

    Step 3: Construct the features and target

    X = np.hstack((X, X*X))   # append the squared term as a second feature

    Z = (3*X[:,1] + Y)        # target built from the squared feature plus noise

    Step 4: Visualise the dataset

    plt.scatter(X[:, 0], Z)

    plt.show()

    plt.scatter(X[:, 1], Z)

    plt.show()

    Figure: Output of non-linear regression

    Now, to check the fit of the model, the following code prints the R² score.

    from sklearn.linear_model import LinearRegression

    # Fit against the constructed target Z rather than the raw noise Y
    regressor = LinearRegression().fit(X, Z)

    from sklearn.metrics import r2_score

    # r2_score expects the true values first, then the predictions
    print(r2_score(Z, regressor.predict(X)))
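
    scikit-learn can also construct the polynomial features automatically. Here is a minimal sketch using PolynomialFeatures in a pipeline; this is an illustrative addition, assuming the quadratic relationship built above is what we want to recover:

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression

    # Degree-2 polynomial expansion followed by an ordinary linear fit
    model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
    model.fit(X[:, [0]], Z)             # fit on the original (unsquared) feature only
    print(model.score(X[:, [0]], Z))    # R² of the polynomial fit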

    Further, the non-linear model family covered here consists of four common algorithms, as shown in the figure.

    Figure: Common algorithms in the non-linear model family

    Let's understand these four common non-linear model algorithms in detail:

    • K-Nearest Neighbours:

      It is useful for classification problems. It evaluates the distance between an input point and the stored data and then gives an outcome: predictions are based on the similarity between existing data and new data, so a new case is placed in the category of the existing cases it most resembles and can be easily identified and classified. It is a non-parametric algorithm, meaning it makes no assumptions about the underlying data. kNN evaluates the distance between two data points using the Euclidean distance formula:

      d(a, b) = √( Σ i=1..n (bi - ai)² )

      Where,

      a, b are two different points in Euclidean space.

      bi and ai are the i-th coordinates of the two points.

      n is the dimension of the Euclidean n-space.
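
      As a quick illustration (the two points are assumed values), the distance can be computed directly in NumPy:

      import numpy as np

      a = np.array([1.0, 2.0])
      b = np.array([4.0, 6.0])

      # Square root of the sum of squared coordinate differences
      d = np.sqrt(np.sum((b - a) ** 2))
      print(d)   # 5.0 for these two points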

      Now, let's illustrate with an example. The figure below plots a training dataset consisting of twelve points: six blue and six orange. The aim is to classify the data point marked with a black cross (x).

      The steps are as follows:

    1. Select the value of K = 3.
    2. Select the closest observations around the cross, as shown in the figure below; there are two blue dots and one orange dot around it.
    3. Calculate the probability for each class:

       P(blue class | observation) = 2/3

       P(orange class | observation) = 1/3

    4. Since the blue class has the maximum probability, classify the black cross as belonging to the blue class.
    5. Repeat the process until all the data points are classified.
    Figure: Classifying a new point with K = 3

    Now, let's understand, with the help of an example, how to implement KNN in Python using a Jupyter notebook.

    Step 1: Import the Python modules

    import numpy as np

    import matplotlib.pyplot as plt

    from sklearn.datasets import make_blobs

    from sklearn.neighbors import KNeighborsClassifier

    from sklearn.model_selection import train_test_split

    a,b = make_blobs(n_features=2, centers=1)

    The make_blobs function, part of sklearn.datasets, generates sample data. By default, the KNeighborsClassifier algorithm searches for the five closest neighbours. The training dataset is used to train the model, while the test dataset is used to test the model's performance on new data; a minimal sketch of actually fitting the classifier follows the output figure below.

    Step 2: Input data and Visualise

    plt.figure()

    plt.scatter(a[:, 0], a[:, 1], c=b)

    plt.savefig('centers_1.png')

    plt.title('centers = 1')

    plt.show()

    The scatter plot displays the values of the generated data.

    Figure: Output of the KNN data plot
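
    As referenced above, here is a minimal sketch of actually training and scoring the classifier. It is an illustrative addition that assumes a two-class blob dataset, since classification needs at least two classes:

    # Generate a two-class dataset and split it into training and test sets
    a2, b2 = make_blobs(n_samples=100, n_features=2, centers=2, random_state=0)
    a_train, a_test, b_train, b_test = train_test_split(a2, b2, random_state=1)

    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(a_train, b_train)
    print(knn.score(a_test, b_test))   # classification accuracy on the test set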
    • Decision tree:

      It builds both classification and regression models in a tree structure, dividing the dataset into smaller and smaller subsets while the associated tree grows from the root (decision) node. The final output is a tree with decision nodes and leaf nodes: a decision node has two or more branches, while a leaf node holds a classification or decision. The topmost decision node, which corresponds to the best predictor, is called the root node. Decision trees can operate on both categorical and numerical data; a minimal sketch of fitting one follows the terminology list below. The common terminology of a decision tree is as follows:

      • Root node:

        Represents the entire population or sample; it is further divided into two or more homogeneous sets.

      • Splitting:

        Divide a node into two or more sub-nodes.

      • Decision node:

        A sub-node that splits into further sub-nodes.

      • Leaf node:

        A node that does not split further.

      • Pruning:

        Removing sub-nodes from a decision node.

      • Sub-Tree:

        Branch of a decision tree.

      • Parent and child node:

        A node that is divided into sub-nodes is called a parent node, and the sub-nodes are the children of that parent node.

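      As referenced above, here is a minimal sketch of fitting a decision tree classifier; it is an illustrative addition that borrows the built-in iris dataset used in the random forest example later in this section:

      from sklearn.datasets import load_iris
      from sklearn.model_selection import train_test_split
      from sklearn.tree import DecisionTreeClassifier

      iris = load_iris()
      X_train, X_test, y_train, y_test = train_test_split(
          iris.data, iris.target, random_state=1)

      tree = DecisionTreeClassifier(max_depth=3)   # limit depth to reduce overfitting
      tree.fit(X_train, y_train)
      print(tree.score(X_test, y_test))            # accuracy on the held-out test set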
    • Naïve Bayes:

      It is a simple and highly effective classifier that can increase the processing speed of machine learning models and predict faster. The algorithm is based on Bayes' theorem and is used to solve classification problems. It is generally applied to text classification, which involves high-dimensional training datasets. It is known as a probabilistic classifier because it predicts an object's class on the basis of probability. For instance, spam filtration, sentiment analysis, and article classification can be handled easily.

      Naïve Bayes equation: P(A|B) = P(B|A) P(A) / P(B)

      Where,

      P(A|B): Posterior probability of hypothesis A given the observed evidence B.

      P(B|A): Likelihood of the evidence B given that hypothesis A is true.

      P(A): Prior probability of the hypothesis, before the evidence is seen.

      P(B): Prior probability of the evidence.
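
      As a quick worked example with assumed numbers: suppose 20% of emails are spam (P(A) = 0.2), the word "free" appears in 60% of spam emails (P(B|A) = 0.6), and "free" appears in 25% of all emails (P(B) = 0.25). Then:

      P(spam | "free") = P(B|A) P(A) / P(B) = (0.6 × 0.2) / 0.25 = 0.48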

      There are three different types of distribution used in Naive Bayes, and an implementation is often named after its distribution:

    • Binomial Naive Bayes:

      It uses a binomial distribution.

    • Multinomial Naive Bayes:

      It uses a multinomial distribution.

    • Gaussian Naive Bayes:

      It uses a Gaussian distribution.

    When a dataset mixes data types across its input variables, a different type of data distribution may need to be selected for each variable; it is not compulsory to use all the distributions. This algorithm has proven accurate and useful for text classification tasks. For instance, a word document may be represented by binary, count, or frequency (tf-idf) input vectors, for which binomial, multinomial, or Gaussian probability distributions are used respectively.
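
    Here is a minimal sketch of a Gaussian Naive Bayes classifier in scikit-learn; it is an illustrative addition, again borrowing the built-in iris dataset:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, random_state=1)

    nb = GaussianNB()   # assumes each feature is normally distributed within each class
    nb.fit(X_train, y_train)
    print(nb.score(X_test, y_test))   # accuracy on the held-out test set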

    • Random forest:

      It is an ensemble learning method for classification and regression that constructs several decision trees at training time. By combining many trees, it also counteracts the tendency of a single decision tree to overfit its training set.

      Now, let's understand, with the help of an example, how to implement Random Forest in Python using a Jupyter notebook.

      Step 1: Import module and dataset

      from sklearn import datasets

      iris = datasets.load_iris()

      print(iris.target_names)

      print(iris.feature_names)

      print(iris.data[0:5])

      print(iris.target)

      To build a model in random forest, use the load_iris() function, which is built into sklearn. The dataset contains sepal (length and width) and petal (length and width) measurements for three classes of flower: setosa, versicolor, and virginica. Printing the target and feature names lets you make sure that you have the correct dataset. The code also prints the first five rows of the dataset, along with the target variable for the whole dataset.

      Step 2: Create a dataframe

      import pandas as pd

      data = pd.DataFrame({
          'sepal length': iris.data[:, 0],
          'sepal width': iris.data[:, 1],
          'petal length': iris.data[:, 2],
          'petal width': iris.data[:, 3],
          'species': iris.target
      })

      data.head()

      A DataFrame is a two-dimensional labelled data structure whose columns can hold potentially different types of data.

      Step 3: Split the dataset

      from sklearn.model_selection import train_test_split

      X = data[['sepal length', 'sepal width', 'petal length', 'petal width']]

      y = data['species']

      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

      The columns are split into the dependent variable (y) and the independent variables (X), and the rows are split into training and test sets, with 30% of the data held out for testing.

      Step 4: Train the model

      from sklearn.ensemble import RandomForestClassifier

      clf=RandomForestClassifier(n_estimators=100)

      clf.fit(X_train,y_train)

      y_pred=clf.predict(X_test)

      After splitting, train the model on the training set and generate predictions for the test dataset.

      Step 5: Check the accuracy

      from sklearn import metrics

      print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

      Output: 0.9777

      Step 6: Predict type of flower

      clf.predict([[3, 5, 4, 2]])  # sepal length = 3, sepal width = 5, petal length = 4, petal width = 2

      Step 7: Create a random forests model

      from sklearn.ensemble import RandomForestClassifier

      clf=RandomForestClassifier(n_estimators=100)

      clf.fit(X_train,y_train)  # the new model must be fitted before its feature importances exist

      Step 8: View the feature importance scores

      import pandas as pd

      feature_imp = pd.Series(clf.feature_importances_,
                              index=iris.feature_names).sort_values(ascending=False)
      feature_imp

      Step 9: Visualise the dataset

      import matplotlib.pyplot as plt

      import seaborn as sns

      %matplotlib inline

      # Creating a bar plot

      sns.barplot(x=feature_imp, y=feature_imp.index)

      # Add labels to your graph

      plt.xlabel('Feature Importance Score')

      plt.ylabel('Features')

      plt.title('Visualizing Important Features')

      plt.show()

      For the visualisation, matplotlib and seaborn are combined: seaborn is built on top of the matplotlib library and provides several customised themes and extra plot types.

      Figure: Output of the random forest classifier (feature importances)
      Applications of supervised learning

      Let's understand some common applications of supervised learning:

  • Bioinformatics:

    Nowadays, it is one of the most common applications of supervised learning, responsible for storing biological data such as fingerprints and iris textures. Today's smartphones can learn biological data and use it to provide security for the system; for example, Google Pixel, iPhone, Samsung, and OnePlus devices provide features like facial and fingerprint recognition.

  • Speech Recognition:

    This type of application is capable of identifying a voice, for example Google Assistant and Siri. Supervised learning algorithms help maintain security and communication between virtual assistants and customers.
  • Spam Detection:

    Spam refers to unauthorised computer-based messages that can be used to harm someone's personal system, and it can appear in any form, such as emails or mobile messages. Many spam emails pose as commercial emails but contain fake links designed to attack the system or lead to malware websites. To detect spam, a supervised learning algorithm can be trained on a dataset of emails labelled as spam or not spam; when new, unlabelled data is provided, the outcome can be calculated, reducing spam across several media.

  • Medical:

    In this field, supervised learning algorithms can predict whether or not a person has a disease. For example, after loading a dataset of medical reports into the model, the model trains itself and will predict whether a person is healthy or suffering from a disease.
