Classification in Machine Learning

5 min readAug 22, 2021

What is classification ?

A classification algorithm is a Supervised Learning approach that uses training data to identify the category of fresh observations.A program learns from a dataset or observations and then classifies fresh observations into one of many categories Such as, Yes or No, 0 or 1, Spam or Not Spam. Classification algorithm is a Supervised learning technique, hence it takes labeled input data, which means it contains input with the corresponding output.

The main goal of the Classification algorithm is to identify the category of a given dataset, and these algorithms are mainly used to predict the output for the categorical data.

There are two types of Classifications:

Binary Classifier: If the classification problem has only two possible output
Examples: SPAM or NOT SPAM, CAT or DOG, etc.
Multi-class Classifier: If a classification problem has more than two outputs
Example: Classifications of types of dog breeds

There are two types of learners:

Lazy Learners: Lazy Learner firstly stores the training dataset and wait until it receives the test dataset. In Lazy learner case, classification is done on the basis of the most related data stored in the training dataset. It takes less time in training but more time for predictions.
Example: K-NN algorithm
Eager Learners: Eager Learners develop a classification model based on a training dataset before receiving a test dataset. Eager Learner takes more time in learning, and less time in prediction.
Example: Decision Trees, Naïve Bayes, ANN.

How to evaluate classification models:

1 . Cross-Entropy Loss: (ylog(p)+(1?y)log(1?p)) where y is the actual output and p is the predicted output

It is used for evaluating the performance of a classifier, whose output is a probability value between the 0 and 1. For a good binary Classification model, the value of log loss should be near to 0. The lower log loss represents the higher accuracy of the model.

2 . Confusion Matrix:

The confusion matrix provides us a matrix/table as output and describes the performance of the model. It is also known as the error matrix

sentiment analysis in classification

A sentiment analysis task is usually modeled as a classification problem, whereby a classifier is fed a text and returns a category, e.g. positive, negative

The process of detecting positive or negative sentiment in text is known as sentiment analysis. Businesses frequently utilize it to detect sentiment in social data, assess brand reputation, and gain a better understanding of their customers.

In social media monitoring, sentiment analysis is used to obtain insights into how consumers feel about specific subjects and to discover important concerns in real time before they spiral out of hand.