Confusion Matrix and Cyber Security

A confusion matrix summarizes the prediction results of a classification problem. Counts of correct and incorrect predictions are laid out in a table, broken down by each class.

Definition of the Terms:
True Positive: You predicted positive and it’s true.

True Negative: You predicted negative and it’s true.

False Positive: You predicted positive and it’s false.

False Negative: You predicted negative and it’s false.

For example, let’s take a scenario.

We have a total of 10 vehicles, each either a car or a bus, and our model predicts whether each one is a car or not.

Actual values = [‘bus’, ‘car’, ‘bus’, ‘car’, ‘bus’, ‘bus’, ‘car’, ‘bus’, ‘car’, ‘bus’]
Predicted values = [‘bus’, ‘bus’, ‘bus’, ‘car’, ‘bus’, ‘bus’, ‘car’, ‘car’, ‘car’, ‘car’]
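As a quick sanity check, the four confusion-matrix cells can be counted directly from these two lists in plain Python, treating ‘car’ as the positive class:

```python
# Count the four confusion-matrix cells from the example lists,
# treating 'car' as the positive class.
actual    = ['bus', 'car', 'bus', 'car', 'bus', 'bus', 'car', 'bus', 'car', 'bus']
predicted = ['bus', 'bus', 'bus', 'car', 'bus', 'bus', 'car', 'car', 'car', 'car']

tp = sum(a == 'car' and p == 'car' for a, p in zip(actual, predicted))  # car predicted as car
tn = sum(a == 'bus' and p == 'bus' for a, p in zip(actual, predicted))  # bus predicted as bus
fp = sum(a == 'bus' and p == 'car' for a, p in zip(actual, predicted))  # bus predicted as car
fn = sum(a == 'car' and p == 'bus' for a, p in zip(actual, predicted))  # car predicted as bus

print(tp, tn, fp, fn)  # 3 4 2 1
```

So for this example TP = 3, TN = 4, FP = 2 and FN = 1, which is what the metrics below are computed from.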

Applying the Terms to the Example:
True Positive: You predicted positive and it’s true. You predicted that it is a car, and it actually is.

True Negative: You predicted negative and it’s true. You predicted that it is not a car, and it actually is not.

False Positive (Type 1 Error): You predicted positive and it’s false. You predicted that it is a car, but it actually is not.

False Negative (Type 2 Error): You predicted negative and it’s false. You predicted that it is not a car, but it actually is.

From the confusion matrix, the following metrics can be computed:

1. ACCURACY:

Accuracy is the number of correctly (True) predicted results out of the total.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

= (3 + 4) / 10 = 0.7

Accuracy is a good measure when TP and TN matter most and the dataset is balanced, because then the score is not biased by the class distribution. In real-life classification problems, however, imbalanced class distributions are common.
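Plugging the counts from the car/bus example (TP = 3, TN = 4, FP = 2, FN = 1) into the formula, a minimal sketch in plain Python:

```python
# Accuracy from the car/bus example counts.
tp, tn, fp, fn = 3, 4, 2, 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.7
```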

2. PRECISION:

Precision is defined as the ratio of correctly classified positive predictions to the total number of predicted positives. In other words: out of all the predicted positive classes, how many did we predict correctly? Precision should be high.

Out of the total predicted positive values, how many were actually positive:

Precision = TP / (TP + FP) = 3 / 5 = 0.6

3. RECALL:

Out of the total actual positive values, how many were correctly predicted as positive:

Recall = TP / (TP + FN) = 3 / 4 = 0.75

Based on the problem statement: whenever a false positive has the greater impact, optimize for precision; whenever a false negative matters more, optimize for recall.
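Both metrics can be checked with the same example counts (TP = 3, FP = 2, FN = 1) in plain Python:

```python
# Precision and recall from the car/bus example counts.
tp, fp, fn = 3, 2, 1

precision = tp / (tp + fp)  # of everything predicted 'car', how much really was a car
recall    = tp / (tp + fn)  # of all actual cars, how many we caught

print(precision, recall)  # 0.6 0.75
```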

4. F-BETA SCORE:

In some use cases both precision and recall are important, and even when one of them plays the dominant role, combining the two gives a more complete picture. The F-beta score does this:

F_beta = (1 + beta²) × (Precision × Recall) / (beta² × Precision + Recall)

With beta = 1 (the F1 score) precision and recall are weighted equally; beta < 1 favors precision, and beta > 1 favors recall.
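A small sketch of the weighted combination of precision and recall (the function name `f_beta` is just illustrative):

```python
# F-beta score: beta > 1 weights recall higher, beta < 1 weights precision higher.
def f_beta(precision, recall, beta):
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# With the example's precision (0.6) and recall (0.75), beta = 1 gives the F1 score (~0.667).
f1 = f_beta(0.6, 0.75, beta=1)
```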

Contribution of Confusion Matrix in Cyber Security:

Cybersecurity is the practice of defending computers, servers, mobile devices, electronic systems, networks, and data from malicious attacks. It’s also known as information technology security or electronic information security. The term applies in a variety of contexts, from business to mobile computing, and can be divided into a few common categories.

· Network security

· Application security

· Information security

· Operational security

· Disaster recovery and business continuity

· End-user education

The Detection of Attack and Normal Patterns Can Be Generalized as Follows:

True Positive (TP): The number of attack records correctly detected as attacks.

True Negative (TN): The number of normal records correctly detected as normal.

False Positive (FP): The number of normal records wrongly detected as attacks (false alarms).

False Negative (FN): The number of attack records wrongly detected as normal (missed attacks).

Comparison of detection rate: The Detection Rate (DR) is the proportion of actual attacks that are detected, given by DR = TP / (TP + FN).

Comparison of False Alarm Rate: The False Alarm Rate (FAR) is the proportion of normal data falsely detected as attack behavior, given by FAR = FP / (FP + TN).
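With hypothetical counts for an intrusion-detection system (the numbers below are made up purely for illustration), both rates follow directly from the confusion-matrix cells:

```python
# Hypothetical IDS confusion-matrix counts (illustrative only, not from a real system).
tp, tn, fp, fn = 90, 950, 50, 10

dr  = tp / (tp + fn)   # Detection Rate: fraction of real attacks that were detected
far = fp / (fp + tn)   # False Alarm Rate: fraction of normal traffic flagged as attack
print(dr, far)  # 0.9 0.05
```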

A confusion matrix contains information on the actual and predicted classifications made by a classifier, and the performance of a cyber-attack detection system is commonly evaluated using the data in this matrix.

⭐Hope you enjoyed the article.⭐

Keep Learning !! Keep Sharing !!

ARTH-LEARNER

suyog shinde
