Understanding Cyberattacks Using a Confusion Matrix

What is a Cyber Attack?

A cyber attack is an assault launched by cybercriminals using one or more computers against one or more computers or networks. A cyber attack can maliciously disable computers, steal data, or use a breached computer as a launch point for other attacks. Cybercriminals use a variety of methods to launch a cyber attack, including malware, phishing, ransomware, and denial of service.

Some Cyber attacks in the News:-

SolarWinds Sunburst Attack:-

The world is now facing what seems to be a 5th generation cyber attack: a sophisticated, multi-vector attack with clear characteristics of a cyber pandemic. Named Sunburst by researchers, it is considered one of the most sophisticated and severe attacks ever seen. The attack has been reported to impact major US government offices as well as many private sector organizations.

This series of attacks was made possible when hackers were able to embed a backdoor into SolarWinds software updates. Over 18,000 companies and government offices downloaded what seemed to be a regular software update onto their computers, but it was actually a Trojan horse. By exploiting the common IT practice of installing software updates, the attackers used the backdoor to compromise each organization's assets, enabling them to spy on the organization and access its data.

Ransomware Attacks:-

Ransomware has been resurging. Small local and state government agencies, mainly in the southeastern part of the U.S., have been victimized. Digital transformation is eroding traditional network perimeters through the adoption of cloud computing, cloud-based subscription services, and the ubiquity of mobile devices. This expansion of attack vectors means more ways to attack an organization.

In Q3 2020, Check Point Research saw a 50% increase in the daily average of ransomware attacks compared to the first half of the year. Organizations worldwide were under a massive wave of ransomware attacks, with healthcare as the most targeted industry. As these attacks continue to grow in both frequency and intensity, their impact on business has grown as well. The top ransomware types were Maze and Ryuk.

What is a Confusion Matrix:-

A confusion matrix is a matrix used to measure the performance of a classification model on a given set of test data. It can only be computed if the true values for the test data are known. The matrix itself is easy to understand, but the related terminology may be confusing. Since it shows the errors in the model's performance in the form of a matrix, it is also known as an error matrix. Some features of the confusion matrix are given below:

For a classifier with 2 prediction classes, the matrix is a 2×2 table; for 3 classes, it is a 3×3 table, and so on.
The matrix has two dimensions, predicted values and actual values, along with the total number of predictions.
Predicted values are the values predicted by the model, and actual values are the true values for the given observations.
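It looks like the table below:

                Predicted: Yes     Predicted: No
Actual: Yes     True Positive      False Negative
Actual: No      False Positive     True Negative

The table has the following cases: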
True Negative: The model has predicted No, and the actual value was also No.
True Positive: The model has predicted Yes, and the actual value was also Yes.
False Negative: The model has predicted No, but the actual value was Yes. It is also called a Type-II error (a Type-II error occurs when a researcher fails to reject a null hypothesis that is actually false; the researcher concludes there is no significant effect when there really is one).
False Positive: The model has predicted Yes, but the actual value was No. It is also called a Type-I error (a Type-I error occurs when a researcher incorrectly rejects a true null hypothesis; the findings are reported as significant when in fact they occurred by chance).
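
As a rough illustration of the four cases, the sketch below counts them from a list of actual labels and a list of predicted labels. The function name confusion_matrix, the "Yes"/"No" labels, and the toy data are all hypothetical and chosen only for this example; rows are taken to be actual classes and columns predicted classes, which is one common convention rather than the only one.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return an N x N table: rows are actual classes, columns are predicted classes."""
    # Count how often each (actual, predicted) pair occurs.
    counts = Counter(zip(actual, predicted))
    # Counter returns 0 for pairs that never occur.
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical toy data: "Yes" is the positive class, "No" the negative class.
actual = ["Yes", "Yes", "No", "No", "Yes", "No", "No", "Yes"]
predicted = ["Yes", "No", "No", "Yes", "Yes", "No", "No", "Yes"]

matrix = confusion_matrix(actual, predicted, labels=["Yes", "No"])
(tp, fn), (fp, tn) = matrix

print(matrix)  # [[3, 1], [1, 3]]
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=3 FN=1 FP=1 TN=3
```

Passing three labels instead of two would produce a 3×3 table in the same way.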

During a Cyber Attack:-

For example, suppose we have a dataset containing different types of attacks.

Data normalization:- Before training the model, we used min-max normalization to normalize our data. Normalization rescales the data so that it fits a specific interval, which can also improve the convergence speed and accuracy of the model. The transformation function is:

x'(ij) = (x(ij) - min(x(i))) / (max(x(i)) - min(x(i)))

where x(ij) is the value of the jth record of the ith feature, and min(x(i)) and max(x(i)) are the smallest and largest values of the ith feature. After normalization, all values lie in the range [0, 1].
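
A minimal sketch of min-max normalization for a single feature is shown below. The function name min_max_normalize and the sample values are hypothetical, and mapping a constant feature to 0.0 is an assumption of this sketch, since the source does not say how that case is handled.

```python
def min_max_normalize(feature):
    """Scale one feature's values into [0, 1] using min-max normalization."""
    lo, hi = min(feature), max(feature)
    if hi == lo:
        # Constant feature: the formula would divide by zero, so map every
        # value to 0.0 (an assumption of this sketch, not from the source).
        return [0.0 for _ in feature]
    return [(x - lo) / (hi - lo) for x in feature]


# Hypothetical feature column, for example connection durations in seconds.
durations = [0, 5, 10, 20]
normalized = min_max_normalize(durations)

print(normalized)  # [0.0, 0.25, 0.5, 1.0]
```

In a real dataset the same function would be applied to each feature separately, so that every column ends up in the same [0, 1] range.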

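Evaluating Metrics:- The final classification results are divided into four states: TP (true positive), FP (false positive), TN (true negative), and FN (false negative). These are also the four basic metrics of the confusion matrix. Here the normal class is treated as the positive class: TP is the number of normal samples that are correctly classified in the normal class, FP is the number of attack samples that are incorrectly classified in the normal class, TN is the number of attack samples that are classified correctly, and FN is the number of normal samples that are incorrectly classified in the attack class. To evaluate the performance of the proposed method, the accuracy, precision, detection rate, false alarm rate, and F-measure are defined from these four states. In their standard form:

Accuracy = (TP + TN) / (TP + FP + TN + FN)
Precision = TP / (TP + FP)
Detection rate = TP / (TP + FN)
False alarm rate = FP / (FP + TN)
F-measure = 2 × Precision × Detection rate / (Precision + Detection rate)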
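
The five metrics named above can be computed from the four counts as in the sketch below. It uses the standard textbook definitions, which are assumed here to match the source's; the function name evaluate and the counts passed to it are hypothetical, and returning 0.0 when a denominator is zero is a choice made only for this example.

```python
def evaluate(tp, fp, tn, fn):
    """Compute common evaluation metrics from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a denominator is zero
        # (an assumption of this sketch).
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    detection_rate = ratio(tp, tp + fn)  # also known as recall
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "detection_rate": detection_rate,
        "false_alarm_rate": ratio(fp, fp + tn),
        "f_measure": ratio(2 * precision * detection_rate,
                           precision + detection_rate),
    }


# Hypothetical counts, for illustration only.
metrics = evaluate(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A high accuracy alone can be misleading when one class dominates the dataset, which is why the detection rate and false alarm rate are usually reported alongside it.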

Reference:- ResearchGate