Let’s start with a quick overview of the confusion matrix. The confusion matrix is a table that is often used to describe the performance of a classification model. It summarizes the results of predictions made by the model, comparing them to the actual outcomes. Let’s consider a simple example of a binary classification problem, where we have two classes: positive (1) and negative (0). The confusion matrix for this problem may look like this:
| | Predicted Positive (1) | Predicted Negative (0) |
|---|---|---|
| Actual Positive (1) | True Positive (TP) | False Negative (FN) |
| Actual Negative (0) | False Positive (FP) | True Negative (TN) |
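To make these counts concrete, here is a minimal Python sketch that tallies the four cells from a list of labels. The `y_true` and `y_pred` values are made up for illustration, chosen so the counts line up with the percentage examples later in this section; in practice, scikit-learn's `confusion_matrix` would give you the same counts from real model output.

```python
# Hypothetical labels, chosen so the counts match the examples below.
y_true = [1] * 80 + [0] * 120                        # 80 actual positives, 120 actual negatives
y_pred = [1] * 72 + [0] * 8 + [1] * 18 + [0] * 102   # the model's predictions

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # predicted 1, actually 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # predicted 0, actually 1
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # predicted 1, actually 0
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # predicted 0, actually 0

print(tp, fn, fp, tn)  # 72 8 18 102
```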
To measure the performance of the model, we can calculate several metrics based on the confusion matrix:
Accuracy: The proportion of correct predictions (both true positives and true negatives) out of the total predictions.
\(Accuracy = \frac{TP + TN}{TP + TN + FP + FN}\)
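Using the hypothetical counts from the sketch above:

```python
tp, fn, fp, tn = 72, 8, 18, 102  # hypothetical counts from the sketch above
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.87 -> 87% of all predictions are correct
```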
Precision: The proportion of true positive predictions out of all positive predictions made by the model.
\(Precision = \frac{TP}{TP + FP} = P(\text{Actually Positive} \mid \text{Predicted as Positive})\)
For instance, if the precision of a model is 0.8, it means that 80% of the instances predicted as positive are actually positive.
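With the same hypothetical counts, precision comes out to exactly 0.8:

```python
tp, fp = 72, 18  # hypothetical counts
precision = tp / (tp + fp)
print(precision)  # 0.8 -> 80% of predicted positives are actually positive
```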
Recall (Sensitivity): The proportion of true positive predictions out of all actual positive instances.
\(Recall = \frac{TP}{TP + FN} = P(\text{Predicted as Positive} \mid \text{Actually Positive})\)
For instance, if the recall of a model is 0.9, it means that 90% of the actual positive instances are correctly predicted as positive. You can think of recall as the model’s “coverage”: it captures 90% of the actual positive instances.
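Again with the hypothetical counts, recall works out to 0.9:

```python
tp, fn = 72, 8  # hypothetical counts
recall = tp / (tp + fn)
print(recall)  # 0.9 -> the model "covers" 90% of the actual positives
```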
F1 Score: The harmonic mean of precision and recall, which balances the two metrics.
\(F1 = \frac{2}{\frac{1}{Precision} + \frac{1}{Recall}} = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}\)
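For the example precision of 0.8 and recall of 0.9 above, the F1 score is about 0.847; as a harmonic mean, it sits between the two values but is pulled toward the lower one:

```python
precision, recall = 0.8, 0.9  # values from the examples above
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.847
```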
Specificity: The proportion of true negative predictions out of all actual negative instances.
\(Specificity = \frac{TN}{TN + FP} = P(\text{Predicted as Negative} \mid \text{Actually Negative})\)
For instance, if the specificity of a model is 0.85, it means that 85% of the actual negative instances are correctly predicted as negative.
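And with the same hypothetical counts, specificity is 0.85:

```python
tn, fp = 102, 18  # hypothetical counts
specificity = tn / (tn + fp)
print(specificity)  # 0.85 -> 85% of actual negatives correctly predicted
```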