Assignment 7
a. What is/are the requirement(s) of LDA?
A few requirements of LDA are: 1. Homogeneity of variance (homoscedasticity) across the classes. 2. The features are normally distributed. 3. There is a linear relationship between the features and the dependent variable.
b. How is LDA different from Logistic Regression?
Linear Discriminant Analysis (LDA) is similar to Principal Component Analysis (PCA) in that both transform the original features into a new space. PCA transforms the original dataset into a set of orthogonal features called principal components by identifying the directions of maximum variance in the data, without using the class labels. LDA, in contrast, directly attempts to find a space where there is maximum separation between the classes. LDA is also similar to logistic regression in that both are classification techniques. However, the main difference between them may be that logistic regression models the probability of the outcome directly, estimating how much each feature contributes to it without assuming any distribution for the features, whereas LDA models the features themselves and separates the classes based on the combination of all the features.
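To make "maximum separation between the classes" concrete, here is a minimal sketch of the two-class case, using Fisher's formulation of the discriminant direction, w = S_w^(-1) (mu1 - mu0), where S_w is the pooled within-class scatter matrix. The two-feature toy data and the function names are invented for illustration only; they are not part of the assignment.

```python
# Sketch of Fisher's two-class linear discriminant in 2 dimensions.
# Toy data and helper names are hypothetical, chosen only for illustration.

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, mu):
    # Entries of the 2x2 scatter matrix: sums of products of deviations.
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fisher_direction(class0, class1):
    mu0, mu1 = mean(class0), mean(class1)
    a0, b0, c0 = scatter(class0, mu0)
    a1, b1, c1 = scatter(class1, mu1)
    # Pooled within-class scatter S_w = [[a, b], [b, c]]
    a, b, c = a0 + a1, b0 + b1, c0 + c1
    det = a * c - b * b
    dx, dy = mu1[0] - mu0[0], mu1[1] - mu0[1]
    # w = S_w^{-1} (mu1 - mu0), using the closed-form 2x2 inverse
    return ((c * dx - b * dy) / det, (a * dy - b * dx) / det)

def project(points, w):
    # Each point is reduced to a single score along the direction w.
    return [p[0] * w[0] + p[1] * w[1] for p in points]

class0 = [(1, 2), (2, 3), (3, 3), (2, 1)]
class1 = [(6, 5), (7, 8), (8, 6), (7, 7)]

w = fisher_direction(class0, class1)
proj0 = project(class0, w)
proj1 = project(class1, w)
print("direction:", w)
print("class 0 scores:", proj0)
print("class 1 scores:", proj1)
```

On this toy data, every class 1 score is larger than every class 0 score, so a single cutoff on the combined feature separates the two groups.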
c. What is ROC?
A ROC (Receiver Operating Characteristic) curve is a concise visual summary of the confusion matrices obtained at many classification thresholds, and it helps identify the optimal threshold value for classification. For binary classification, the ROC curve is a 2-dimensional plot. The y-axis of the graph shows the sensitivity and the x-axis shows the complement of specificity (1 - Specificity).
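As a rough illustration of how the points on a ROC curve come about, the sketch below sweeps a threshold over predicted scores and records one (1 - Specificity, Sensitivity) pair per threshold. The scores, labels, and the function name `roc_points` are made up for this example.

```python
# Sketch: build ROC points by sweeping the threshold over the observed scores.
# Scores and labels below are hypothetical; label 1 = positive, 0 = negative.

def roc_points(scores, labels):
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        # Predict positive when the score is at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))  # (x, y) on the ROC plot
    return points

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

roc = roc_points(scores, labels)
for fpr, tpr in roc:
    print(f"1 - specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")
```

Each threshold corresponds to one confusion matrix, and the curve is simply those matrices summarized as points running from (0, 0) to (1, 1).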
d. What is sensitivity and specificity? Which is more important in your opinion?
If participants who have a medical (or other) condition are considered positive and participants who do not have the condition are considered negative, sensitivity refers to a model’s ability to correctly classify participants with the condition as positive. Another term for sensitivity is True Positive Rate.
Specificity, in turn, refers to the model’s ability to correctly classify participants without the condition as negative. Another term for specificity is True Negative Rate. To plot the ROC curve, we use the complement of specificity, which is the False Positive Rate (1 - Specificity).
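The two definitions can be written directly in terms of confusion-matrix counts. The following is a small sketch; the counts (80, 20, 90, 10) are hypothetical and only serve to show the arithmetic.

```python
# Sketch: sensitivity and specificity from confusion-matrix counts.
# tp/fn/tn/fp values below are hypothetical.

def sensitivity_specificity(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)  # True Positive Rate: positives correctly identified
    specificity = tn / (tn + fp)  # True Negative Rate: negatives correctly identified
    return sensitivity, specificity

sens, spec = sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)
false_positive_rate = 1 - spec  # the x-axis value on a ROC curve

print(f"sensitivity = {sens:.2f}")
print(f"specificity = {spec:.2f}")
print(f"false positive rate = {false_positive_rate:.2f}")
```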
e. From the following chart, for the purpose of prediction, which is more critical?
Ideally, we want a model that correctly classifies all the positive cases as positive and all the other cases as negative. Such a model would have 100% sensitivity and 100% specificity. However, hoping to find a model and threshold like this is not realistic. Generally, the optimal threshold value is the one whose point on the ROC curve combines the highest sensitivity with the smallest 1 - Specificity (the point closest to the top-left corner), since this threshold yields the most true positives with the fewest false positives. There is usually a trade-off between sensitivity and specificity, and in such scenarios we choose the threshold value depending on our need and urgency. If correctly identifying positives is the most critical task, we choose the threshold with the maximum sensitivity. If correctly identifying negatives is more important, we put more emphasis on the threshold with the maximum specificity. Therefore, just by looking at the chart, we cannot say that one is more important than the other.
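One common way to formalize "highest sensitivity with the smallest 1 - Specificity" is Youden's J statistic (sensitivity + specificity - 1); it is offered here as one possible rule, not the only one. The sketch below also shows a sensitivity-first choice under a minimum acceptable specificity. The candidate thresholds and their rates are hypothetical, as are the function names.

```python
# Sketch: two ways to pick a threshold from (threshold, sensitivity, specificity) rows.
# All numbers are hypothetical and do not come from the chart in the assignment.

candidates = [
    (0.8, 0.60, 0.95),
    (0.6, 0.80, 0.85),
    (0.4, 0.90, 0.70),
    (0.2, 0.98, 0.40),
]

def best_by_youden(rows):
    # Balanced choice: maximize J = sensitivity + specificity - 1.
    return max(rows, key=lambda r: r[1] + r[2] - 1)

def best_sensitivity(rows, min_specificity):
    # Positives matter most: maximize sensitivity, but keep specificity acceptable.
    allowed = [r for r in rows if r[2] >= min_specificity]
    return max(allowed, key=lambda r: r[1])

balanced = best_by_youden(candidates)
sensitive = best_sensitivity(candidates, min_specificity=0.5)

print("balanced threshold:", balanced[0])
print("sensitivity-first threshold:", sensitive[0])
```

The two rules select different thresholds on the same data, which illustrates why the choice depends on which kind of error is more costly.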
3. Calculate the prediction error from the following:
We calculate prediction accuracy as the ratio of the number of observations the model classified correctly (9725) to the total number of observations (10000). This model has 97.25% accuracy, which implies that the error rate is 2.75%.
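The arithmetic can be checked with a few lines of code, using the two counts given in the answer (9725 correct out of 10000); the variable names are only illustrative.

```python
# Sketch: accuracy and error rate from the counts stated in the answer.
correct = 9725
total = 10000

incorrect = total - correct      # observations the model got wrong
accuracy = correct / total
error_rate = incorrect / total   # equivalently, 1 - accuracy

print(f"accuracy = {accuracy:.2%}")
print(f"error rate = {error_rate:.2%}")
```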