In all of these posts, we used a simple contingency table to determine the accuracy of the model. However, accuracy is only one of many ways to measure the "goodness" of a model. Now, we need to expand our evaluation to include the Evaluate Model module. Specifically, we'll be looking at the ROC (Receiver Operating Characteristic) tab. Let's start by refreshing our memory on the data set.
|Adult Census Income Binary Classification Dataset (Visualize)|
|Adult Census Income Binary Classification Dataset (Visualize) (Income)|
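Before we move on, a quick aside on why accuracy alone isn't enough. The short sketch below uses plain Python with scikit-learn and a small, hypothetical set of labels and predictions (it is not part of the Azure ML experiment) to show that the same contingency table also yields measures like precision and recall, each of which captures a different aspect of the model's behavior than accuracy does.

```python
# Hypothetical labels and predictions for a tiny binary classification sample;
# the real experiment uses the Adult Census Income dataset inside Azure ML Studio.
from sklearn.metrics import accuracy_score, precision_score, recall_score, confusion_matrix

y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))       # the "contingency table" we used before
print("Accuracy: ", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
```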
Utilizing some of the techniques we learned in the previous posts, we'll start by using the "Tune Model Hyperparameters" module to select the best set of parameters for each of the four models we're considering.
|Experiment (Tune Model Hyperparameters)|
|Tune Model Hyperparameters|
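For readers who prefer to see this step in code, here's a rough analogue of what a parameter sweep does, sketched with scikit-learn's GridSearchCV on synthetic data. The model, parameter grid, and scoring choice here are hypothetical illustrations, not what the Azure ML module actually sweeps.

```python
# A rough code-world analogue of "Tune Model Hyperparameters" (illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

param_grid = {"C": [0.01, 0.1, 1, 10]}
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="roc_auc",   # select the parameter set with the best AUC
    cv=5,                # 5-fold cross-validation, similar in spirit to the module's sweep
)
search.fit(X, y)
print(search.best_params_)
```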
At this point, we have two options. For simplicity's sake, we could train the models using the entire data set, then score those same records. However, that's considered bad practice because the model is evaluated on the same records it was trained on, which hides overfitting and inflates the results. So, let's use the "Split Data" module to train our models with 70% of the dataset and use the remaining 30% for our evaluation.
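In code, that split looks roughly like the sketch below. It uses scikit-learn's train_test_split on synthetic data, so the numbers are illustrative; in the actual experiment, the "Split Data", "Score Model", and "Evaluate Model" modules handle these steps inside Azure ML Studio.

```python
# A sketch of the 70/30 split described above, on synthetic data (illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# 70% for training, 30% held out for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]   # scored probabilities, like "Score Model"
print("Held-out AUC:", roc_auc_score(y_test, scores))
```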
|Experiment (Score Model)|
|Experiment (Evaluate Model)|
|ROC Experiment View|
As an added note, there is a grey diagonal line that runs across this chart. That's the "Random Guess" line. It represents a model that guesses at random, like flipping a coin: at every point along it, the true positive rate equals the false positive rate. If our model's curve dips below that line, the model is performing worse than random guessing, and we should seriously consider a different model. Interestingly, if the curve sits consistently well below the line, we can simply swap our predictions (True becomes False, False becomes True) to get a model that performs better than chance.
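To make the chart concrete, here's a small sketch of plotting an ROC curve with that "Random Guess" diagonal, using matplotlib and scikit-learn on the same kind of synthetic split as above. This is only an illustration of the chart the Evaluate Model module draws for us, not its implementation.

```python
# Sketch of an ROC curve with the "random guess" diagonal (illustrative only).
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
scores = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_test)[:, 1]

fpr, tpr, thresholds = roc_curve(y_test, scores)
plt.plot(fpr, tpr, label=f"Model (AUC = {auc(fpr, tpr):.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", color="grey", label="Random Guess")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
```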
|Threshold and Evaluation Tables|
|ROC Curve 2|
|ROC Curve (Final)|