Adult Census Income Binary Classification Dataset (Visualize)

Adult Census Income Binary Classification Dataset (Visualize) (Income)

Experiment

Threshold and Evaluation Tables

Threshold Table

Scores

Threshold Table

Also, there's another metric over to the side called "AUC", which stands for "Area Under the Curve". For those of you who read our ROC post, you may remember that our goal was to find the curve that was closest to the top left. Mathematically, this is equivalent to finding the curve with the largest area underneath it. Let's take a look at a couple of different thresholds.

Threshold Table (T = .41)

Threshold Table (T = .61)

As you can imagine, this table is good for examining individual thresholds, but it quickly becomes time-consuming if you are trying to find the optimal threshold. Fortunately, there's another way: let's take a look at the "Evaluation" Table.

Evaluation Table

Discretized Results

Discretized Results (Augmented)

Evaluation Table

Threshold Results

The metrics in each row of the Threshold Results are calculated using a threshold equal to the value at the bottom of the bin defined in the "Score Bin" column. To clarify this, we've added a column called "Threshold" to showcase what's actually being calculated. To verify, let's compare the Threshold Table to the Threshold Results for a threshold of 0.
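To make the bin-to-threshold relationship concrete, here's a rough sketch (not Azure ML's internal code) of how the "Score Bin" rows relate to thresholds: each row's metrics are computed using the lower edge of its bin as the threshold. The scored probabilities below are hypothetical.

```python
# Hypothetical scored probabilities from a "Score Model" output.
scores = [0.03, 0.07, 0.09, 0.12, 0.41, 0.58, 0.61, 0.77, 0.93]

rows = []
for i in range(10):
    lower = round(0.1 * i, 1)            # bottom of the bin = threshold for this row
    upper = round(lower + 0.1, 1)
    in_bin = sum(lower <= s < upper for s in scores)
    rows.append({"Score Bin": f"[{lower}, {upper})", "Threshold": lower, "Count": in_bin})
```

Each entry in `rows` mirrors one line of the Threshold Results, with the "Threshold" column simply restating the bin's lower edge.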

Threshold Table (T = 0)

Threshold Results

Let's talk about what these columns represent. We've already discussed the "Accuracy", "Precision", "Recall" and "F1 Score" columns. "Negative Precision" and "Negative Recall" are similar to "Precision" and "Recall", except that they look for Negative values (Income = "<=50k") instead of Positive values (Income = ">50k"). Therefore, we're looking to maximize all six of these values.
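As a quick sketch of how these six metrics relate to one another, here they are computed from the four cells of a confusion matrix. The TP/FP/TN/FN counts are hypothetical, with ">50k" treated as the positive class.

```python
# Hypothetical confusion-matrix counts: ">50k" is the positive class.
tp, fp, tn, fn = 80, 20, 300, 40

accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (tp + fp)          # of predicted positives, how many were right
recall = tp / (tp + fn)             # of actual positives, how many we found
f1 = 2 * precision * recall / (precision + recall)
neg_precision = tn / (tn + fn)      # of predicted negatives, how many were right
neg_recall = tn / (tn + fp)         # of actual negatives, how many we found
```

Notice the symmetry: the "Negative" metrics are exactly the ordinary metrics with the two classes swapped.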

The "Fraction Above Threshold" column tells us what percentage of our records have Scored Probabilities greater than the Threshold value. Obviously, all of the records have a Scored Probability greater than 0. However, it is interesting to note that 50% of our values have Scored Probabilities between 0 and .1. This isn't surprising because, as we mentioned in a previous post, the Income column is an "Imbalanced Class". This means that the values within the column are not evenly distributed.
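The "Fraction Above Threshold" calculation itself is simple; here's a minimal sketch using hypothetical scored probabilities chosen so that half of them fall below .1, mimicking the imbalance described above.

```python
# Hypothetical scored probabilities; half sit below 0.1 (imbalanced class).
scores = [0.02, 0.04, 0.06, 0.08, 0.09, 0.35, 0.58, 0.61, 0.77, 0.93]

def fraction_above(scores, threshold):
    """Share of records whose scored probability exceeds the threshold."""
    return sum(s > threshold for s in scores) / len(scores)

print(fraction_above(scores, 0.0))   # 1.0 -> every record scores above 0
print(fraction_above(scores, 0.1))   # 0.5 -> drops sharply for an imbalanced class
```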

The "Cumulative AUC" column is a bit more complicated. Earlier, we talked about "AUC" being the area under the ROC curve. Let's take a look at how a ROC curve is actually calculated.

ROC Curve

The ROC curve is calculated by selecting many different threshold values, calculating the True Positive Rate and False Positive Rate at each one, then plotting all of the points as a curve. In the above illustration, we show how you might use different threshold values to define points on the red curve. It's important to note that the threshold lines above were not calculated; they were simply drawn to illustrate the methodology. With this in mind, it's much easier to understand "Cumulative AUC".
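The methodology above can be sketched in a few lines: sweep a set of thresholds, compute the True Positive Rate and False Positive Rate at each, and collect the (FPR, TPR) points that trace out the ROC curve. The labels and scores below are hypothetical.

```python
# Hypothetical labels (1 = ">50k", the positive class) and scored probabilities.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

def roc_point(labels, scores, t):
    """One point on the ROC curve: (FPR, TPR) at threshold t."""
    tp = sum(l == 1 and s > t for l, s in zip(labels, scores))
    fp = sum(l == 0 and s > t for l, s in zip(labels, scores))
    fn = sum(l == 1 and s <= t for l, s in zip(labels, scores))
    tn = sum(l == 0 and s <= t for l, s in zip(labels, scores))
    return fp / (fp + tn), tp / (tp + fn)

# Sweep thresholds from 1.0 down to 0.0; the curve runs from (0, 0) to (1, 1).
points = [roc_point(labels, scores, t / 10) for t in range(10, -1, -1)]
```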

Simply put, "Cumulative AUC" is the area under the curve up to a particular threshold value.

Cumulative AUC (T = .5)
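One way to make this concrete is a hedged sketch (our interpretation, not Azure ML's internal code): accumulate trapezoidal area under the ROC points, keeping only the points whose thresholds are at or above a target value. The (threshold, FPR, TPR) triples are hypothetical.

```python
# Hypothetical ROC points as (threshold, FPR, TPR), sorted by descending threshold.
points = [
    (1.0, 0.0, 0.0),
    (0.8, 0.1, 0.4),
    (0.6, 0.2, 0.7),
    (0.4, 0.4, 0.9),
    (0.2, 0.7, 1.0),
    (0.0, 1.0, 1.0),
]

def cumulative_auc(points, t):
    """Trapezoidal area under the ROC points with threshold >= t."""
    kept = [(fpr, tpr) for thr, fpr, tpr in points if thr >= t]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(kept, kept[1:]):
        area += (x1 - x0) * (y0 + y1) / 2   # trapezoid between neighboring points
    return area
```

At `t = 0` this reduces to the ordinary AUC, since every point on the curve is included.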

This opens up another interesting option. In the previous posts (ROC and Precision/Recall and Lift), we evaluated multiple models by comparing them in pairs and then comparing the winners. Using the Evaluation Tables, we could compare all of these values simultaneously in Excel. For instance, we could use Cumulative AUC to compare all four models at once.

Cumulative AUC Comparison

Using this visualization, we can see that the Boosted Decision Tree is the best algorithm for thresholds greater than .3 (30%) or less than .1 (10%). If we wanted to utilize a threshold between these values, we would be better off using the Averaged Perceptron or Logistic Regression algorithms. Hopefully, we've sparked your imagination to explore all the capabilities that Azure ML has to offer. It truly is a great tool that's opening the world of Data Science beyond just the hands of Data Scientists. Thanks for reading. We hope you found this informative.

Brad Llewellyn

Data Scientist

Valorem

@BreakingBI

www.linkedin.com/in/bradllewellyn

llewellyn.wb@gmail.com


Awesome breakdown of that confusing "Evaluation Table". Thanks so much!

Once we know our ideal threshold, how do we tell Azure to use this threshold for classification?

The output from "Score Model" contains a "Scored Probabilities" column. We could easily use a custom SQL/R/Python script to add a new field to our dataset using an IF or CASE statement. However, this new field may not work with "Evaluate Model" anymore. Alternatively, you could always use the threshold slider in the "Evaluate Model" visualization to tinker with the thresholds.
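A minimal sketch of the idea in this reply: take the "Scored Probabilities" column from "Score Model" and derive a label column at a custom threshold. The column name for the new field and the .41 threshold are hypothetical.

```python
import pandas as pd

# Hypothetical "Score Model" output with a "Scored Probabilities" column.
df = pd.DataFrame({"Scored Probabilities": [0.05, 0.38, 0.41, 0.73, 0.92]})
threshold = 0.41  # hypothetical custom threshold found via the Evaluation Table

# The IF/CASE statement from the reply, expressed as a vectorized comparison.
df["Scored Labels (Custom)"] = (df["Scored Probabilities"] > threshold).map(
    {True: ">50K", False: "<=50K"}
)
```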

If you evaluate a model's performance and determine that it works best at a threshold that is different from the default of .5, how can you train the model with the different threshold to achieve more optimal results?
