Monday, January 2, 2017

Azure Machine Learning: Classification Using Two-Class Support Vector Machine

Today, we're going to continue looking at Sample 3: Cross Validation for Binary Classification Adult Dataset in Azure Machine Learning.  In the three previous posts, we looked at the Two-Class Averaged Perceptron, Two-Class Boosted Decision Tree and Two-Class Logistic Regression algorithms.  The final algorithm in the experiment is Two-Class Support Vector Machine.  Let's start by refreshing our memory on the data set.
Adult Census Income Binary Classification Dataset (Visualize)
Adult Census Income Binary Classification Dataset (Visualize) (Income)


This dataset contains the demographic information about a group of individuals.  We see the standard information such as Race, Education, Marital Status, etc.  Also, we see an "Income" variable at the end.  This variable takes two values, "<=50k" and ">50k", with the majority of the observations falling into the lower bucket.  The goal of this experiment is to predict "Income" by using the other variables.  Let's take a look at the Two-Class Support Vector Machine algorithm.
Two-Class Support Vector Machine
The Two-Class Support Vector Machine algorithm attempts to define a boundary between the two sets of points such that all of the points of one type fall on one side and all of the points of the other type fall on the other side.  More specifically, it attempts to define the boundary where the gap (or "margin") between the boundary and the nearest points of each type is at its largest.  This is a relatively simple concept to imagine in two dimensions, but gets complex as your number of factors increases and the relationship between the factors becomes more complex.  Here's a picture that tells the story pretty nicely.
Support Vector Machine
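To make the idea of the "widest gap" a bit more concrete, here's a minimal Python sketch.  This is not how Azure ML implements the algorithm.  The function names, the +1/-1 label encoding and the toy points are all our own illustrative assumptions.  It measures how far each point sits from a straight-line boundary and reports the smallest of those distances, which is the margin.

```python
import math

def signed_distance(weights, bias, point):
    """Signed distance from a point to the boundary weights.x + bias = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * x for w, x in zip(weights, point)) + bias) / norm

def margin(weights, bias, points, labels):
    """Smallest distance from any point to the boundary.

    labels are +1 or -1 (a hypothetical encoding of the two classes).
    The result is negative if some point is on the wrong side.
    """
    return min(y * signed_distance(weights, bias, p)
               for p, y in zip(points, labels))

# Hypothetical toy data: two classes separated along the x-axis.
points = [(1.0, 1.0), (2.0, 0.0), (-1.0, 0.5), (-2.0, -1.0)]
labels = [1, 1, -1, -1]

# Two candidate boundaries: the vertical lines x = 0 and x = 0.5.
margin_a = margin([1.0, 0.0], 0.0, points, labels)
margin_b = margin([1.0, 0.0], -0.5, points, labels)
```

Both candidate boundaries separate the toy points, but the first one leaves a wider gap.  That is the boundary a Support Vector Machine would prefer.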
Let's take a look at the parameters involved in this algorithm.  First, we need to define the "Number of Iterations".  Simply put, more iterations means that the algorithm is less likely to get stuck in an awkward portion of data.  Therefore, it tends to increase the accuracy of your predictions.  Unfortunately, this also means that the algorithm will take longer to train.

The "Lambda" parameter allows us to tell Azure ML how complex we want our model to be.  The larger we make our "Lambda", the less complex our model will end up being.
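Here's a rough Python sketch of how these two parameters could interact.  We don't know which optimizer Azure ML uses internally, so we've swapped in plain sub-gradient descent on the regularized hinge loss.  The `step` parameter, the function names and the toy data are our own assumptions.  `num_iterations` and `lam` play the roles of "Number of Iterations" and "Lambda".

```python
def train_linear_svm(points, labels, num_iterations=25, lam=0.001, step=0.1):
    """Sub-gradient descent on lam/2 * |w|^2 + average hinge loss.

    labels are +1 or -1 (hypothetical encoding of the two classes).
    """
    dims = len(points[0])
    n = len(points)
    w = [0.0] * dims
    b = 0.0
    for _ in range(num_iterations):
        # The lam * w term pulls the weights toward zero (a simpler model).
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:  # inside the margin or misclassified
                for j in range(dims):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - step * g for wi, g in zip(w, grad_w)]
        b -= step * grad_b
    return w, b

def predict(w, b, x):
    """Classify by which side of the boundary the point falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical toy data.
points = [(1.0, 1.0), (2.0, 0.0), (-1.0, 0.5), (-2.0, -1.0)]
labels = [1, 1, -1, -1]

w_small_lambda, b_small_lambda = train_linear_svm(points, labels, lam=0.001)
w_large_lambda, b_large_lambda = train_linear_svm(points, labels, lam=1.0)
```

On this toy data, the larger "Lambda" produces smaller weights, which is one way of seeing what "less complex" means here.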

The "Normalize Features" parameter will replace all of our values with "Normalized" values.  This is accomplished by taking each value, subtracting the mean of all the values in the column, then dividing the result by the standard deviation of all the values in the column.  This has the effect of making every column have a mean of 0 and a standard deviation of 1.  Since the algorithm chooses a boundary based on distance between points, it is imperative that your values be normalized.  Otherwise, you may have a single factor (or a small subset of factors) that dominates the selection process because it has very large values, and therefore very large distances.  If we wanted to have certain factors play a larger role in the selection process for some type of technical or business reason, then we could forgo this option.  However, that situation would be better handled by multiplying the normalized factors by our own custom sets of "weights" using a separate module.
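The normalization described above is simple enough to sketch in a few lines of Python.  The function name and sample values are our own.  We also assume the population standard deviation, because we don't know which variant Azure ML uses.

```python
import statistics

def normalize_column(values):
    """Subtract the column mean, then divide by the column standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population std dev (our assumption)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical ages, not taken from the real dataset.
ages = [20, 30, 40, 50, 60]
normalized_ages = normalize_column(ages)
```

After this step, a column measured in the tens (like age) and a column measured in the thousands would contribute distances on the same scale.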

The "Project to Unit Sphere" parameter allows us to normalize our set of output "Coefficients" as well.  In our testing, this didn't seem to have any impact on the predictability of the model.  However, it may be useful if we need to use the coefficients as inputs to some other type of model which would require them to be normalized.  If anyone knows of any other uses, let us know in the comments.

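The "Allow Unknown Categorical Levels" parameter allows us to set whether we want the model to accept categorical values that it did not see during training.  If we check this box, Azure ML creates an additional "unknown" level for each categorical column, and any new values in the validation or scoring data are mapped to it.  If we leave it unchecked, the model can only accept the values contained in the training data, so passing in new values may give us some errors.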

If you want to learn more about the Two-Class Support Vector Machine algorithm, read this and this.  Let's use Tune Model Hyperparameters to find the best set of parameters for our Two-Class Support Vector Machine algorithm.  If you want to learn more about Tune Model Hyperparameters, check out our previous post.
Tune Model Hyperparameters
Tune Model Hyperparameters (Visualize)
As you can see, the best model has 25 iterations with a Lambda of .001274.  Let's plug that into our Two-Class Support Vector Machine algorithm and move on to Cross-Validation.
Cross Validate Model
Contingency Table (Two-Class Averaged Perceptron)

Contingency Table (Two-Class Boosted Decision Tree)

Contingency Table (Two-Class Logistic Regression)

Contingency Table (Two-Class Support Vector Machine)
As you can see, the Two-Class Support Vector Machine approach has about the same number of true positives for "income = '<=50k'" as the rest of the models.  However, the number of true positives for "income = '>50k'" is significantly lower than that of the Two-Class Boosted Decision Tree.  Therefore, using accuracy alone, we can say that the Two-Class Boosted Decision Tree model is the best model for this data.
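To make this kind of comparison concrete, here's a small Python sketch of how overall accuracy and the share of correct predictions per class can be read off a contingency table.  The counts are made-up placeholders, not the numbers from our experiment, and the function name is our own.

```python
def summarize(table):
    """table[actual][predicted] holds a count of observations."""
    total = sum(sum(row.values()) for row in table.values())
    correct = sum(table[label][label] for label in table)
    accuracy = correct / total
    # Share of each actual class that was predicted correctly.
    correct_share = {
        label: table[label][label] / sum(table[label].values())
        for label in table
    }
    return accuracy, correct_share

# Hypothetical counts for illustration only.
table = {
    "<=50k": {"<=50k": 90, ">50k": 10},
    ">50k": {"<=50k": 20, ">50k": 30},
}
accuracy, correct_share = summarize(table)
```

A model can score well on accuracy while still missing many of the ">50k" rows, simply because there are fewer of them.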

We've mentioned a couple of times that there are more ways to measure "goodness" of a model besides Accuracy.  In order to look at these, let's examine another module called "Evaluate Model".
Evaluate Model
There are no parameters to set for the "Evaluate Model" module.  All you do is provide it 1 or 2 scored datasets and it will provide a huge amount of information about the "goodness" of those models.  Here's a snippet of what you can find.
ROC Curve

Precision/Recall Curve

Lift Curve
The three charts shown above are the ROC Curve, Precision/Recall Curve, and Lift Curve.  We simply wanted to introduce these concepts to you in this post.  We'll spend a lot more time talking about these metrics in a later post.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
Data Scientist
Valorem Consulting
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, December 12, 2016

Azure Machine Learning: Classification Using Two-Class Logistic Regression

Today, we're going to continue looking at Sample 3: Cross Validation for Binary Classification Adult Dataset in Azure Machine Learning.  In the two previous posts, we looked at the Two-Class Averaged Perceptron and Two-Class Boosted Decision Tree algorithms.  The next algorithm in the experiment is Two-Class Logistic Regression.  Let's start by refreshing our memory on the data set.
Adult Census Income Binary Classification Dataset (Visualize)

Adult Census Income Binary Classification Dataset (Visualize) (Income)
This dataset contains the demographic information about a group of individuals.  We see the standard information such as Race, Education, Marital Status, etc.  Also, we see an "Income" variable at the end.  This variable takes two values, "<=50k" and ">50k", with the majority of the observations falling into the lower bucket.  The goal of this experiment is to predict "Income" by using the other variables.  Let's take a look at the Two-Class Logistic Regression Tool.
Two-Class Logistic Regression
Logistic Regression is one of the more "mathematically pure" methods for Two-Class Prediction.  We'd imagine that virtually all statistics majors learn about this procedure in school.  Logistic Regression is a cousin of Linear Regression.  The main difference is that Linear Regression applies a linear function (ax + by + c) to predict a continuous value, while Logistic Regression uses a logit transformation to predict a binary value.  You can read more about the Logit here if you are interested.  Let's move on to the parameters.
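For readers who prefer code to formulas, here's a minimal Python sketch of the relationship between the linear function and the logit.  The function names and the sample coefficients are our own illustrative choices, not anything produced by Azure ML.

```python
import math

def logistic(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the logistic function: the log of the odds."""
    return math.log(p / (1.0 - p))

def predict_probability(coefficients, intercept, features):
    """Apply the same linear function as Linear Regression, then squash it."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return logistic(z)

# Hypothetical coefficients and feature values.
probability = predict_probability([1.0, -2.0], 0.5, [3.0, 1.0])
```

The linear part works exactly like Linear Regression.  The logistic function simply turns its output into a probability that can be compared against a cutoff.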

As with many advanced machine learning algorithms, Two-Class Logistic Regression runs through the algorithm multiple times to ensure that we get the best predictions possible.  This means that the algorithm needs to know when to stop.  Most algorithms will stop whenever the new model doesn't significantly deviate from the old model.  This is called "Convergence".  The "Optimization Tolerance" parameter tells the algorithm how close the models have to be in order for it to stop.
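Here's a toy Python sketch of the stopping rule described above.  It uses plain gradient descent on a simple one-variable function rather than the optimizer Azure ML actually uses, and every name in it is our own assumption.  The point is only to show how a tolerance decides when to stop.

```python
def minimize(gradient, start, optimization_tolerance=1e-6, step=0.1,
             max_iterations=1000):
    """Repeat small updates until the result barely changes (convergence)."""
    x = start
    for iteration in range(1, max_iterations + 1):
        new_x = x - step * gradient(x)
        if abs(new_x - x) < optimization_tolerance:
            return new_x, iteration
        x = new_x
    return x, max_iterations

def example_gradient(x):
    """Gradient of (x - 3)^2, a hypothetical function with its minimum at 3."""
    return 2 * (x - 3)

tight_result, tight_iterations = minimize(example_gradient, 0.0, 1e-6)
loose_result, loose_iterations = minimize(example_gradient, 0.0, 1e-2)
```

A looser tolerance stops earlier, which is faster but leaves the answer a little further from the true minimum.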

The "L1 Regularization Weight" and "L2 Regularization Weight" are used to prevent overfitting.  They do this by penalizing models that contain extreme coefficients.  We've talked about overfitting in-depth in the previous post.

The "L1 Regularization Weight" parameter is useful for "sparse" datasets.  A dataset is considered sparse when every combination of variables is either poorly represented or not represented at all.  This is extremely common when dealing with data sets with a small number of observations and/or a large number of variables.

The "L2 Regularization Weight" parameter is useful for "dense" datasets.  A dataset is considered dense when every combination of variables is well represented.  This is common when dealing with data sets with a large number of observations and/or a small number of variables.  You can also think of "dense" as the opposite of "sparse".
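As a rough illustration of what "penalizing extreme coefficients" means, here's a short Python sketch.  The exact form and scaling of the penalty inside Azure ML may differ.  This is just the textbook version, with names of our own choosing.

```python
def regularization_penalty(coefficients, l1_weight, l2_weight):
    """Textbook penalty: L1 uses absolute values, L2 uses squares."""
    l1_term = sum(abs(c) for c in coefficients)
    l2_term = sum(c * c for c in coefficients)
    return l1_weight * l1_term + l2_weight * l2_term

# Hypothetical coefficient sets: one modest, one extreme.
modest = [0.5, -0.5]
extreme = [3.0, -4.0]
```

The training process adds this penalty to the model's error, so a model with extreme coefficients has to fit the data noticeably better to be chosen.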

The "Memory Size for L-BFGS" parameter determines how much history to store on previous iterations.  The smaller you set this number, the less history you will have.  This will lead to more efficient computation, but potentially weaker predictions.

Finally, the "Random Number Seed" parameter is useful if you want reproducible results.  If you want to learn more about the Two-Class Logistic Regression procedure or any of these parameters, read here and here.

We're not sure what you think, but we have no idea what to enter for most of these parameters.  Good thing Azure ML has an algorithm that can optimize these parameters for us.  Let's take a look at the "Tune Model Hyperparameters" tool.
Condensed Experiment

Tune Model Hyperparameters
The Tune Model Hyperparameters tool takes three inputs, an Untrained Model (Two-Class Logistic Regression in our case), a Training Dataset, and an optional Validation (or Testing) Dataset (which we don't have).  Ironically, this tool is designed to help us choose parameters, yet has some serious parameters of its own.  For now, we'll stick with the defaults and leave the discussion of this tool for a later post.  The one parameter we do need to set is the "Selected Columns" parameter so the algorithm knows which value we are trying to predict.  This tool has two outputs, "Sweep Results" and "Trained Best Model".  The "Sweep Results" output shows all of the different test runs, as well as their resulting metrics.  Let's take a look at the "Trained Best Model" visualization.
Tune Model Hyperparameters (Trained Best Model)
As you can see, this visualization tells us what the best set of parameters was.  However, this tool technically outputs a Trained Model.  The Cross-Validate Model tool that we are using at the end of our experiment requires an Untrained Model.  So, to avoid having to introduce another new tool in this post, we'll just write these parameters into the Two-Class Logistic Regression tool.
Two-Class Logistic Regression (Tuned Hyperparameters)
Now, let's take a look at the Contingency Table from the Cross-Validate Model tool, and compare it to the tables from the other posts.
Contingency Table (Two-Class Averaged Perceptron)

Contingency Table (Two-Class Boosted Decision Tree)
Contingency Table (Two-Class Logistic Regression)
As you can see, all three of the algorithms are close when it comes to correctly predicting "<=50k".  The story changes when it comes to ">50k".  While the Two-Class Logistic Regression algorithm is better than the Two-Class Averaged Perceptron algorithm, it isn't quite as good as the Two-Class Boosted Decision Tree.  We should point out that comparing these contingency tables is only a small part of choosing the "best" model.  We can go more in-depth on model selection in a later post.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer

Monday, November 21, 2016

Azure Machine Learning: Classification Using Two-Class Boosted Decision Tree

Today, we're going to continue our walkthrough of Sample 3: Cross Validation for Binary Classification Adult Dataset.  In the previous post, we walked through the initial data load, as well as the Two-Class Averaged Perceptron algorithm.  Now, we're going to walk through the next algorithm, Two-Class Boosted Decision Tree.  Let's start with a simple overview of the experiment.
Sample 3: Cross Validation for Binary Classification Adult Dataset
The purpose of this experiment is to take a dataset of Demographic data about individuals, and attempt to predict their income based on these factors.  Here's a snippet of that dataset.
Adult Census Income Binary Classification Dataset (Visualize)
Adult Census Income Binary Classification Dataset (Visualize) (Income)
If you want to learn more about the data import section of this experiment, check out the previous post.  Let's move on to the star of the show, Two-Class Boosted Decision Tree.  This is one of our favorite algorithms because it is incredibly simple to visualize, yet offers extremely powerful predictions.
Two-Class Boosted Decision Tree
This algorithm doesn't just construct one tree, it constructs as many as you want (100 in this case).  What's extremely interesting about these additional trees is that they are not independent of their predecessors.  According to MSDN, "the second tree corrects for the errors of the first tree, the third tree corrects for the errors of the first and second trees, and so forth."  This means that our trees should get better as we increase our "Number of Trees Constructed" parameter.  Unfortunately, this would mean that trees later in the process have a much higher risk of "Overfitting" than trees earlier in the process.  "Overfitting" is a situation where the model has been trained so heavily that it can predict your training data extremely accurately, but is very poor at predicting new observations.  Fortunately, the algorithm accounts for this by not just taking the prediction from the final tree in the set.  It combines the predictions from every tree in the ensemble.  This greatly lessens the effect of "Overfitting" while still providing accurate predictions.

The "Maximum Number of Leaves per Tree" parameter allows us to set the number of times the tree can split.  It's important to note that splits early in the tree are caused by the most significant predictors, while splits later in the tree are less significant.  This means that the more leaves you have (and therefore more splits), the higher your chance of overfitting is.  This is why Validation is so important.

The "Minimum Number of Samples per Leaf Node" parameter allows us to set the significance level required for a split to occur.  With this value set at 10, the algorithm will only choose to split (this is known as creating a "new rule") if at least 10 rows, or observations, will be affected.  Increasing this value will lead to broad, stable predictions, while decreasing this value will lead to narrow, precise predictions.

The "Learning Rate" parameter allows us to set how much difference we see from tree to tree.  MSDN describes this quite well as "the learning rate determines how fast or slow the learner converges on the optimal solution. If the step size is too big, you might overshoot the optimal solution. If the step size is too small, training takes longer to converge on the best solution."
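To show how "each tree corrects the errors of the trees before it" might look in practice, here's a heavily simplified Python sketch.  It uses one-split trees (stumps) on a single numeric column, and it adds each tree's scaled prediction to a running total.  That is one common way to do boosting, but we can't confirm it matches Azure ML's internals.  All names and the toy data are our own assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single threshold that best explains the remaining errors."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]

def boost(xs, ys, num_trees=100, learning_rate=0.2):
    """Each new stump is fit to the errors left by the stumps before it."""
    stumps = []
    predictions = [0.0] * len(xs)
    for _ in range(num_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        # The learning rate shrinks each tree's contribution.
        stump = (threshold, learning_rate * left_value, learning_rate * right_value)
        stumps.append(stump)
        predictions = [p + (stump[1] if x <= threshold else stump[2])
                       for x, p in zip(xs, predictions)]
    return stumps

def predict(stumps, x):
    """Combine the contributions of every stump in the ensemble."""
    return sum(left if x <= threshold else right
               for threshold, left, right in stumps)

# Hypothetical toy data: the label flips from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
few_trees = boost(xs, ys, num_trees=2)
many_trees = boost(xs, ys, num_trees=20)
```

With a small learning rate, each tree only closes part of the remaining gap, so more trees are needed to reach the same fit.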

Finally, this algorithm lets us select a "Create Trainer Mode".  This is extremely useful if we can't decide exactly what parameters we want.  We'll talk more about parameter selection in a later post.  If you want to learn more about this algorithm, read here and here.  Let's visualize this tool.
Two-Class Boosted Decision Tree (Visualize)
Just like with the Two-Class Averaged Perceptron algorithm, the visualization of the untrained model is not very informative.  Strangely enough, this visualization shows us the correct parameters, whereas the Two-Class Averaged Perceptron did not.  What would be far more interesting is if we could look at the trained tree.  In order to do this, we need to add a new tool to our experiment, Train Model.
Condensed Experiment
Train Model
The Train Model initialization is pretty simple.  All we need to do is select our variable of interest, which is "Income" in this case.  Let's take a look at the visualization.
Train Model (Visualization)
As you can see, this visualization lets you look through all the trees created in the training process.  Let's zoom in on a particular section of the tree.
Train Model (Visualization) (Zoom)
EDIT: At the time of writing this, there is a bug related to the display of predictions within Decision Trees.  Please see here for more details.

As you can see, each split in the tree relies on a single variable in a single expression, known as a predicate.  The first predicate says

marital-status.Married-civ-spouse <= 0.5

We've talked before about the concept of Dummy Variables.  When you pass a categorical variable to a numeric algorithm like this, it has to translate the values to numeric.  It does this by creating Dummy, or Indicator, Variables.  In this case, it created Dummy Variables for the "marital-status" variable.  One of these variables is "marital-status.Married-civ-spouse".  This variable takes a value of 1 if the observation has "marital-status = Married-civ-spouse" and 0 otherwise.  Therefore, this predicate is really just a numeric way of asking whether this person has a Marital Status of "Married-civ-spouse" (the predicate is true when they do not).  We're not sure exactly what this value means because this isn't our data set, but it's the most common value of this variable in the dataset.  Therefore, it probably means being married and living together.
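Here's a small Python sketch of the Dummy Variable idea and the predicate above.  The helper names are our own and the three sample values are just for illustration.  Azure ML performs this translation internally.

```python
def make_dummies(column_name, values):
    """Create one 0/1 indicator per distinct value of a categorical column."""
    levels = sorted(set(values))
    return [
        {f"{column_name}.{level}": 1 if value == level else 0 for level in levels}
        for value in values
    ]

def first_predicate(row):
    """The split from the tree: marital-status.Married-civ-spouse <= 0.5"""
    return row["marital-status.Married-civ-spouse"] <= 0.5

# Hypothetical sample values.
rows = make_dummies(
    "marital-status",
    ["Married-civ-spouse", "Never-married", "Divorced"],
)
```

Since the indicator can only be 0 or 1, comparing it against 0.5 is just a numeric way of separating the two groups.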

Under the predicate definition, we also see a value for "Split Gain".  This is a measure of how significant the split was.  A large value means a more significant split.  Since Google is our best friend, we found a very informative answer on StackOverflow explaining this.  You can read it here.

What we find very interesting about this tree structure is that it is not "balanced".  This means that in some cases, we can reach a prediction very quickly or very slowly depending on which side of the tree we are on.  We can see one prediction in Level 2 (the root level is technically considered Level 0, which means that the 3rd level is considered Level 2).  We're not sure what causes the tree to choose whether to predict or split.  The MSDN article seems to imply that it's based on the combination of the "Minimum Number of Samples per Leaf Node" as well as some internal Information (or Split Gain) threshold.  Perhaps one of our readers can enlighten us about this.

Since we talked heavily about Cross-Validation in the previous post, we won't go into too much detail here.  However, it may be interesting to see the Contingency table to determine how well this model predicted our data.
Contingency Table (Two-Class Boosted Decision Tree)
Contingency Table (Two-Class Averaged Perceptron)
As you can see, the number of correct predictions for "Income <= 50k" is about the same between the two algorithms, but the Two-Class Boosted Decision Tree wins when it comes to the "Income > 50k" category.  We would need more analysis to make a sound decision, but we'll have to save that for a later post.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer

Monday, October 31, 2016

Azure Machine Learning: Classification Using Two-Class Averaged Perceptron

Today, we're going to walk through Sample 3: Cross Validation for Binary Classification Adult Dataset.  So far, the Azure ML samples have been interesting combinations of tools meant for learning the basics.  Now, we're finally going to get to some actual Data Science!  To start, here's what the experiment looks like.
Sample 3: Model Building and Cross-Validation
We had originally intended to skim over all of these models in a single post.  However, it was so interesting that we decided to break them out into separate posts.  So, we'll only be looking at the initial data import and the model on the far left, Two-Class Averaged Perceptron.  The first tool in this experiment is the Saved Dataset.
Adult Census Income Binary Classification Dataset
As usual, the first step of any analysis is to bring in some data.  In this case, they chose to use a Saved Dataset called "Adult Census Income Binary Classification dataset".  This is one of the many sample datasets that's available in Azure ML Studio.  You can find it by navigating to Saved Datasets on the left side of the window.
Sample Dataset
Let's take a peek at this data to see what we're dealing with.
Adult Census Income Binary Classification Dataset (Visualize)
As you can see, this dataset has about 32k rows and 15 columns.  These columns appear to be descriptions of people.  We have some of the common demographic data, such as age, education, and occupation, with the addition of an income field at the end.
Adult Census Income Binary Classification Dataset (Visualize) (Income)
This field takes two values, "<=50k" and ">50k", with the majority of people being in the lower bucket.  This would be a great dataset to do some predictive analytics on!  Let's move on to the next tool, Partition and Sample.
Partition and Sample
This is a pretty cool tool that allows you to trim down your rows in a few different ways.  We could easily spend an entire post talking about this tool; so we'll keep it brief.  You have four different options for "Partition or Sample Mode".  In this sample, they have selected "Sampling" with a rate of .2 (or 20%).  This allows us to take random samples from our data.  We also have the "Head" option which allows us to pass through the top N rows, which would be really good if we were debugging a large experiment and didn't want to wait for the sampling algorithm to run.  We also have the option to sample Folds, which is another name for a partition or subset of data.  Let's take a quick look at the visualization to see if anything else is going on.
Partition and Sample (Visualize)
We have the same 15 columns as before; the only difference is that we only have 20% of the rows.  Nothing else seems to be happening here.  Let's move on.
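Before we do, here's a quick Python sketch of the "Sampling" and "Head" options.  We don't know exactly how Azure ML draws its sample, so this version simply picks a fixed share of the rows at random.  The function names and the seed handling are our own assumptions.

```python
import random

def sample_rows(rows, rate=0.2, seed=0):
    """Return a random subset containing roughly `rate` of the rows."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    count = round(len(rows) * rate)
    return rng.sample(rows, count)

def head(rows, n):
    """Pass through only the top n rows, with no randomness involved."""
    return rows[:n]

# Hypothetical stand-in for a dataset: 100 row numbers.
rows = list(range(100))
sampled = sample_rows(rows, rate=0.2, seed=1)
top_five = head(rows, 5)
```

The "Head" version does no random work at all, which is why it's handy when debugging a large experiment.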
Clean Missing Data
In a previous post, we've spoken about the Clean Missing Data tool in more detail.  To briefly summarize, you can tell Azure ML how to handle missing (or null) values.  In this case, we are telling the algorithm to replace missing values in all columns with 0.  Let's move on to the star of the show, Model Building.  We'll start with the model on the far left, Two-Class Averaged Perceptron.
Two-Class Averaged Perceptron
This algorithm is really interesting to us because we'd never heard of it before writing this.  The Two-Class Averaged Perceptron algorithm is actually quite simple.  It takes a large number of numeric variables.  (If you give it any Categorical data, it will automatically translate it into Numeric.  These new variables are called Dummy Variables.)  Then, it multiplies the input variables by weights and adds them together to produce a numeric output.  That output is a score that can be used to choose between two different classes.  In our case, these classes are "Income <= 50k" and "Income > 50k".  Some of you might think this logic sounds very similar to a Neural Network.  In fact, the Two-Class Averaged Perceptron algorithm is a simple implementation of a Neural Network.

This algorithm gives us the option of providing three main parameters, "Create Trainer Mode", "Learning Rate" and "Maximum Number of Iterations".  "Learning Rate" determines how large a step the algorithm takes each time it adjusts the weights while searching for the "best" set.  If the "Learning Rate" is too high (making the number of steps too low), the model will train very quickly, but the weights may not be a very good fit.  If the "Learning Rate" is too low (making the number of steps too high), the model will train very slowly, but could possibly produce "better" weights.  There are also concerns of Overfitting and Local Extrema to contend with.

"Maximum Number of Iterations" determines how many times the model is trained.  Since this is an Averaged Perceptron algorithm, you can run the algorithm more than once.  This will allow the algorithm to develop a number of different sets of weights (10 in our case).  These sets of weights can be averaged together to get a final set of weights, which can then be used to classify new values.  In practice, we could achieve the same result by creating 10 scores using the 10 sets of weights, then averaging the scores.  However, that method would seem to be far less efficient.

Finally, we have the "Create Trainer Mode" parameter.  This parameter allows us to pass in a single set of parameters (which is what we are currently doing) or pass in multiple sets of parameters.  You can find more information about this algorithm here and here.
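To tie these parameters together, here's a Python sketch of a textbook averaged perceptron.  We can't confirm that Azure ML works exactly this way.  In particular, this version averages the weights after every row rather than once per pass, and the names and toy data are our own assumptions.

```python
def score(weights, bias, features):
    """Multiply the inputs by the weights and add them together."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def train_averaged_perceptron(points, labels, learning_rate=1.0,
                              max_iterations=10):
    """labels are +1 or -1 (hypothetical encoding of the two classes)."""
    dims = len(points[0])
    weights, bias = [0.0] * dims, 0.0
    weight_sum, bias_sum, count = [0.0] * dims, 0.0, 0
    for _ in range(max_iterations):
        for features, label in zip(points, labels):
            if label * score(weights, bias, features) <= 0:
                # Wrong (or undecided): nudge the weights toward the label.
                weights = [w + learning_rate * label * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * label
            # Keep a running total so we can average at the end.
            weight_sum = [s + w for s, w in zip(weight_sum, weights)]
            bias_sum += bias
            count += 1
    return [s / count for s in weight_sum], bias_sum / count

def classify(weights, bias, features):
    """Turn the numeric score into one of the two classes."""
    return 1 if score(weights, bias, features) > 0 else -1

# Hypothetical toy data.
points = [(1.0, 1.0), (2.0, 0.0), (-1.0, 0.5), (-2.0, -1.0)]
labels = [1, 1, -1, -1]
avg_weights, avg_bias = train_averaged_perceptron(points, labels)
```

Averaging the weights smooths out the effect of whichever rows happened to trigger the last few updates.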

This leaves us with a few questions that perhaps some readers could help us out with.  If you have 10 iterations, but set a specific random seed, does it create the same model 10 times, then average 10 identical weight vectors to get a single weight vector?  Does it use the random seed to create 10 new random seeds, which are then used to create 10 different weight vectors?  What happens if you define a set of 3 Learning Rates and 10 Iterations?  Will the algorithm run 30 iterations or will it break the iterations into sets of 3, 3, and 4 to accommodate each of the learning rates?  If you know the answers to these questions, please let us know in the comments.  Out of curiosity, let's see what's under Visualize for this tool.
Two-Class Averaged Perceptron (Visualize)
This is interesting.  These aren't the same parameters that we input into the tool, nor do they seem to be affected by our data stream.  This is a great opportunity to point out the way that models are built in Azure ML.  Let's take a look at the data flow.
Data Flow
We've rearranged the tools slightly to make it more obvious.  As you can see, the data does not flow into the model directly.  Instead, the model metadata is built using the model tool (Two-Class Averaged Perceptron in this case).  Then, the model metadata and the sample data are consumed by whatever tool we want to use downstream (Cross Validate Model in this case).  This means that we can reuse a model multiple times just by attaching it to different branches of a data stream.  This is especially useful when we want to use the same model against different data sets.  Let's move on to the final tool in this experiment, Cross Validate Model.
Cross Validate Model
Cross-Validation is a technique for testing, or "validating", a model.  Most people would test a model by using a Testing/Training split.  This means that we split our data into two separate sets, one for training the model and another one for testing the model.  This methodology is great because it allows us to test our model using data that it has never seen before.  However, this method is very susceptible to bias if we don't sample our data properly, as well as sample size issues.  This is where Cross-Validation comes in.

Imagine that we used the Testing/Training method to create a model using the Training data, then tested the model using the Testing data.  We could estimate how accurate our model is by seeing how well it predicts known values from our Testing data.  But, how do we know that we didn't get lucky?  How do we know there isn't some strange relationship in our data that caused our model to predict our Testing data well, but predict real-world data poorly?  To do this, we would want to train the model multiple times using multiple sets of data.  So, we separate our data into 10 sets.  Nine of the sets are used to train the model, and the remaining set is used to test.  We could repeat this process nine more times by changing which of the sets we use to test.  This would mean that we have created 10 separate models using 10 different training sets, and used them to predict 10 mutually exclusive testing sets.  This is Cross-Validation.  You can find out more about Cross-Validation here.  Let's see it in action.
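The fold logic described above can be sketched in a few lines of Python.  This is only an illustration with names of our own choosing.  Real implementations (presumably including Azure ML's) typically shuffle the rows before assigning folds, which we skip here to keep things short.

```python
def k_fold_splits(rows, k=10):
    """Yield (training rows, testing rows) once for each of the k folds."""
    # Deal the rows into k folds like cards: row 0 to fold 0, row 1 to fold 1, ...
    folds = [rows[i::k] for i in range(k)]
    for test_index in range(k):
        test_rows = folds[test_index]
        train_rows = [row
                      for index, fold in enumerate(folds) if index != test_index
                      for row in fold]
        yield train_rows, test_rows

# Hypothetical stand-in for a dataset: 100 row numbers.
rows = list(range(100))
splits = list(k_fold_splits(rows, k=10))
```

Every row lands in exactly one testing set, so each row is predicted by a model that never saw it during training.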
Scored Results (1)
There are two outputs from the Cross Validate Model tool, Scored Results and Evaluation Results by Fold.  The Scored Results output shows us the same data that we passed into the tool, with 3 additional columns.  The first column, Fold Assignments, is added to the start of the data.  This tells us which of the 10 sets, or "folds", the row was sampled into.
Scored Results (2)
The remaining columns, Scored Labels and Scored Probabilities, are added to the end of the data.  The Scored Labels column tells us which category the model predicted this row would fall into.  This is what we were looking for all along.  The Scored Probabilities column is a bit more complicated.  Mathematically, the algorithm wasn't trying to predict whether Income was "<=50k" or ">50k".  It was only trying to predict ">50k" because in a Two-Class algorithm, if you aren't ">50k", then you must be "<=50k".  If you looked down the Scored Probabilities column, you would see that all Scored Probabilities less than .5 have a Scored Label of "<=50k" and all Scored Probabilities greater than .5 have a Scored Label of ">50k".  If we were using a Multi-Class algorithm, it would be far more complicated.  If you want to learn about the Two-Class Averaged Perceptron algorithm, read here and here.
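The relationship between the two columns can be written as a one-line rule in Python.  The function name is our own.  We also don't know how Azure ML treats a probability of exactly .5, so sending ties to "<=50k" is an assumption.

```python
def scored_label(scored_probability, threshold=0.5):
    """Turn the probability of ">50k" into one of the two labels."""
    # Ties at exactly the threshold go to "<=50k" here (our assumption).
    return ">50k" if scored_probability > threshold else "<=50k"

# Hypothetical probabilities, not taken from the real output.
labels = [scored_label(p) for p in (0.8, 0.2, 0.51)]
```

Moving the threshold away from .5 would trade one kind of mistake for the other.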

There is one neat thing we wanted to show using this visualization though.
Scored Results (Comparison)
When we click on the "Income" column, a histogram will pop up on the right side of the window.  If we click on the "Compare To" drop-down, and select "Scored Labels", we get a very interesting chart.
Contingency Table
This is called a Contingency Table, also known as a Confusion Matrix or a Crosstab.  It shows you the distribution of your correct and incorrect predictions.  As you can see, our model is very good at predicting when a person has an Income of "<=50k", but not very good at predicting ">50k".  We could go much deeper into the concept of model validation, but this was an interesting chart that we stumbled upon here.  Let's look at the Evaluation Results by Fold.
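If you ever need to build this table outside of Azure ML, it only takes a few lines of Python.  The function name and the four sample rows below are made up for illustration.

```python
from collections import Counter

def contingency_table(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))

# Hypothetical values, not the results from our experiment.
actual = ["<=50k", "<=50k", ">50k", ">50k"]
predicted = ["<=50k", ">50k", ">50k", "<=50k"]
table = contingency_table(actual, predicted)
```

The pairs where both values match are the correct predictions.  Everything else is a mistake of one kind or the other.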
Evaluation Results by Fold
This shows you a bunch of statistics about your Cross-Validation.  The purpose of this post was to talk about the Two-Class Averaged Perceptron, so we won't spend much time here.  However, don't be surprised if we make a full-length post about this in the future because there is a lot of information here.

We hope that this post sparked as much excitement in you as it did in us.  We're really starting to see just how much awesomeness is packed into Azure Machine Learning Studio; and we're so excited to keep digging.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer