Monday, November 19, 2018

Data Science in Power BI: R Scripts in Query Editor

Today, we're going to talk about R Scripts in Query Editor within Power BI.  If you haven't read the earlier posts in this series, Introduction, Getting Started with R Scripts, Clustering, Time Series Decomposition, Forecasting, Correlations and Custom R Visuals, they may provide some useful context.  You can find the files from this post in our GitHub Repository.  Let's move on to the core of this post, R Scripts in Query Editor.

We haven't covered Power Query in a very long time on this blog.  Since our last post in 2013, it's been fully integrated into the Power BI Desktop tool.  Power Query is now known as the Power BI Query Editor and is a fully-functional ETL (Extract, Transform, Load) tool capable of pulling data from many data sources and integrating them into a single Power BI Data Model.  You can read more about it here.

The first step is to load some data using the "Get Data" feature in Power BI.
Get Data
For this post, we'll be using the AdventureWorksLT sample database from Azure SQL Database.  You can spin one of these up for free using an Azure trial account.  Once we select "Get Data", we need to connect to our Azure SQL Database.
Connect to Azure SQL Database
SQL Server Database
We have the option to select between Import and DirectQuery modes.  There's enough content there to warrant a post of its own.  The default mode is Import, which is what we typically use.  You can find out more information on DirectQuery mode here.  After this, we also need to provide our credentials.  Then, we can select the tables or views that we want to import.
Navigator
In the "Navigator" window, we want to select the Customer, Product, SalesOrderDetail and SalesOrderHeader tables.  Now, we can use "Edit Queries" to access the Query Editor.
Edit Queries
Power Query Editor
Once in the Query Editor, we can see the queries that pull in our four tables, as well as previews of these tables.  The Query Editor has a tremendous amount of built-in capability for traditional data transformations.  We're interested in the R functionality.  Starting from the SalesLT SalesOrderHeader table, in the "Transform" tab, we can select "Run R Script" to open a scripting window.
Run R Script
R Script Editor
If you aren't able to open the R Script Editor, check out our previous post, Getting Started with R Scripts.  While it's possible to develop and test code using the built-in R Script Editor, it's not a great development experience.  Unfortunately, there doesn't seem to be a way to develop this script directly in an external IDE like RStudio.  So, we typically export the data to csv and develop in RStudio.  This is obviously not optimal and should be done with caution when the data is extremely large or sensitive in any way.  Fortunately, the write.csv() function is pretty easy to use.  You can read more about it here.

<CODE START>

write.csv( dataset, file = <path> )

<CODE END>

Write to CSV
Now, all we need to do is open RStudio and use read.csv() to pull the data in.  We saved it as dataset to be consistent with Power BI's naming convention.
Read from CSV
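For reference, the RStudio side of this looks something like the sketch below.  The file path is just a placeholder for wherever you exported the csv from the Query Editor.

<CODE START>

# Placeholder path - point this at the csv you exported from Power BI.
dataset <- read.csv("C:/Temp/dataset.csv", stringsAsFactors = FALSE)
head(dataset)

<CODE END>
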
Looking at this dataset, we see that it contains a number of fields.  The issue is that it seems to contain a bunch of strange text values such as "Microsoft.OleDb.Currency", "[Value]" and "[Table]".  These are caused by Base R not being able to interpret these Power Query datatypes.  Obviously, there's nothing we can do with these fields in this format.  So, we'll use Power Query to trim the dataset to only the columns we want and change the datatype of all of the other columns to numeric or text.  To avoid delving too deep into Power Query itself, here's the M code to accomplish this.  M is the language that the Query Editor automatically generates behind the scenes as you apply transformations.

<CODE START>

 #"Removed Columns" = Table.RemoveColumns(SalesLT_SalesOrderHeader,{"SalesLT.Address(BillToAddressID)", "SalesLT.Address(ShipToAddressID)", "SalesLT.Customer", "SalesLT.SalesOrderDetail"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"TotalDue", type number}, {"Freight", type number}, {"TaxAmt", type number}, {"SubTotal", type number}}),

<CODE END>
Clean Data
Looking at the new data, we can see that it's now clean enough to work with in R.  Next, we'll do some basic R transformations to get the hang of it.  Let's start by removing all columns from the data frame except for SalesOrderID, TaxAmt and TotalDue.  The full code appears later in this post.
Tax
Now, let's create a new column "TaxPerc" that is calculated as "TaxAmt" / "TotalDue".
We can see that the tax percentages all hover right around 7.2%.  It looks like the differences are probably just rounding error.  Nevertheless, let's see how this outputs to the Query Editor.
Tax in Query Editor
We can see the same data in Power Query that we saw in RStudio.  This leads to the question "How did Power Query know to use the tax data frame as the next step?"  If we look at the "Applied Steps" pane on the right side of the screen, we see that the Query Editor added a "tax" step without asking us.  If we select the "Run R Script" step, we can see what the intermediate results looked like.
Run R Script in Query Editor
Apparently, the Query Editor pulls back every data frame that was created in the R script.  In this case, there was only one, so it expanded it automatically.  We can confirm this by altering the R code to create a new data frame that similarly calculates "FreightPerc".
Tax and Freight in Query Editor
Here's the code we used to generate this.

<CODE START>

# Keep only the ID, tax and total columns, then calculate the tax percentage.
tax <- dataset[,c("SalesOrderID", "TaxAmt", "TotalDue")]
tax[,"TaxPerc"] <- tax[,"TaxAmt"] / tax[,"TotalDue"]

# Build a second data frame that does the same for freight.
freight <- dataset[,c("SalesOrderID", "Freight", "TotalDue")]
freight[,"FreightPerc"] <- freight[,"Freight"] / freight[,"TotalDue"]

<CODE END>

Obviously, we could use the fundamentals we just learned, combined with what we worked on in our previous post, Creating Custom R Visuals, to build some predictive models here.  Feel free to try that out on your own.  For now, we'd rather look at some slightly different functionality.

There are a number of libraries that allow us to use R to perform data manipulation exercises, such as aggregations and joins.  One of the more common packages is "dplyr".  You can read more about it here.  Let's install this package using RStudio and get started.  If you need a refresher on how to install packages, check out this post.
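
For completeness, the install itself is a one-liner in RStudio; you only need to run it once per machine.

<CODE START>

# One-time setup; library(dplyr) then loads the package in each script.
install.packages("dplyr")

<CODE END>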

Now, let's start by calculating the total sales per customer.
Total Sales by Customer
We see that the dplyr syntax is pretty simple and feels similar to SQL.  However, these functions don't output data frames.  Instead, they output tibbles.  Fortunately, the Query Editor can read these with no issues.
Total Sales in Query Editor
Here's the code we used.

<CODE START>

library(dplyr)

# Total sales per customer: group the rows by CustomerID and sum TotalDue.
totalsales <- dataset %>%
  group_by(CustomerID) %>%
  summarise(TotalSales = sum(TotalDue))

<CODE END>
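
As a quick sanity check in RStudio (nothing you need to repeat in Power BI), you can confirm that the result is a tibble and that it still inherits from data.frame, which is likely why the Query Editor consumes it without any extra steps.

<CODE START>

# A tibble's class vector includes "data.frame", so downstream tools can treat it as one.
class(totalsales)

<CODE END>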

Finally, let's take a look at table joins using dplyr.  In order to do this, we need a way to bring two tables into the R script at once.  Fortunately, we can do this by tampering with the M code directly.
Blank R Script
If we look at our existing Power Query query, we see that the Query Editor leverages the R.Execute() function with an additional parameter at the end that says '[dataset=#"Changed Type"]'.  This tells M to bring the #"Changed Type" object into R and assign it to the dataset data frame.  The #"Changed Type" object in the Query Editor is simply the temporary table created by our previous applied step.
Changed Type
First, let's create a Blank Query.
Blank Query
Then, we can open the "Advanced Editor".
Advanced Editor
Advanced Editor Window
Here, we see an empty M query.  Let's use the R.Execute() function to pull in two other queries in the Query Editor.
Two Tables
Here's the code we used to generate this.

<M CODE START>

let
    Source = R.Execute("salesorderdetail <- sod#(lf)salesorderheader <- soh",[soh=#"SalesLT SalesOrderHeader", sod=#"SalesLT SalesOrderDetail"])
in
    Source

<M CODE END>

<R CODE START>

salesorderdetail <- sod
salesorderheader <- soh

<R CODE END>

We also dropped a few unnecessary columns from the SalesOrderDetail table and changed some data types to avoid the issues we saw earlier.  Let's join these tables using the "SalesOrderID" key.
Joined in Query Editor
Here's the code we used.

<CODE START>

library(dplyr)

# Join the detail and header tables on their shared SalesOrderID key.
joined <- inner_join(sod, soh, by = c("SalesOrderID"))

<CODE END>

We see that it's very easy to do this as well.  In fact, it's possible to chain joins directly into the "%>%" pipeline we used before, as sketched below.  To learn more about these techniques, read this.
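
Here's a rough sketch of what that chaining might look like, reusing the sod and soh data frames from above.  The ProductID and LineTotal columns come from the SalesOrderDetail table in AdventureWorksLT; swap in whichever columns you kept after your own cleanup steps.

<CODE START>

library(dplyr)

# Join the detail rows to their headers, then aggregate, all in one pipeline.
salesbyproduct <- sod %>%
  inner_join(soh, by = "SalesOrderID") %>%
  group_by(ProductID) %>%
  summarise(TotalSales = sum(LineTotal))

<CODE END>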

Hopefully, this post enlightened you as to some ways to add the power of R to your Power BI Queries.  Not to give it all away, but we'll showcase some very interesting ways to build on these capabilities in the coming posts.  Stay tuned for the next post where we'll take a look at the new Python functionality.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
Senior Analytics Associate - Data Science
Syntelli Solutions
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, October 29, 2018

Data Science in Power BI: Creating Custom R Visuals

Today, we're going to talk about Creating Custom R Visuals within Power BI.  If you haven't read the earlier posts in this series, Introduction, Getting Started with R Scripts, Clustering, Time Series Decomposition, Forecasting and Correlations, they may provide some useful context.  You can find the files from this post in our GitHub Repository.  Let's move on to the core of this post, Creating Custom R Visuals.

In Power BI, it's possible to create custom visuals using TypeScript.  It's also possible to have these visuals leverage custom R code.  You can find out more about this process here.  While extremely interesting, this is NOT the type of visual we'll be talking about today.

Over the last few posts, we've shown how to use custom R visuals built by others.  Today, we're going to build our own using the Custom R Visual available in Power BI Desktop.  If you haven't read the second post in this series, Getting Started with R Scripts, it is highly recommended you do so now, as it provides necessary context for how to link Power BI to your local R IDE.

In the previous post, we created a bunch of log-transformed measures to find good predictors for Revenue.  We're going to use these same measures today to create a basic linear regression model to predict Revenue.  If you want to follow along, the dataset can be found here.  Here's the custom DAX we used to create the necessary measures.

<CODE START>

Revenue Current Month (Log) = IFERROR( LN( CALCULATE( [RevenueTY], 'Date'[YearPeriod] = "201411" ) ), BLANK() )

Revenue Previous Month (Log) = IFERROR( LN( CALCULATE( [RevenueTY], 'Date'[YearPeriod] = "201410" ) ), BLANK() )

COGS Previous Month (Log) = IFERROR( LN( CALCULATE( [Total COGS], 'Date'[YearPeriod] = "201410" ) ), BLANK() )

Labor Cost Previous Month (Log) = IFERROR( LN( CALCULATE( [Sum of Labor Costs Variable], 'Date'[YearPeriod] = "201410" ) ), BLANK() )

Third Party Costs Previous Month (Log) = IFERROR( LN( CALCULATE( [Sum of Cost Third Party], 'Date'[YearPeriod] = "201410" ) ), BLANK() )

Travel Expenses Previous Month (Log) = IFERROR( LN( CALCULATE( [Sum of Travel Expenses], 'Date'[YearPeriod] = "201410" ) ), BLANK() )

Rev for Exp Travel Previous Month (Log) = IFERROR( LN( CALCULATE( [Sum of Rev for Exp Travel], 'Date'[YearPeriod] = "201410" ) ), BLANK() )

<CODE END>

Next, we create an empty "Custom R Visual".
Custom R Visual
Then, we add "Name" from the Customer table and all of our new measures to the Values Shelf for the Custom R Visual.
Values Shelf
Now, we could begin writing our script directly in Power BI.  However, it's very difficult to test and debug within Power BI.  So, we recommend using an external IDE, such as RStudio.  Fortunately, Power BI has a button that exports the data to a csv and loads it into the IDE we specified in the options.
Edit Script in External IDE
In our IDE, the first thing we want to do is take a look at our dataset using the following command:
head(dataset)
Dataset Head
We can see that we have nulls (R calls them NAs) in our dataset.  In our case, we can assume that all of these nulls are actually 0's.  Please note that this assumption won't always hold, but it's reasonable here.  We've done a number of posts in the past on different ways to handle "imputation".  Check them out here and here.  There are probably some more further back if you want to dig.  For now, all we want to do is replace all NA's with 0's.  We can do this using the following command:
dataset[is.na(dataset)] <- 0
head(dataset)
Imputed Dataset Head
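If assuming zeroes isn't reasonable for your data, one simple alternative is to impute each numeric column with its own mean.  Here's a rough sketch of that variant; we're sticking with zeroes for the rest of this post.

<CODE START>

# Replace NAs in each numeric column with that column's mean instead of 0.
for (col in names(dataset)) {
  if (is.numeric(dataset[[col]])) {
    dataset[[col]][is.na(dataset[[col]])] <- mean(dataset[[col]], na.rm = TRUE)
  }
}

<CODE END>
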
Now, we're ready to build a basic linear regression model.  However, we don't want to include the Name column, as it provides no analytical value.  We can do this using the following command:
reg <- lm(`Revenue Current Month (Log)` ~ . - `Name` , data = dataset )
summary(reg)
Regression Results
Those with experience in linear regression will notice that an "Adjusted R-Squared" of .08 and a "P-Value" of .15 mean that this model is terrible.  However, the focus of this post is visualization, not regression models!  We have a previous post that touches on the finer points of regression.

Now that we have a regression model, we can plot some interesting things about it.  Our original dataset contained seven columns, not counting "Name".  We can't plot seven dimensions on a scatterplot.  So, we need another way to look at our regression model.  One of the ways to do this is with a Residuals vs. Fitted plot.  Basically, this plot shows how "far off" the regression model was.  A better model would have residual values tightly clustered around zero, whereas a worse model would have larger residual values.  We can make one using the following code:
plot(x = reg$fitted.values, y = reg$residuals, xlab = "Fitted Values", ylab = "Residuals")
Residuals vs. Fitted Plot
While this chart is interesting, it's tough to see where zero is for the residuals.  Let's add a horizontal line to make it more obvious.  We can accomplish this using the following code:
abline(h = 0)
Residuals vs. Fitted (Horizontal Line)
This is becoming easier to understand.  Now we can see that the majority of our points are slightly above the zero line, with a few points very far below.  It's pretty obvious that these three points are having a major impact on our model.  This is the value of using a Residuals vs. Fitted plot.  Alas, this post is not about improving our model.  Let's add one final detail, the R-Squared value.  We can use the following code to add the R-Squared value to the title:
plot(x = reg$fitted.values, y = reg$residuals, xlab = "Fitted Values", ylab = "Residuals", main = paste( "Predicting Revenue Current Month (Log): R-Squared = ", round(summary(reg)$r.squared, digits = 3), sep=""))
abline(h = 0)
Now we have a plot that gives us some interesting information about our data.  The final step is to take the R code we've created and copy it into the Custom R Visual in Power BI.  We don't need any of the heads or summaries since Power BI won't output them anyway.  Here's all the code we need:
dataset[is.na(dataset)] <- 0
reg <- lm(`Revenue Current Month (Log)` ~ . - `Name` , data = dataset )
plot(x = reg$fitted.values, y = reg$residuals, xlab = "Fitted Values", ylab = "Residuals",
      main = paste( "Predicting Revenue Current Month (Log): R-Squared = ", round(summary(reg)$r.squared, digits = 3), sep=""))
abline(h = 0)
Now for the final piece.  One of the main reasons this R functionality is so powerful is that you can filter this R visual by using other Power BI charts or slicers.  For instance, we can see the difference in our regression before and after excluding certain products.
Custom R Visual (All Products)
Custom R Visual (Some Products)
We can see that our R-Squared increased by .1 when we removed the (Blank) and Sova products.  This would be a great way for us to see which products may follow a different sales pattern.

Hopefully, this post opened your minds to the possibilities of creating your own R visuals within Power BI.  Obviously, we only scratched the surface of this incredibly complex topic.  There are many excellent blog posts around the internet showcasing some of the coolest visuals available in R.  We encourage you to try some of them out.  Stay tuned for the next post where we'll dig into R Scripts in Power Query to see what they have to offer.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
Senior Analytics Associate - Data Science
Syntelli Solutions
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, October 8, 2018

Data Science in Power BI: Correlations

Today, we're going to talk about Correlations within Power BI.  If you haven't read the earlier posts in this series, Introduction, Getting Started with R Scripts, Clustering, Time Series Decomposition and Forecasting, they may provide some useful context.  You can find the files from this post in our GitHub Repository.  Let's move on to the core of this post, Correlations in Power BI.

Correlation is a measure of how different values tend to relate to each other.  When we talk about correlation in a statistical context, we are typically referring to Pearson Correlation.  Pearson Correlation is a measure of how linear the relationship between two sets of values is.  For instance, values that fall perfectly on an "up and to the right" line would have a correlation of 1, while values that fall roughly along that line may have a correlation closer to .5.  These values can even be negative if the line travels "down and to the right".
Pearson Correlation
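To make the idea concrete, here's a tiny R sketch; it's purely illustrative and isn't needed for anything we do in Power BI below.

<CODE START>

set.seed(42)
x <- 1:100

cor(x, x)                         # a perfect "up and to the right" line: exactly 1
cor(x, x + rnorm(100, sd = 50))   # roughly on that line: somewhere around .5
cor(x, -x)                        # "down and to the right": exactly -1

<CODE END>
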
In many industries, the ability to determine which values tend to correlate with each other can have tremendous value.  In fact, one of the first steps commonly performed in the data science process is to identify variables that highly correlate to the variable of interest.  For instance, if we find that Sales is highly correlated to Customer Age, we could utilize Customer Age in a model to predict Sales.

So, how can we use Power BI to visualize correlations between variables?  Let's see some different ways.  We'll start by making a basic table with one slicer and two measures.
Revenue Current Month and Revenue Previous Month by Customer Name (Table)
This table lets us see the current and previous month's revenue by customer.  While this is good for finding individual customers, it doesn't give us a good idea of how closely related these two measures are.  Scatterplots are usually much better at visualizing this type of information.  Let's switch over to one of those.
Revenue Current Month and Revenue Previous Month by Customer Name (Scatterplot)
We can add a trend line to this graph by using the "Analytics" pane.
Add Trend Line
Revenue Current Month and Revenue Previous Month by Customer Name (Trend)
There's something missing with the way this data is displayed.  It's very difficult to understand our data when it looks like this.  Given the nature of money, it's common for a few large customers to have very large values.  One way to combat this is to change the scale of our data by using the logarithm transformation.  It's important to note that the LN() function in DAX returns an error if it receives a negative or zero value.  This can be remedied using the IFERROR() function.
Revenue Current Month and Revenue Previous Month by Customer Name (Log)
We can see now that our relationship is much more linear.  It's important to note that Pearson correlation is only applicable to linear relationships.  By looking at this scatterplot, we can guess that our correlation is somewhere between .6 (60%) and .8 (80%).

Now, how would we add another variable to the mix?  Let's try with COGS.
Revenue Current Month and Revenue Previous Month by Customer Name (COGS)
It's not easy to see which scatterplot has the higher correlation.  In addition, this solution required us to create another chart.  While this is very useful for determining whether any transformations are necessary (which they were), it doesn't scale well to visualizing a large number of variables at once.  Fortunately, the Power BI Marketplace has a solution for this.
Correlation Plot
If you haven't read the previous entries in this series, you can find information on loading Custom R Visuals in this post.  Once we load the Correlation Plot custom visual, we can utilize it pretty simply.
Correlations
We made one tweak to the options to get the coefficients to display, but that's it.  This chart very easily allows us to look across a number of variables at once and determine which ones are heavily correlated.  This, combined with the scatterplots we saw earlier, gives us quite a bit of information about our data that could be used to create a great predictive or clustering model.

Hopefully, this post piqued your interest to investigate the options for visualizing correlations within Power BI.  This particular custom visual has a number of different options for changing the visual, as well as grouping variables together based on clusters of correlations.  Very cool!  Stay tuned for the next post where we'll dig into the R integration to create our own custom visuals.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
Senior Analytics Associate - Data Science
Syntelli Solutions
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, September 17, 2018

Data Science in Power BI: Forecasting

Today, we're going to talk about Forecasting within Power BI.  If you haven't read the earlier posts in this series, Introduction, Getting Started with R Scripts, Clustering and Time Series Decomposition, they may provide some useful context.  You can find the files from this post in our GitHub Repository.  Let's move on to the core of this post, Forecasting in Power BI.

Today, we're going to change it up a little and compare two visuals from the Power BI Marketplace.  These visuals are "Forecasting with ARIMA" and "Forecasting TBATS".
Forecasting Visuals
One of the most fascinating aspects of Data Science is how endless the process can be.  There are always many different ways to approach the same problem.  Let's dig a little deeper into the two algorithms we'll be looking at today.

ARIMA stands for "AutoRegressive Integrated Moving Average".  Basically, it's a method for breaking a time series model down into three components.  It's important to note that this type of model is not capable of utilizing multiple variables.  It simply predicts future values of a variable based on previous values of that variable.  Technically, the "Forecasting with ARIMA" visual also includes a seasonal component.  However, it can only include one continuous model for the trend, denoted by (p,d,q), and one continuous model for the season, denoted by (P,D,Q,m).  The results from these models are added together to get the value for each point in time.  You can read more about it here and here.

TBATS stands for "Trigonometric, Box-Cox Transform, ARMA Errors, Trend, Seasonal".  Basically, it's a method for predicting a time series that exhibits more than one seasonal effect.  For instance, retail sales are affected independently by day of the week and month of the year.  Sales may go up in December because of Christmas and may go up further on the weekend because most people are not working.  There's not a ton of information online about this technique, but you can read more about it here and here.
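
Both of these algorithms are also available directly in R's forecast package.  To make the visuals a little less of a black box, here's a rough sketch of the same two fits in R, using a made-up monthly series standing in for [Total COGS]; the numbers and object names below are ours, not from the sample dataset.

<CODE START>

library(forecast)

# A made-up monthly series standing in for [Total COGS]: 16 points, August 2013 to November 2014.
cogs <- ts(c(120, 135, 150, 160, 140, 130, 145, 155, 165, 150, 140, 160, 170, 155, 145, 165),
           start = c(2013, 8), frequency = 12)

# Roughly what "Forecasting with ARIMA" does: fit a seasonal ARIMA and forecast 10 months ahead.
arima.fit <- auto.arima(cogs)
forecast(arima.fit, h = 10)

# Roughly what "Forecasting TBATS" does: fit a TBATS model and forecast 10 months ahead.
tbats.fit <- tbats(cogs)
forecast(tbats.fit, h = 10)

<CODE END>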

Before we can utilize these visuals in Power BI, we may need to install some R packages.  If you've been following along in this series, you'll only need the forecast package.  An earlier post in this series, Clustering, walks through this process in RStudio.  As with the previous posts, we'll be using the Customer Profitability Sample PBIX.  You can download it here if you don't already have it.  Let's create forecasts for [Total COGS] by [Date] using both of these visuals.
ARIMA (Initial)
It looks like the ARIMA visual creates a decent-looking forecast out of the box.  We'll definitely try tweaking this, but let's see what TBATS can do.
TBATS (Initial)
Ouch.  It looks like the default number of predictions out of TBATS is too large.  It's predicting every month out until July 2056.  The ARIMA model only predicted ten months.  Let's match that using the available parameters.
TBATS Forecasting Settings (10 Months)
TBATS (10 Months)

The scale of the graph is much better now, but the forecast is still pretty worthless.  As a side note, we originally wanted to use "Total Revenue" for this analysis.  Alas, we were unable to get any type of useful forecasts using that variable.

As a differentiator from the ARIMA chart, this chart type allows us to explicitly define the seasonality of our data.
TBATS (6 Month Season)
Unfortunately, there was only one combination of seasonalities that worked for this data: a seasonal period of 6 months.  This does not bode well for this forecasting algorithm, as ease of use and flexibility are supposed to be its key strengths.  Regardless, this does give us some predictions to work with.  Let's see them together to compare.
Together
Comparing these two, we see that the ARIMA forecast is more stable than the TBATS forecast.  However, it also has a much larger confidence interval.  With the forecasts we have here, it's impossible to determine how "accurate" the forecasts are because we don't know the future values.  In cases like this, it can be very helpful to use a holdout set.  This is when you build your forecasts leaving out the most recent data points so that you can compare the predicted values to the actual values.

This is where it gets tricky.  Our data starts in August 2013 and ends in November 2014.  That's 16 months of data.  This means that we would lose a huge chunk of information if we hold out too much data.  This is where a slightly more advanced technique comes into play, "one step ahead" forecasting.  Basically, we pick any historic point in the time series and remove all of the data points AFTER that point.  Effectively, we are building the time series model as we would have AT THAT POINT IN TIME.  Then, we use that model to predict the next point in time.  This mirrors how time series algorithms are utilized in practice.  To finish the approach, we repeat this technique for every point in time, effectively giving us a one-step-ahead prediction at each point.  Obviously, we can't go back too far, as time series trends change over large time periods and we don't have much data to begin with.  So, let's start by removing the three most recent months.
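Outside of Power BI, the "one step ahead" idea looks something like the sketch below, reusing the made-up cogs series from earlier; the point indices are ours and would change with your own data.

<CODE START>

library(forecast)

# For each of the last four months, fit the model only on the data available
# before that month, then predict one step ahead and compare to the actual value.
onestep <- sapply(13:16, function(i) {
  fit <- auto.arima(window(cogs, end = time(cogs)[i - 1]))
  forecast(fit, h = 1)$mean[1]
})

cbind(actual = cogs[13:16], predicted = onestep)

<CODE END>
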
Three Month Holdout Filter
ARIMA (3M Holdout)
TBATS (3M Holdout)
We definitely have a problem.  The ARIMA model technically works with a three month holdout.  However, it gets reduced to an ARIMA(1,0,0) model which, as you can see, is just a straight line.  To make things worse, the TBATS model doesn't work at all.
ARIMA (1M Holdout)
TBATS (1M Holdout)
As it turns out, we are only able to hold out a single month of data to perform our "one step ahead" analysis.  When we do this, our ARIMA model becomes the worthless ARIMA(0,0,0) that always predicts zero.  Our TBATS model is a little more useful.  We can hover over the lines to get the actual and predicted values for November 2014.
Predicted vs Actual
We see that the predicted value is 30% higher than the actual.  Obviously, this model is pretty terrible given everything we have tried so far.  So, what do we do now?

We have a few options for how to proceed.  The best idea in this case would be to increase the granularity of our dataset from months to weeks.  This would give us approximately 4.3 times as many data points to work with.  In our case, that's not possible because the data is stored at the month level.  As a next best alternative, we could lean on the time series decomposition.  In our previous post, we explored this chart type.  Since we had to pivot to use [Total COGS] instead of [Total Revenue], here's what the decomposition for [Total COGS] looks like.
Decomposition
While this chart can't make accurate predictions for us, it can provide some basic insight.  For instance, we see that Trend and Seasonality each make up about 40% of the time series structure.  We also see that the Trend was increasing until the middle of the dataset (some time near the end of 2013 and the beginning of 2014) and began decreasing after that.  We can also see that there is a three month seasonal pattern.  Since the last data point is at the lowest point of the seasonal pattern and the trend seems to have flattened out, we can estimate that [Total COGS] will likely increase slightly for the next two months before falling in the third month.  While this isn't exactly what we were looking for, it is useful information that we may be able to leverage.

Hopefully, this post showcased some of the forecasting and time series analysis techniques available in Power BI.  These techniques require very little knowledge of statistical coding, but still allow us to get some valuable insights from our data.  Stay tuned for the next post where we will discuss Correlations.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
Senior Analytics Associate - Data Science
Syntelli Solutions
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com