Monday, August 8, 2016

Power BI/PowerPivot: Using LOOKUPVALUE() to Emulate Relationships

Today, we're going to talk about a neat function in DAX called LOOKUPVALUE().  This function is pretty aptly named, as it simply returns a value from a different (or possibly the same) table.  Let's start with a very simple data model.
Sales Data
Data Model
We can see that we have two distinct transaction tables, Sales and Marketing, which roll up to a single Customers lookup table.  Pretty much all of these could be accomplished using a different function.  However, we always strive to find the easiest solution to understand and implement. That's where LOOKUPVALUE() shines.  Let's look at some options for looking up values across (or within) tables.

Retrieving a Value from the Same Table

Sometimes, you need to retrieve a value from a specific row in the table you're already in.  This is one of the great use cases for LOOKUPVALUE().  For our scenario, we want to compare the Sales for each row in the Sales table to the Sales for January for that same customer.  Let's see how it's done.
January Sales (Formula)
January Sales (Results)
All we have to do is define the value we're looking for, then a unique set of identifiers to identify where we want to look in the target table.  In this case, we're looking for the same [Customer ID], and a fixed [Month Start Date] of January 1, 2016.

Retrieving a Value From Up the Hierarchy

Now, what if we wanted to retrieve a specific value from the lookup table.  In our case, let's say that we want to pull Customer Type down into the Sales table.  We could use LOOKUPVALUE() to accomplish this as well.  However, there's a far simpler way, the RELATED() function.  The RELATED() function uses Row Context to retrieve a value from up the hierarchy.
Customer Type
The RELATED() function is far simpler than the LOOKUPVALUE() function.  All you have to supply is the field you are looking for.  However, RELATED() only works when you want to retrieve a single value from up the hierarchy and requires an active relationship.

Retrieving a Value From Down the Hierarchy

Now, what if we wanted to pull a value from the transaction table into the lookup table.  Ordinarily, this is done using aggregations and the CALCULATE() function.  However, in some cases, you want to retrieve a specific value with no aggregation.  Let's try pulling the Sales for February up into the Customers table.
February Sales
This time, we need to provide Customers[Customer ID] because we are querying from the Customers table.  The rest remains pretty simple.

Retrieving a Value from an Unrelated Table

We tried our best to find an example of this that works in a useful.  Simply put, this is VERY sticky territory.  DAX is built around utilizing relationships.  Therefore, it can become quite difficult to work without relationships.  To summarize, if you are trying to connect two tables without some type of relationship chain, you may want to reconsider your modelling.  If anyone in the comments can find a good example, let us know!

Hopefully, this post opened up your mind to the possibilities of using LOOKUPVALUE().  It's not a unique function by any means, but it sure is easy and intuitive to use.  That's worth quite a bit in our book.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer
Valorem Consulting
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, July 18, 2016

Power BI/PowerPivot: What is Row Context?

Today, we're going to talk about another major component of DAX, Row Context.  In a previous post, we talked about Filter Context, which is supplied by charts, slicers, and measures at runtime.  Row Context is the complement to Filter Context.  Row Context is used when you are creating calculated columns within your data model.  This means that they are calculated at processing time instead of at run time.  Let's start with a simple data model.
Sales Data
Data Model
We have a sales table with a row-level slicer called [Order Number] and a look-up slicer called [Category].  We also have a simple relationship between the Sales and Categories table on the [Category] column.  Let's start by adding a very simple calculated column to the Sales table.
Category 1?
This column simply returns TRUE if Sales[Category] = 1, and FALSE otherwise.  For each row in this table, the [Category 1?] calculation has a Row Context defining what row it is on.  While working within a single table without an iterator, Row Context is quite simple.  It's nothing more than the row you are on.  There are a few ways where this can get sticky.

Retrieving Values from UP the Hierarchy

Let's start by looking UP the hierarchy, i.e. from the Sales table up to the Categories table.
Category Type (Invalid)
If we try to simply pull the value down, we get an error.  This is because, unlike Filter Context, Row Context does not move across relationships.  We have to use specialized functions to accomplish certain tasks.  If we want to pull down a slicer like this, we need to use the RELATED() function.
Category Type (Related)
The RELATED() function is able to pass Row Context up the hierarchy, allowing you to retrieve values from lookup tables.  This is one of the more common ways that we use to pull data across tables.

Retrieving Values from DOWN the Hierarchy

Next, let's try to pull a value from DOWN the hierarchy, i.e. from the Categories table down to the Sales table.  Since we have multiple rows in the Sales table related to a single row in the Categories table, we're going to need to aggregate the values.  Let's try to pull up Total Sales.
Total Sales (Invalid)
Well, this is interesting.  When we first started using DAX, we always assumed that using calculations like this would work just like a pivot table.  If we built a pivot table with Category and Category Type as the columns and SUM( Sales[Sales] ) as the measure, this would work.  However, we now know that pivot tables work off Filter Context, while calculated columns work off Row Context.  It's important to remember that Filter Context can move across relationships, while Row Context cannot.  So, we need a way to turn this Row Context into Filter Context.  This concept is called "Context Transition".  Luckily, this is one of the many uses of the CALCULATE() function.
Total Sales (Calculate)
Voila!  The answer magically appears.  Well, not magically, but it's still pretty cool.  Let's move on to the final method.

Iterators

Iterators are a huge category of functions in DAX that can be used to solve almost any problem.  We could easily do an entire blog series on iterators.  So, we're just going to touch on a very simple iterator to showcase a little more about Row Context.  Let's say that you wanted to create a new column in your table for Running Sum of Sales.  You can order this running sum however you want.  For our purposes, we'll order it by Sales[Order Number].  This means that [Running Sum of Sales] for Order Number 1 is just the sales of Order 1.  The [Running Sum of Sales] for Order Number 2 is the sum of Order Number 1 and Order Number 2's Sales.

Before we can start looking at other rows, we need to be able to escape the Row Context of our current row.  We can accomplish this with the ALL() function.  The ALL() function simply removes all filters applied to a calculation.  Remember that once we wrap our calculation in the CALCULATE() statement, our Row Context becomes Filter Context, meaning that it can be escaped using ALL().
Running Sum of Sales (Incomplete)
We can see that we've removed all Row (and Filter) Context from this calculation.  Next, we need to add more logic so that it only looks at rows with Order Number less than or equal to the Order Number of the current row.  But, how do we access the Row Context of the current row now that we've removed it.  This is where the EARLIER() function comes in handy.  The EARLIER() function can look back at what your Row Context used to be.  This means that we create a filter statement using this original Row Context.
Running Sum of Sales
Now, our calculation is complete.

Hopefully, learning about Row Context has opened your eyes to some of the more interesting possibilities within DAX.  Keep your eyes peeled for more posts on the amazing world of DAX.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer
Valorem Consulting
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, June 27, 2016

Power BI/PowerPivot: What is Filter Context?

Today, we're going to talk about filter context in DAX.  This feature is incredibly simple to understand, yet incredibly powerful.  Before we get started, let's look at a simple data model.
Sales Data
Data Model
As you can see, we have a table of sales data associated to categories, as well as a category table giving us additional information about each category.  Now, let's look at a simple chart using Sales[Category] (Category from the Sales table) and Sales[Sales].
Sales by Category (from Sales)
When you create this chart, you are looking at Filter Context.  Let's look at a single measure value in the table.
Sales by Category (Filter Context)
When we look at this one value (11.76), we have to think about what filters are applied to it.  Since we have a very simple pivot table, and no external slicers or filter, the only filter applied to this value is "Sales[Category] = 1".  Similarly, we can say that the filter context for row 2 is "Sales[Category] = 2".  Let's add another layer to the pivot table.
Sales by Category and Order Number (from Sales)
Here, we have a value with a filter context of "Sales[Category] = 1 AND Sales[Order Number] = 6".

EDIT:  The above filter context should be "Sales[Category] = 1 AND Sales[Order Number] = 8".  Thanks to Benjamin for pointing this out.

 Let's flip this pivot around to see if the numbers change.
Sales by Category and Order Number (Pivotted)
Even though the chart looks different, the filter context remains exactly the same.  Therefore, the value doesn't change.  Let's take this one step further and look at Sales by Categories[Category].
Sales by Category (from Categories)
These are the same exact values we saw earlier.  This is because we have a relationship defined between the two tables.  So, we can say that the filter context is "Categories[Category] = 1", which is the same as "Sales[Category] = 1".  It's very important to note that Filter Context can move across relationships.  What would happen if we tried to go the other way?  Can we apply a filter on the Sales table and have it affect a measure from the Categories table?  First, we need to create a measure in the Categories table.
Categories
Now, we can place this in a Pivot table with Order Number, which is DOWN the hierarchy from Categories.  If the filter context is unused, then Categories should return 4 for every row because there are 4 total rows in the Categories table.  Let's see what happens.
Categories by Order Number
Interestingly, we can use filter context in this direction as well.  The one caveat here is that the measure is no longer additive.  Notice how the "Total" row shows 4 categories?  This is the correct number of Total Categories, but the sum of the Categories column far exceeds 4.  Let's try something else.  Let's add another row to the Categories table for Category 5, even though there are no rows in the Sales table to link to it.
Categories (New)
Now, what will happen if we place categories on a pivot table with Sales[Category]?
Categories by Category (from Sales)
Obviously, we would expect the Categories measure to return 1 for each category.  However, the pivot doesn't show Category 5 because we're using Category from the Sales table, which has an incomplete list of Categories.  This is one of many reasons why it's generally better to use slicers from your lookup table if possible.  

Now that we've seen how filter context is applied via simple charts, let's add a page-level filter to the mix as well.
Sales by Order Number (with Filters) (No Selection)
Since we have not selected anything in the filters, the filter context for the highlighted value is "Order Number = 1".  Let's make a selection.
Sales by Order Number (with Filters) (Category Type = A)
Now, the filter context is "Order Number = 2 AND Category Type = A".  Since Filter Context is the combination of ALL filters applied to the measure, we can get the same result by using visual interactions instead of filters.  (FYI, a slicer in Power BI is technically a measure-less chart.  This means that a slicer interacts with a chart in the same way that two charts interact with each other.)
Sales by Order Number (with Interaction) (Category Type = A)
The final type of filter we want to talk about are defined directly in the measure definition.  The CALCULATE() function can be used to calculate a measure, using a set of pre-defined filters.  For instance, we can create a measure called "Sales for Category 1" that gives us the total sales for only Category 1.
Sales for Category 1
Sales for Category 1 (Chart)
By looking back at the earlier charts in this post, you can see that this value is correctly showing the total sales for Category 1.

We hope this post helped you on the way to understanding how filter context works and how you can exploit it to solve some of your problems.  We've barely scratched the surface here and there's plenty more to talk about.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer
Valorem Consulting
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, June 6, 2016

Power BI/PowerPivot: Using Slicers to Choose Which Measure to Display

Today, we're going to talk about how to use Slicers to choose which measure you want to display in your Table or Chart.  This is probably one of the coolest and easiest features to implement, yet it completely opens your reporting world.  It easily allows the user to select which measure they want to display, without you having to build multiple tabs or charts.  Let's get to it!

We'll start off by building a very simple data model.
Sales Table
We start with a table of orders.  Each order has a few slicer columns (Date, Month, Category) and a couple of measure columns (Sales, Profit).  We can easily pull this into Power BI Desktop and build some simple charts.
Sales Charts
Now, this is great if we only want to show Sales.  However, what if we want to allow the user to select whether they want to see "Total Sales", "Average Sales", or "Total Profit"?  We can achieve this with a little slight-of-hand.
Create Table
Let's start by creating a table with a single column called Measure.  Remember that slicers/filters can only be creating by using existing columns in your data.  When you select a value in the slicer, it filters the underlying table, as well as any tables further down the chain of relationships.  In our model, we don't have any relationships.  Therefore, when we select a value in the filter, nothing happens.
Unconnected Filter
Now, we need to create a measure that changes based on the value of the filter.  This is where the SWITCH() function comes in handy.  The SWITCH() function is nothing more than a simple way to write long IF-ELSE chains.  In this scenario, we want to check whether [Measure] is "Total Sales", "Average Sales", or "Total Profit" and return the appropriate measure.
Invalid Switch
We get an error.  Since there is no relationship between these two tables, Power BI doesn't know how to choose which of the three values in the [Measure] field to use.  There are a couple of ways to resolve this, and they all depend on how you want the measure to act when multiple values are selected in the filter.  The simplest method is simply to wrap Table2[Measure] in a FIRSTNONBLANK() function.  FIRSTNONBLANK() functions very similarly to MIN().  However, MIN() only accepts numeric data, where FIRSTNONBLANK() can accept any data type.
Selected Measure (FirstNonBlank)
Now, all we have to do is rebuild our charts, replacing [Sales] with [Selected Measure (FirstNonBlank)].
Dynamic Measure (FirstNonBlank) (Average Sales)
Since all of the values are between 7 and 8, we know that are looking at Average Sales.  This is what we would expect given that we have chosen Average Sales in our slicer.  However, what would happen if we selected all values in the slicer?
Dynamic Measure (FirstNonBlank) (No Selection)
We see that the charts still display Average Sales.  This is because Average Sales is the first non-blank value in the Measures column (text fields are sorted alphabetically).  What would the charts look like if we selected the other measures?
Dynamic Measure (FirstNonBlank) (Total Profit)
Dynamic Measure (FirstNonBlank) (Total Sales)
We can see that this methodology works fine when an individual measure is selected, but works somewhat awkwardly when more than one is.  Depending on which media you are using to house your reports (Power Pivot via Excel, Tabular via Excel, Power BI Online, Power BI Desktop, etc.), you may have the option of not allowing the user to select more than option in the slicer.  This would be our preferred option.  However, Power BI Desktop doesn't seem to have that option.  So, let's shoot for the next best thing.
Selected Measure (HasOneValue)
The HASONEVALUE() function will return TRUE if the field contains only one value after all filters are applied and FALSE otherwise.  Therefore, wrapping our original Switch inside of an IF() statement with a HASONEVALUE() condition will allow us to control what happens when the user doesn't select a single value.  In this case, we want the charts to go blank.
Dynamic Measure (HasOneValue) (No Selection)
Dynamic Measure (HasOneValue) (Average Sales)
We hope this enlightened you to the possibilities for Dynamic Measure Switching in DAX.  The possibilities of this technique are nearly endless.  Any type of information that can be conveyed via measures, can be altered with this technique.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer
Valorem Consulting
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com

Monday, May 16, 2016

Power BI/PowerPivot: Using Power Query to Reformat a Table

Today, we're going to expand on our previous post.  In the previous post, we built a simple data model using a sample data set.  However, there were a few situations that we cheated through for simplicity's sake.  Today, we're going to reformat the Users table using Power Query.
Users (Original)
 The original users table had 4 rows for the 4 different managers with their corresponding regions.  It also had 4 additional rows for the CEO, Pat.  This formatting would cause huge problems if you were to join or link to this table.  In fact, Power BI prohibits us from doing this.
Cannot Create Relationship
If we were to somehow force Power BI to create this relationship, which we're not even sure we can do, it would cause headaches when we tried to use these fields in our charts and calculations.  We could create a pivot table that would correctly show the sales for each of the 4 managers.  It would also correctly show the sales for Pat as the sum of the sales for all the managers.  However, our total sales should be the sales for Pat; but, Power BI would actually show us twice that as our total sales because it would be summing the 4 managers' sales with Pat's sales to get the total sales.  This is called a "Many-to-Many" relationship and needs to handled in a very different way.  Alas, that's not the purpose of this post.  We want to reformat this table so that it has 4 unique rows (1 for each region), with a column for the Manager of that region, and a column for CEO (which is Pat).  In order to do this, we need to use Power Query to edit the data as it's coming in.

Edit Queries
We begin by clicking the "Edit Queries" button in the Ribbon.  This opens up the Power Query window.
Users (Bad Headers)
The first thing we notice is that the headers did not come through properly.  We can easily fix that by clicking the "Use First Row as Headers" button in the Ribbon.
Use First Row As Headers

This fixes the headers, but we still have the issue with Pat's rows.
Users (Good Headers)
Let's start by duplicating this query by right-clicking on Users in the left Query panel, and selecting the "Duplicate" option.  This will create a Users (2) table that is identical to the Users table, for now.

Next, let's summarize the Users (2) table to show the Number of Regions per Manager.  We can accomplish by clicking the "Group By" button the Ribbon.
Group By
Next, we choose to create the "Number of Regions" column via grouping by "Manager" and using the "Count Rows" calculation.
Number of Regions
Now, our Users table shows the Number of Regions for each Manager.
Users (Number of Regions)
Now, we need to be able to identify who the CEO is and who the regular managers are.  This table is great for that because we know that anyone with a "Number of Regions" equal to 1 is a manager, while the user with "Number of Regions" equal to 4 is a CEO.  Next, we need to join/merge these two tables together.  We can do that by clicking the "Merge Queries" button in the Ribbon.
Merge Queries
Now, we need to select the table we would like to join to, as well as the column we would like to join on, and the type of join we would like to perform.  Since we only have 2 tables, we want to join Users to Users (2).  We also have to join on Manager because that is the only column they have in common.  Finally, we have quite a few options for which join we perform.  Since we created the Users (2) table from the Users table with no filters, we know that every manager in Users is also in Users (2), and vice-versa, that means that Inner Join, Left Outer Join, Right Outer Join, and Full Outer Join will give us the same result.  For simplicity, let's just choose Inner Join.
Merge
Now, we have the information from Users (2) in the Users table.
Users (Merged, Dirty)
Currently, we can't see the information that was merged into this table.  So, we need to expand the "Table" by clicking on the "Double Arrow" at the top of the NewColumn and selecting the "Number of Regions" checkbox in the "Expand" panel.  Also, it's cleaner if you deselect the "Use Original Column Name as Prefix" option.
Expand Number of Regions
This leaves us with a table that looks more traditional.
Users (Merged, Clean)
Now, we have a way to identify who is a CEO and who is a Manager.  So, let's create a calculated column that shows that.  We can do this by selecting the "Add Custom Column" button in the "Add Column" Ribbon.
Add Custom Column
Next, we need to define our calculation.  The language behind Power Query is formerly known as "M".  You can find a formula library here.  We want to create a calculation called "Level" that returns "CEO" if "Number of Regions" equals 4, and returns "Manager" otherwise.
Level
This calculation adds the "Level" column to the Users table.
Users (with Level)
Here's where we get to the tricky part.  We want to transform this table so that it contains a column for Manager and a column for CEO.  The manager column should have the appropriate manager based on the region, while the CEO column always has Pat.  We can accomplish this by selecting the "Level" column, then clicking the "Pivot Column" button in the "Transform" Ribbon.
Pivot Column
Now, we've already told Power Query that we want to create columns based on the values in the "Level" column.  However, we still need to tell it what values should go in the new columns.  We want the new columns to take the non-aggregated values from the Manager column, i.e. the Manager Names.
Pivot Level with Manager
Now, our table has quite a few holes.
Users (with Holes)
Most programming languages, including SQL, will aggregate values when you perform a pivot to remove holes like this.  Unfortunately, Power Query does not.  There are a number of ways we could accomplish this.  We think the easiest way is to select the "CEO" column, then select the "Fill Up" button in the "Transform" Ribbon.
Fill Up
This will fill in the gaps in the CEO column.  We could use "Fill Down" to fill the gaps in the Manager column as well.  However, we don't need to.  We're only interested in having a single row for each Region, with the corresponding Manager and CEO listed.  Looking down the list, we see that we already have these 4 rows available.
Users (Filled)
So, all we have to do is filter the "Number of Regions" column to only include 1.  We can do that by clicking the Triangle at the top of the column, and deselecting the "4" box.
Filter Number of Regions
Now, our table looks almost ready.
Users (Filtered)
All that's left to do is the remove the "Number of Regions" column because we no longer need it.  We can do that by selecting the "Number of Regions" column and clicking the "Remove Columns" button from the "Home" Ribbon.
Remove Columns
Now, our table is complete and we can use it to further our analysis.
Users (Clean)
The interesting thing about this procedure is that it can likely be done in a myriad of other ways.  Power Query even has an advanced editor where you could probably code an approach to handle this automatically.  We definitely didn't take the easy way out for accomplishing this.  To see that, check out our previous post.  All in all, we feel this approach is pretty flexible, but lacks some of the more advanced features.  For instance, what if there were 3 levels of hierarchy in our instead of just the 2 we had here (Manager and CEO)?  Could we create some sort of iterative process to accomplish this?  Maybe someone in the comments can help us out.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
BI Engineer
Valorem Consulting
@BreakingBI
www.linkedin.com/in/bradllewellyn
llewellyn.wb@gmail.com