Monday, August 26, 2013

Calculating Top and Bottom N% of Customers

Today, we will be calculating top and bottom N% of customers.  We had a similar post about this back in June, Top N% Filters in Tableau.  However, this post has two purposes.  First, we cleaned up the display and functionality to be more enticing to business users.  Second, we would like to raise awareness about our webinar series on Table Calculations in Tableau.

The third and final webinar in the series will be held Thursday, August 29 at Noon EDT.  You can sign up here.  The webinar will focus on using Table Calculations to answer any questions we can come up with.  Feel free to submit scenarios to us via email.  The only requirement is that the scenario be able to be replicated using the Superstore Sales sample data set in Tableau, which is also what we will be using for this post.  This example appeared in the last 10 minutes of so of the session 2 webinar, which can be found here (once it gets published).

Step 1:
  • Create a chart of SUM( [Profit] ) per [Customer]
  • Sort the [Customer]s in descending order of SUM( [Profit] )
Profit by Customer
Step 2:
  • Calculate each [Customer]'s rank in the ordering, i.e. the index.
  • Create the following calculated field.
  • Force "Compute Using" to use [Customer]
Customer Rank
Now that we know what each customer's rank is, we need one other value to calculation top/bottom N%.  We need to now the total number of customers.

Step 3:

  • Calculate the total number of [Customer]s
  • Create the following calculated field.
  • Force "Compute Using" to use [Customer]
Number of Customers
Now, let's make sure that these calculations are working as we intended.  The Labels Shelf is great for this.
Customer Ranking Test
These values are exactly what we were looking for.

Step 4:

  • Create a parameter to store our N%.  For ease of use, 5% should be 5, not .05.  We will handle the mathematics inside the calculation.
Top/Bottom N%
Here, we could just calculate a simple True/False filter for the chart.  However, let's get slightly more advanced.

Step 5:

  • Create a field that labels the Top N% as "Top N%" and the Bottom N% as "Bottom N%", with the value of N placed into the field.
  • Create the following calculated field.

Top/Bottom N% of Customers
This field may look complex, but it only does two things.  First, if the customer is in the Top N%, it labels them "Top N%".  If that fails, then if the customer is in the Bottom N%, it labels them "Bottom N%".  The rest is just syntax to accomplish this.  You may wonder why there is a blank space before Top, but not before Bottom.  This is due to the fact that there are only two ways to sort string fields in Tableau, data source order and manual.  The data source order in this case is alphabetical, which would put Bottom before Top.  Therefore, we added a space to make Top come before Bottom.

Step 6:

  • Add this field to the Rows Shelf before [Customer]
  • Filter the chart using this field
Be careful when you do this.  Since the labels change via a parameter, you may run into issues if you choose to filter on the fields directly.  Instead, choose to exclude NULL.  This is also the same reasoning why we couldn't use Manual Sorting to place Top before Bottom.
Top/Bottom N% Filter
Finally, you have a pretty chart which shows the user which Customers are profitable and which ones are not.
Top/Bottom N% Chart
There's so much more to do here.  Try allowing the user to filter for Top 2% and Bottom 1% (Hint: use multiple parameters).  We hope you found this informative.  Thanks for reading.

P.S.

Remember to sign up for the webinar!  There will be plenty of cool stuff to see.

Brad Llewellyn
Associate Data Analytics Consultant
Mariner, LLC
llewellyn.wb@gmail.com
https://www.linkedin.com/in/bradllewellyn

17 comments:

  1. Hi there, is there any way to display top 5 and bottom 5 Customers instead of displaying it in percentage?

    ReplyDelete
    Replies
    1. Absolutely, the code is much simpler.

      IF FIRST() > -5 THEN " Top 5"
      ELSEIF LAST() < 5 THEN "Bottom 5"
      END

      Delete
  2. Thank you so much for this blog post. In the visualization that I am working on I wanted to show top N and bottom N locations on a map and your tutorial is exactly what I needed.

    ReplyDelete
  3. Brad, the above is just the stuff i needed. One modification query. What if I add another dimension, like, I need Customer and then Country in the rows. I tried to play around with the formula, but not able to. Can you please help!!!

    ReplyDelete
  4. Brad, the above is exactly what I needed. Any help on the below 2 queries?
    1) How to get the top/bottom based on Profit, and not customers.Eg: Top 30% of profits
    2) When we add a second dimension, like add Country after Customer name in Rows, the numbers go haywire.Any way to make it work/

    ReplyDelete
    Replies
    1. Dheer,

      Thanks for commenting! Perhaps I'm misunderstanding your question. This post allows you to find the top/bottom based on Profit. Could you rephrase that question? As for your second question, if you start adding more dimensions, you need to account for the "Compute Using" feature. You can refer to the following link for more info.

      http://breaking-bi.blogspot.com/2013/07/introduction-to-table-calculations.html

      Thanks,

      Brad Llewellyn

      Delete
  5. Brad,
    I was so happy of finding this solution, that i posted my queries without researching a bit. I have my answers now.

    Thanks for taking the time to put up these very informative blogs.

    ReplyDelete
  6. Hi Brad - Great tutorial...definitely helpful!
    I've couple of questions:
    (1) Assuming the profits (or in my case scores) are always positive (>0), how to show the Bottom N as negative or on negative axis?
    (2) How to color code the displayed Top/Bottom customers based on categories/regions?

    Thanks,
    Jag

    ReplyDelete
    Replies
    1. Jags,

      Thanks for commenting.

      (1) Once you complete everything shown above, you add one additional step. Instead of using Score in the chart, you use a calculated field

      IF LEFT( [Top and Bottom N%], 1 ) = "B"
      THEN -[Score] ELSE [Score] END

      (2) I colored my bars based on profit because it made it look nicer, you are free to use anything you want to color your bars, it will not affect the way the chart is calculated.

      Delete
    2. Hi Brad,

      Is there a way to choose Top N values from the drop down. For example, when we select Top 50, dashoboad should show only Top 50 values etc. User should have a drop down to select Top N values i.e 50 to 1000 from the drop down. Please suggest me how to achieve this.

      Thanks in advance.

      Delete
    3. Janu,

      Thanks for commenting! This post should help you achieve that.

      http://onlinehelp.tableausoftware.com/v8.1/pro/online/en-us/parameters_filters.html

      Delete
    4. Brad,

      Thanks for your quick response. I am able to show Top N values in the "Slider" or "Type In" with 10 step size. Is there a way we can have a Drop down for Top N with List of Values like 50,100,150,200 etc.

      Thanks in advance.

      Delete
    5. Janu,

      Edit the parameter and change "Allowable values" to List

      Delete
  7. Brad,

    Thank you very much. I got it what exactly I required.

    ReplyDelete
  8. Brad,

    Great work. Is there a way to compare the aggregate of the bottom or top N% vs an individuals sales or score? I seem to run into trouble with this.

    ReplyDelete
    Replies
    1. Chris,

      This is certainly a possibility. Let's assume that you have successfully flagged Top N% like we did in the above post. Now, if you use another window function, like WINDOW_AVG(), you can aggregate those Top N% values. Next, you compare a particular value to the WINDOW_AVG() value like follows.

      [Value] - WINDOW_AVG( IF [Top N%] THEN [Value] END )

      Does this help?

      Delete