Wednesday, February 20, 2013

Creating a Top N filter with an "Other" Category in Tableau

Today, we will create a Top N Filter that also displays an "Other" category.  This can be extremely useful when you want to see how the Top N compare to the rest of the population.  As usual, we will use the Superstore Sales sample data set in Tableau.

Step 1:

  • Create the following calculated fields:
Determines which values are in the Top 20

Determines which value will start the "Other" category

Returns true for the Top 20 and the first value in the "Other" Category

Returns SUM( Sales ) for the Top 20 and returns the remaining sum for Other

Return Customer for the Top 20 and Other for the 21st
Step 2:
  • Rows Shelf: "Customer", "Top 20 Customer Label"
  • Remove Header for "Customer"
  • Text Shelf: "Top 20 Sales"
  • Filters Shelf: "Top 20 + Other" = TRUE
  • Sort "Customer" Descending by Sum of "Sales"
  • Edit Table Calculation "Top 20 Sales" using "Table (Down)"
Top 20 Customers by Sales with Other
Now, you have created a top 20 filter with an "Other" category.  As you may have noticed, this is free of any dimension constraints (other than the label).  So, you can drag any other dimensions you want onto this graph as long as you create a new label for it.  Also, you can change your measure by altering "Top 20 Sales" and changing the measure that you sort by.  Try using this with more complicated metrics, or adding a parameter.  There's so much more to do here, have some fun.  I hope you found this informative.  Thanks for reading.

Brad Llewellyn
Associate Consultant
Mariner, LLC
llewellyn.wb@gmail.com
https://www.linkedin.com/in/bradllewellyn

29 comments:

  1. Brad, another great post.
    However since tableau 8 is out there, is there an easier way using sets maybe to accomplish the same task?

    ReplyDelete
    Replies
    1. Looking at the "Sets" menu in Tableau 8, it appears to be identical to the one in Tableau 7. This would lead me to believe this is still not possible with Sets. However, if you are any of the other readers know of a way, let us know!!! Thanks.

      Delete
  2. It's a great post! I'm trying to draw a pie chart with these data but I can't. Is that impossible or Is there any problem in my data set?

    ReplyDelete
    Replies
    1. It's a little more involved in a pie chart, but still possible.

      http://speedy.sh/dRph4/Book2.twbx

      Thanks,

      Brad Llewellyn
      Associate Consultant
      Mariner, LLC
      brad.llewellyn@mariner-usa.com
      http://www.linkedin.com/in/bradllewellyn
      http://breaking-bi.blogspot.com

      Delete
    2. "Angle" mark doesn't appear in my workbook. Where can I get it from?

      Delete
    3. I used

      IF [Top N?] THEN SUM( [Sales] )
      ELSEIF [Other?] THEN WINDOW_SUM( SUM( [Sales] ), 0, LAST() )
      END

      With Compute Using on "Sub-Category"

      Delete
    4. Thank you, but I cannot apply pie chart neither in my workbook nor yours.
      I cannot get the same results with your pie chart. 'Top N other' isn't working. All brands(or categories in your workbook) are apparent.
      Could you please explain steps that you follow?
      Thank you very much

      Delete
    5. 1) Create "Top N?"

      INDEX() <= [Top N]

      2) Create "Other?"

      INDEX() = [Top N] + 1

      3) Create "Top N or Other?"

      [Top N?] OR [Other?]

      4) Create "Top N Sales"

      IF [Top N?] THEN SUM( [Sales] )
      ELSEIF [Other?] THEN WINDOW_SUM( SUM( [Sales] ), 0, LAST() )
      END

      5) Create "Top N Sales (% of Total)"

      [Top N Sales] / WINDOW_SUM( [Top N Sales] )

      6) Create "Top N Sub-Categories"

      IF [Top N?] THEN ATTR( [Sub-Category] )
      ELSEIF [Other?] THEN "Other"
      END

      6) Filter on "Top N or Other?" = TRUE

      7) Add "Top N Sub-Categories" to the Colors Shelf

      8) Add "Sub-Category" to the Details Shelf

      9) Add "Top N Sales" to the Angles Shelf

      10) Add "Top N Sales (% of Total)" to the Labels Shelf

      Delete
    6. Iam using essbase as my data source, In this case how do i create
      Top N% since im not having SUM function in Multidimensional(essbase) database.

      Delete
    7. Addy,

      Thanks for commenting. If you are connecting to a multidimensional source, you should be using basic aggregations, SUM(), COUNT(), etc. The source does all of that work for you. However, you still have access to the table calculations, WINDOW_SUM(), WINDOW_COUNT(), etc. Does this make sense?

      Delete
    8. yeah thanks Brad. I got it solved.
      I have another problem in creating Forecast reports, where
      'show forecast' option is disabled in my case (for multidimensional source).
      I could not make use of tableau's forecast features unless I change the data source to excel,csv or anything else other than Multidimensional source.
      I have tried doing all the possibilities but could not just get it done.
      (Currently am taking the average of all the previous years sales amount and displaying it as the next year forecast.)
      can u please suggest me some ideas to resolve it.

      Delete
    9. Addy,

      I don't think Tableau's built-in Forecasting options work against a multidimensional source. You have a few options here.

      1) Put some sort of forecasting algorithm in the data source as another measure.

      2) Create some type of table calculation (like you're doing now)

      3) If you're using Tableau 8.1, you can use R to create forecasts for you. Check out this blog series on the topic.

      http://breaking-bi.blogspot.com/2014/02/predictive-analytics-in-tableau-part-6.html

      Delete
  3. How does the approach change if you want to show several months and want to rank each month? For what I am working on, I can get the rankings to work within each month, but writing the "Other" segment and getting it to work for each month is something I can not figure out.

    ReplyDelete
    Replies
    1. If I understand your question correctly, then you will need to change the Compute Using of the table calculations in order to account for Month as well. If you need more assistance than that, would it be possible for you to email me a mock-up or .twbx file that I could look at?

      Delete
    2. Thanks, I will play around with it more, but may contact you should I still not be able to figure this out.

      Delete
  4. Brad - this is very helpful. I also would like to do a Pie chart, but the twb file is not available anymore. Can you repost or point to a link describing? Thanks!

    ReplyDelete
    Replies
    1. Thanks for commenting! In the last step, when you are setting the "Compute Using", you should change it from "Table (Down" to the field you are slicing on, "Customer" in my example. For more information on Compute Using, you can check out

      http://breaking-bi.blogspot.com/2013/07/introduction-to-table-calculations.html

      Delete
  5. Brad, I know this is an old post and I do use sets now to accomplish the same thing, but I have one question on this method: It is the index() = 21 for Other. To me it is not intuitive that Other would capture customers 22, 23, 24 etc because it is set to 21 or less on the filter shelf. I see that calculated field used window_sum to the last row, but if the index filter is limited to 21 how can it get to the last row of data (or customer)? Is there an easy way to explain what is occuring?

    ReplyDelete
    Replies
    1. John,

      Thanks so much for commenting! The technique revolves around a single rules. ALL Table calculations are calculated at the same time. This means that the WINDOW_SUM() and the INDEX() are calculated together. After all of those calculations take place, the data set is filtered on INDEX(). This means that you can calculate a value, WINDOW_SUM() in this example, before the data is filtered. Does this make sense?

      Delete
  6. Just wanted to say thanks for this useful post. I did not get the results I wanted when following Tableau's post about using Sets because they rely on a Top N filter which, from what I understand, occurs before any other filters. This post here avoids that problem (except when using a filter on a secondary blended data source, it seems.)

    ReplyDelete
    Replies
    1. Geoff,

      Thanks for the comment. Glad to hear that you found it useful.

      Delete
  7. Hi Brad and everyone
    ,
    I have a little different requirement.
    Considering the superstore dataset, I want to list of top 20 customers by sales where product is furniture and compare sales of these 20 people in technology.
    So, there can be cases where someone only have sales for furniture so technology for that guy will be null.

    https://public.tableausoftware.com/views/TopBottom20_2/Sheet4?:embed=y&:display_count=no

    Please see the attached link. It has top 20 people in furniture. For these 20 people I want their sales for technology.

    -Nishant

    ReplyDelete
    Replies
    1. Nishant,

      Thanks for commenting. One of the easier ways to do this is to create two calculated fields.

      SUM( IF [Category] = "Furniture" THEN [Sales] END )

      and

      SUM( IF [Category] = "Technology" THEN [Sales] END )

      Then, you can create follow the same methodology to create a top 20 filter for each category individually. Then, you can combine then like

      [Top 20 Furniture] OR [Top 20 Technology]

      and filter on that value. That would show you the top 20 people for each category. Does this help?

      Delete
  8. Hi Brad,

    Thanks for your helpful posts. Is there a way to add the count of customers making up the other bar to the label? For example, I'd like the label to say others(100) if there were 100 more customers.

    ReplyDelete
  9. Hi Brad,

    Love to see your apply your trick on my solution and its working perfectly. but i want to obtain percentage of Top 20 and others. please share your idea with us. I just want to modify your trick and don't want to create any other calculated field or parameter.

    Thanks in Advance!!!

    ReplyDelete
  10. Thanks Brad! The perfect solution for my enterprise wide Tableau reporting application that is used by over 500 people. I've been trying to figure out how to do this for years. Your solution continues to work even when I added additional metrics and dimensions. Thanks again, David

    ReplyDelete
  11. Thank you Brad for this post! It helped me a lot! :)
    I have an issue..when I want to see in the Chart only 'Others' row, right click on others and choosing 'Keep only', it will show me one row, and the name of the value which was on the rank()=21 place and value for that Customer (as it is now on the first place and enter into filed of Top 20). Do you know how to keep "Others" and values for Others?

    ReplyDelete
  12. can this trick work (top N + Others) within a category, eg. top 20 customer and others for each State?

    ReplyDelete