Monday, November 25, 2013

Tableau vs. Power Pivot Part 11: Introductory Mapping

Today, we will talk about mapping in these tools.  Mapping is extremely important because it allows the user to see geographic patterns that may be missed if you were to use a more traditional chart type, such as a bar graph.  We will use a variety of sources in this post to illustrate the effective use of each of these types of charts.  For those of you that read our previous post on this topic, Tableau vs. Power Pivot Part 2: Basic Charting, you will remember that Power Pivot does not have any built-in mapping capability.  We also showed that Power View's mapping is somewhat lack-luster.  Therefore, we will introduce the final piece of Microsoft's "Power BI" toolkit, Power Map.  Power Map, formerly GeoFlow, is a new add-in for Excel 2013 that allows users to map their data in a variety of different ways.  The question remains, "How does it compare to Tableau?"  Let's find out.

Map 1: Choropleth

A Choropleth, a.k.a. Filled Map, is a map that is separated into distinct regions, where each region is colored according to a numeric value.  So, let's look at Profit by State, using our Superstore Sales sample data set from Tableau.  First up is Power Map.
Choropleth (Power Map)
As you can see, this map looks pretty good.  Power Map automatically connected to the underlying Power Pivot source, which could have been a big hassle.  Our big concern is that we had to look individually at 8 different background themes to get one that didn't overshadow the data.  On this same note, the map is filled with only one color, which make analyses on something like profit somewhat more difficult.  Let's see how Tableau does.
Choropleth (Tableau)
Tableau seems much cleaner to us.  The color scheme is much better for this case, primarily because Montana had a negative value, which we never saw in the Power Map version.  As far as ease of use goes, Tableau was simply a drag-and-drop experience, while Power Map forced us to look through a bunch of different styles before we found one that looked decent..  Therefore, we have to give this one to Tableau.

Winner: Tableau

Map 2: Heat Map

A Heat Map is a map that is not separated into distinct categories, yet is colored by how many values appear close to each location.  It is very good for finding "hot spots" among countable data.  So, let's look at John Snow's Cholera data.  For those of who have never heard of this map, it is arguably one of the most influential maps of all time.  Feel free to Google it if you are curious.  First, let's see how Power Map handles it.
Heat Map (Power Map)
This map looks really nice.  We can easily see where the deaths occurred and identify a couple of hot spots.  The default color scheme is also very nice as well.  This map did take a little bit of tinkering to get right, but it was well worth it.  Let's see what Tableau can offer up.
Heat Map? (Tableau)
To our knowledge, Tableau does have any built-in functionality for a geographic heat map.  The best we could come up with is the circle approach seen above, which really isn't any better than Power View could have done.  It still shows us the hot spots, but not as clearly as the Power Map version.  We also cannot easily distinguish when there are multiple sets of moderate values in close proximity.  This map wasn't too difficult to create, but it utterly fails in comparison to an actual heat map.

Winner: Power Map

Map 3: Journey Map

A "Journey Map" as we are calling is a map depicting the time-based travels of a person or thing.  It is great for seeing how efficient your shipping methods are or any other type of analyses surrounding that type of data.  In order to show this, we will use Napoleon's March to Moscow.  This is another very famous set of mapping data that depicts the catastrophic defeat Napoleon suffered when trying to march on Russia, originally created by Charles Joseph Minard.  We thought that this map would be great for examining the true mapping power of these tools.  This data, and the Tableau workbook used for this exercise, were originally compiled and created by Kim Rees of Information Aesthetics.  You can view her work on their website and on Tableau Public.  Now, let's see how Power Map handles this.
Journey Map (Power Map)
We tried to find any way at all of connecting these circles and we came up empty.  So, this is the best we could get from Power Map.  If we were inclined, we could replicate each of the rows in the data while slowly moving the locations so that we could fake a line.  However, that's quite a bit of work that most business analysts wouldn't have the knowledge or tools to pull off easily.  This map is ok.  It shows a general trend of the circles getting smaller as Napoleon marches to and from Russia.  Alas, the lack of a line pretty much ruins this graph's analytical capabilities.  Let's see what Tableau can do.
Journey Map (Tableau)
This map is so much more intuitive that the Power Map version.  We will concede that this map did take a little bit of Tableau knowledge to create.  However, the results are well worth it.  We can easily see how Napoleon was losing more and more troops as his journey went on.  Also, we removed the two branching routes from this view because they made the Power Map chart even harder to read.  This is what the original map looked like.
Napoleon's March on Moscow (Tableau)
It's pretty easy to see that Power Map is not very good at creating this type of map.

Winner: Tableau

Bonus!

Both Tableau and Power Map have the ability to create maps that change at regular time intervals.  These time intervals could be very short or quite long, it's up to the user.  Power Map's timing feature requires a datetime field to be present in your cube, whereas Tableau can work on any discrete field.  On the other hand, Tableau only has 3 speed settings while Power Map lets you explicitly define the duration of the show.  To be honest, we've never seen a situation where either of these features were put to good analytical use.  They are great features for "WOW"ing an audience, but not much more than that.  Also, Power Map has the ability to save virtual demos of your map as video files, which can be shared with other people.  Nevertheless, we chose not to let these features impact our experiment.

Also, an anonymous reader who works for a Canadian company pointed out that Tableau doesn't support the names of Cities/States/Provinces, etc. outside of the United States.  Unfortunately, we cannot confirm nor deny this comment.  If you know of any specifics to this situation, please let us know in the comments.

EDIT:  A different anonymous reader posted this link that shows Tableau saying that they support many different regions/states/etc. outside of the United States.  If you find that your cities is small enough that it is not in Tableau's database, then you should easily be able to geocode your city into the data file.

It should be noted that Tableau uses a static geocoding source, whereas Power Map can pull it's geocoding directly off of Bing Maps.  However, Tableau does allow the user to create custom geographic dimensions.  You can supply Tableau with the dimension values, latitudes and longitudes.  Then, you can use that geographic dimension on any chart you want.

EDIT:  A curious commentor questioned whether or not a security risk is posed when Power Map queries Bing Maps for the mapping information.  Is any of the data left vulnerable?  Could a clever snooper intercept this?  Let us know in the comments.

Summary

We saw that Tableau beat Power Map in its ability to create Filled Maps because of it's cleaner aesthetics.  However, Power Map fought back by creating a truly inspiring Heat Map that Tableau was unable to replicate.  Finally, Tableau showed its true colors by creating a Journey Map that Power Map could not compete with.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
Associate Data Analytics Consultant
Mariner, LLC
brad.llewellyn@mariner-usa.com
http://www.linkedin.com/in/bradllewellyn
http://breaking-bi.blogspot.com

Monday, November 18, 2013

Tableau vs. Power Pivot Part 10: Introductory Data Cleansing

Today, we will talk about Data Cleansing.  In short, data cleansing is the process by which you take raw data from a source and prepare it for end-user consumption.  Source data can come in all sorts of unusual formats.  The first step of the analytical process is to make sure that your data is "clean",  as it is being called.  As usual, we will use the Superstore Sales sample data set in Tableau, as well as a table from Wikipedia.

Category 1: Inconsistent Text Labels

This is an extremely common problem when combining data from disparate sources or sources with poorly regulated user entry.  In this example, we have taken the State names in our data set and changed some of them to the 2 letter abbreviation.  For example, some rows will say California, while other rows will say CA.  We want all of these rows to be categorized under the same state.  How do we do this?  First, let's see what the data looks like.
Sales by State (Dirty)
As you can see, we need some way to collapse these varying state names into the proper categories.  In order to do this, we need to create a table that maps CA and California to California, FL and Florida to Florida, and so on.
State Dimension
Now, let's see what Power Pivot can do with this.
Joining States to Orders (Power Pivot)
Technically, this is a dimension because it links directly to our Orders Table.  However, in a true Star Schema, this table would link to the State Dimension.  The process of adding dimension tables onto dimension tables is called Snowflaking.  Finally, let's check out the results.
Sales by State (Clean) (Power Pivot)
We simply use the clean states instead of the dirty ones, and Power Pivot takes care of the rest.  As you can see, there are no abbreviated states in the table on the right.  Now, let's try this in Tableau.  There are two ways we could do this.  We could use Tableau's Drag-and-Drop Joining feature or we could use Data Blending.  Since we used blending in a previous post, we will use joining here.
Excel Connection (Multiple Tables) (Tableau)
First, we need to define our Primary table, which is usually our fact table.
Add Table (Tableau)
Then, we link the primary table, Orders, to our secondary table, States.  Note that no coding was done here.  We simply selected options in the menus and clicked "Add."  Finally, let's see our results.
Sales by State (Tableau)
Just like Power Pivot, we see that we have no abbreviated state names in our list.  This was a pretty easy comparison because these tools used almost precisely the same technique.  There was only one major difference.  The process in Tableau uses upfront processing power to join the data sets, whereas Power Pivot must calculate that join every time we use that field.  However, we have no evidence to say that this process costs any significant processing power.  If this were done on an extraordinarily large scale, would it cause a serious difference is processing time?  We do not know for sure.  Alas, we cannot sway our decision based off of pure speculation.  Therefore, we are forced to call this a tie.

EDIT: Zen Master Jonathan Drummey pointed out that you can achieve similar results using "Aliases" within Tableau.  In fact, we'd recommend using this method instead of the join for a couple of reasons.  First, it doesn't require the data to be in the same source, which is required for joining.  Second, this way allows you to maintain the groupings within the Tableau workbook itself, instead of an external Excel file.  Just goes to show that with these versatile tools, there's always another option you can try.

Winner: Tie

Category 2: Pivoting

We often see data that is in a reporting style tabular format with measures across the columns.  However, there are some times when you think that data would be better represented as a single measure with an added dimension.  This is called pivoting.  Let's look at the following table headings from a data set from Wikipedia.

Crime Table Headings
As you can see, there are 7 different types of crimes listed here.  However, in it's current format, how would you aggregate certain types of crimes?  For instance, how would you get the total number of crimes?  You would have to individually type the names of every single column, then sum that total.  Heaven forbid you try to do something complex such as a standard deviation of all crimes.  That would be a nightmare to create.

So, what we really want to do is pivot these crimes into two columns, Crime and Cases / 100k, which is the unit of measure of these columns.  We could possibly do this with Excel.  However, it would be a somewhat complicated task.  Therefore, we are going to introduce another free add-in for Excel 2013 called Power Query.  Power Query specializes in data transformation and manipulation.

You might say "Why are you talking about Power Query in a Tableau vs. Power Pivot examination?"  Power Query is part of Microsoft's Power BI set, which also includes Power Pivot, Power View, and Power Map.  It's also a free tool for use with Excel 2013 and is great for handling these types of issues.  Now, let's see how we would do this.
Unpivoting Crimes
We simply select all the crime columns and right-click, then select "Unpivot Columns."  Power Query does the rest of the work for us.  After a little more cleaning, such as changing column names and types, we have the following:
Crime (Clean)
If you look on the right side of the window, you can see a list of steps we took.  We can backtrack any number of steps if we wanted to.  We could also save this query and run it again at a later time.  Now, this data set is now ready for use in our tools.  When we save this table, it will be saved as a basic Excel table.

There is a similar add-in for Excel that comes from the Tableau community.  It is commonly known as the Tableau Data Shaping Tool.  It's basic premise is to take a "Report-Style" data set and turn it into a "Table-Style" data set.  It has three basic features.  First, it can pivot.
Pivot Data (Tableau Data Shaping Tool)
It can also copy data from a pivot table, with no formatting, which can also be done via Paste Values.  Lastly, it can fill down, which can be immensely helpful
Fill Down (Tableau Data Shaping Tool)
As you can see, this tool is useful, yet very rudimentary in its design.  It is no match for the shaping power of Power Query.  But, since neither of these tools play any significant amount of favoritism towards Tableau or Power Pivot, we cannot declare a winner.  We recommend downloading both of these add-ins because they can be extremely useful at times.

Conclusion

For some of the basic data cleansing procedures, such as typecasting and adding snowflaked dimensions (The procedure in the first section), Power Pivot and Tableau are about equal.  However, once your needs get more advanced, it is advised that you look for a more powerful tool, such as Power Query.  It is important to note that Power Query publishes to Excel 2013.  Therefore, we are limited by the 1 million row limit.  If you need to surpass the 1 million row limit, you will need to look elsewhere.  Thanks for reading.  We hope you found this informative.

EDIT: Chris Webb pointed out that Power Query can publish straight to the Power Pivot data model.  This means it can bypass Excel's 1 million row limit if you need.  This is just the tip of the iceberg for Power Query.  It can do so much more!

Brad Llewellyn
Associate Data Analytics Consultant
Mariner, LLC
brad.llewellyn@mariner-usa.com
http://www.linkedin.com/in/bradllewellyn
http://breaking-bi.blogspot.com

Monday, November 11, 2013

Tableau vs. Power Pivot Part 9: Introductory Time Intelligence

Today, we will talk about utilizing Time Intelligence within both of these tools.  Time Intelligence is probably the most common category of KPIs.  Everybody wants to know how they compared against last year, last quarter, the last three months, etc.  For this reason, both of these tools have built-in functionality for these calculations.  However, we will limit this investigation to the introductory problems, as we are calling them.  In later posts, we will get into the more complex Time Intelligence situations.  As usual, we will use the Superstore Sales sample data set from Tableau.

Category 1: Three Month Moving Average

The moving average is arguably the most common time intelligence calculation.  Let's see how Power Pivot handles it.
Three Month Moving Average of Sales (Power Pivot)
We won't go into a long explanation of this procedure.  If you are interested, a colleague of ours, Javier Guillen, wrote a nice blog post about it here.  This calculated field leverages the time intelligence functions DATESBETWEEN(), FIRSTDATE(), LASTDATE(), and PARALLELPERIOD().  Unfortunately, there is no built-in function for moving averages in Power Pivot.  There's likely a way to get Excel to do this instead of using Power Pivot.  However, that would add another level of complexity to this calculation.  Also, it would be more difficult to get that data in Power View if you wanted to use it in a chart.  Alas, we have wandered.  Let's check out our final values.
Three Month Moving Average of Sales by Month (Power Pivot)
These values look good.  This calculation seems to work exactly how we wanted it to.  The major downside is the complexity of the calculation.  Next, let's check out Tableau.
Three Month Moving Average of Sales (Tableau)
This function is one of the many "Quick Table Calculations" within Tableau.  We reached this menu in four clicks, which makes this far easier than Power Pivot.  Also, this is a menu-driven process with no coding involved.  That's a huge plus in our book.  For more information about "Quick Table Calculations" check out the earlier post in this series, Tableau vs. Power Pivot Part 5: Point-and-Click KPIs.  Let's see how the output looks.
Three Month Moving Average of Sales by Month (Tableau)
These are the same numbers we saw in Power Pivot, albeit in 30 seconds as opposed to 5 minutes.  This is a no-brainer.

Winner: Tableau

Category 2: Year over Year Growth

Year over Year Growth, also known as "This Year vs. Last Year % Î”", is one of the other extremely common KPIs.  Let's see how Power Pivot handles it.
Year over Year Growth (Power Pivot)
This calculation is significantly less complex than the previous one.  It leverages the SAMEPERIODLASTYEAR() function which is designed to do exactly this.  The main thing that we didn't like was the fact that this calculation returned errors for the first year.  Therefore, we had to wrap in an IFERROR() function.  All in all, this was not a complex task.  Let's check out the results.
Year over Year Growth by Month, Year (Power Pivot)
These values look reasonable.  Next, let's compare it to Tableau.
Year over Year Growth (Tableau)
Once again, a Quick Table Calculation saves the day.  In fact, we didn't have to use any other menus this time.  All we did was click a few times and the answer popped right out at us.  Let's check out the results.
Year over Year Growth by Month, Year (Tableau)
These values are the same as the ones we got from Power Pivot.  This example is exactly like the last one.  Tableau edged out Power Pivot because of its easy-to-use Quick Table Calculations.

Winner: Tableau

Category 3: Three Month Forecast

This is a bit of a bonus category for us.  Forecasting is a newer feature that isn't present in many tools.  For quite some time, it's been a feature for the Mathematicians and Statisticians.  However, it's becoming more and more common for business users to try for themselves.  Let's see how Power Pivot can handle this.
Forecast (Power Pivot)
We could not find a built-in way for Power Pivot to do any forecasting.  However, Excel has some very powerful trend lines that can be used to forecast future values.  Our issue is that it requires a significant amount of statistical knowledge to create good forecasts using this system.  Another downside is that the range for this chart is hardcoded.  This makes updating the chart more complex.  Now, let's see how Tableau fares.
Forecast (Tableau)
Not only can Tableau's trend lines do pretty much everything that Excel's trend lines can do, except for forecasting, but Tableau 8 has a new forecasting algorithm that uses advanced statistical procedures to account for trend and seasonality.  This would not be a menu-based task in Excel.  Once again, this is an easy decision.

Winner: Tableau

Summary

Power Pivot has some very powerful Time Intelligence that allow the user to create virtually any calculation they want.  However, they do not have many built-in calculations for the more common calculations.  On the other hand, Tableau lacks the Time Intelligence functions, but has "Quick Table Calculations" which cover the majority of the KPIs you would find.  Also, Power Pivot was completely unable to offer a reusable forecasting algorithm.  Tableau's algorithm is in its early stages, but is still very useful.  The bottom line is this: If Tableau's Quick Table Calculations can handle your KPIs, then Power Pivot can't compete.  If your KPIs are more advanced than that, then perhaps Power Pivot's functions can help you out.  Thanks for reading.  We hope you found this informative.

Brad Llewellyn
Associate Data Analytics Consultant
Mariner, LLC
brad.llewellyn@mariner-usa.com
http://www.linkedin.com/in/bradllewellyn
http://breaking-bi.blogspot.com

Monday, November 4, 2013

Tableau vs. Power Pivot Part 8: Sharing

Today, we will talk about how to share your work within these two tools.  Sharing is arguably the most important component of any analysis.  An analysis is worthless if you can't share the results with the people who need to know.  So, we are going to look at One-Time Sharing, Data Publishing, and Report Publishing.

Category 1: One-Time Sharing

Imagine that you are rummaging through your data and find an interesting pattern.  So, you compile a simple report and want to share the results at the meeting that afternoon.  The question is "How do you transfer the report?"

The easiest way is to take a screenshot.  This can be done with almost any tool in any situation.  However, it adds no interactivity, which is a major selling point for newer reporting tools.  Power Pivot and Tableau both require that you give the workbook to someone else.  But, these workbooks are usually too large to email.  Therefore, you would need to post them on SharePoint, Dropbox, Skydrive or some other online storage medium to transfer the files.

Both of these tools also require that the reader have the appropriate software installed.  If you are giving an Excel workbook to someone who does not have your version of Excel and/or Power Pivot, you will likely run into serious issues.  The same can be said of Tableau.  If the reader does not have Tableau installed on their machine, they will be unable to see your report.  However, Tableau has one distinct advantage in that they can download Tableau Reader for free.  Tableau Reader allows the reader to interact with your Tableau workbook without being able to alter anything.  To our knowledge, there is no free software that allows someone to interact with Power Pivot/Power View reports.  Therefore, Tableau inches this one out.

Winner: Tableau

Category 2: Data Publishing

Imagine that you are an analyst who spent countless hours compiling all sorts of data in your Power Pivot or Tableau workbook.  One of your colleagues loves your work and says "Can I see your data?  I'd like to do some analysis of my own."  How do you handle this?

Regardless of which tool you are using, one option is to give him a copy of your workbook.  However, this is a one-time transfer and will likely cause issues with data consistency.

For Power Pivot, the better option is to publish your workbook to SharePoint.  When the workbook is published to SharePoint, anyone with the appropriate credentials can connect directly to the data using an Analysis Services or ODBC connection.  The only concern here is that SharePoint is not free.

Tableau has a very similar mechanic when you publish your data connection to Tableau Server.  You can publish any type of connection to the Tableau Server.  However, your calculated fields will be lost if you are not using a Tableau Data Extract.  If you are using an extract, you can store all of your row-level calculations in the extract itself before you publish it.  This is referred to as "Optimizing."  But, you will still lose all of your other calculated fields.  Also, if the workbook has more than one data connection that is being blended together, these connections will have to be published separately, unless you can find a way to join them into one data source.

EDIT: A reader pointed out that this paragraph was misleading.  If you create a calculated field in Tableau, then publish the data source, all of your calculated fields will be stored as metadata and must be recalculated by Tableau every time that you use them.  If you want Tableau to calculate the row-level values once and store the results in the extract, then you need to use the process known as "optimizing."

As an added bonus, Power Pivot allows many other tools to connect to its data.  All you need is an Analysis Services/ODBC connector.  Tableau, on the other hand, would require the user to view the data, then copy it into Excel, which can be quite a hassle if the data set is large.  In fact, it is impossible to copy more than a million rows into Excel.  Our point is this: Power Pivot allows users to model data then export it to another tool; whereas Tableau wants to be the end of the line.  It seems we have a clear winner here as well.

Winner: Power Pivot

Category 3: Report Publishing

Now, you've created an awesome report that is now the "Gold Standard" for reporting in your business.  Every other analyst needs to look at your report every day to see how the current metrics/KPIs are faring.  How do you accomplish this?

These tools are almost identical in how they accomplish this task.  You can publish a Power View report to SharePoint, where you can set an automatic refresh cycle.  Tableau does the same using Tableau Server.  Seeing as how this is the designed end-game for these tools, they are nearly identical in how they accomplish them.  Also, we won't get into the caching mechanisms behind these tools.  However, we will say that they both create caches that allow users the freedom to quickly view reports once they have been initialized.

EDIT: Ray M pointed out that Power BI's mobile publishing is not up to par with Tableau's.  From our experience, that's one of the selling points of Tableau is it's ability to work on most mobile devices.  We haven't heard much about successes for Power BI on that front.

Winner: Tie

Summary

We saw that Tableau barely edged out Power Pivot in One-Time Sharing because of its free reader.  On the other hand, Power Pivot demolished Tableau with its ability to share data models.  We hope you found this informative.  Thanks for reading.

Brad Llewellyn
Associate Data Analytics Consultant
Mariner, LLC
brad.llewellyn@mariner-usa.com
http://www.linkedin.com/in/bradllewellyn
http://breaking-bi.blogspot.com