Today, we will talk about creating time series analyses using Tableau 8.1's new R functionality. In our previous demonstrations, we used varying regression techniques to predict values. These techniques took into account the relationships between the variables, but not across time. Time Series Analyses approach from the opposite direction. They attempt to predict new values by looking at previous (or future) values in time, but only considering one variable. For this demonstration, we will use the Superstore Sales sample data set from Tableau.

As always, the first step is to look at the data we want to forecast. Let's look at Total Sales per Month.

Sales by Month |

Months to Forecast |

Install Packages |

This code is long and somewhat complex. So, we'll only touch on the important parts. The code for Sales (Forecast) is as follows:

SCRIPT_REAL("

library(forecast)

## Creating vectors

hold.orig <- .arg4

len.orig <- length( hold.orig )

len.new <- len.orig - hold.orig[1]

year.orig <- .arg1

month.orig <- .arg2

sales.orig <- .arg3

## Sorting the Data

date.orig <- year.orig + month.orig / 12

dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]

dat.new <- dat.orig[1:len.new,]

## Fitting the Time Series

timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1],

dat.new[len.new,2]), frequency = 12)

fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)

c(rep(NA,len.new), forecast(fit, hold.orig[1])$mean)

"

, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),

[Months to Forecast] )

First, we need to install the forecast library using library(forecast). Then, we need to sort the data by date, and hold out the last few months. How many months get held out depends on the value of our [Months to Forecast] parameter. Finally, we fit our time series as a single exponential model and output the results. Let's see what it looks like.

Sales by Month (with Forecast) |

**Sales (Forecast) (Lower 80%)**

c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,1])

**Sales (Forecast) (Lower 95%)**

c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,2])

**Sales (Forecast) (Upper 80%)**

c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,1])

**Sales (Forecast) (Upper 95%)**

c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,2])

Now, let's see what the intervals look like

Sales by Month (with Forecast and Bounds) |

Adding Empty Dates |

Sales by Month (with Future Forecast and Bounds) |

P.S.

If anybody can think of a better way to get future forecasts for dates that don't exist in the data, please let us know in the comments. Also, this method is not robust against missing data. If you're data has some missing months/week/etc., you may have to alter the method slightly.

Brad Llewellyn

Data Analytics Consultant

Mariner, LLC

llewellyn.wb@gmail.com

https://www.linkedin.com/in/bradllewellyn

Data Analytics Consultant

Mariner, LLC

llewellyn.wb@gmail.com

https://www.linkedin.com/in/bradllewellyn

**APPENDIX:**

**The code for Sales (Forecast) (Lower 80%) is as follows:**

SCRIPT_REAL("

library(forecast)

## Creating vectors

hold.orig <- .arg4

len.orig <- length( hold.orig )

len.new <- len.orig - hold.orig[1]

year.orig <- .arg1

month.orig <- .arg2

sales.orig <- .arg3

## Sorting the Data

date.orig <- year.orig + month.orig / 12

dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]

dat.new <- dat.orig[1:len.new,]

## Fitting the Time Series

timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)

fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)

c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,1])

"

, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),

[Months to Forecast] )

**The code for Sales (Forecast) (Lower 95%) is as follows:**

**SCRIPT_REAL("**

library(forecast)

## Creating vectors

hold.orig <- .arg4

len.orig <- length( hold.orig )

len.new <- len.orig - hold.orig[1]

year.orig <- .arg1

month.orig <- .arg2

sales.orig <- .arg3

## Sorting the Data

date.orig <- year.orig + month.orig / 12

dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]

dat.new <- dat.orig[1:len.new,]

## Fitting the Time Series

timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)

fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)

c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,2])

"

, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),

[Months to Forecast] )

**The code for Sales (Forecast) (Upper 80%) is as follows:**

**SCRIPT_REAL("**

library(forecast)

## Creating vectors

hold.orig <- .arg4

len.orig <- length( hold.orig )

len.new <- len.orig - hold.orig[1]

year.orig <- .arg1

month.orig <- .arg2

sales.orig <- .arg3

## Sorting the Data

date.orig <- year.orig + month.orig / 12

dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]

dat.new <- dat.orig[1:len.new,]

## Fitting the Time Series

timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)

fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)

c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,1])

"

, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),

[Months to Forecast] )

**The code for Sales (Forecast) (Upper 95%) is as follows:**

**SCRIPT_REAL("**

library(forecast)

## Creating vectors

hold.orig <- .arg4

len.orig <- length( hold.orig )

len.new <- len.orig - hold.orig[1]

year.orig <- .arg1

month.orig <- .arg2

sales.orig <- .arg3

## Sorting the Data

date.orig <- year.orig + month.orig / 12

dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]

dat.new <- dat.orig[1:len.new,]

## Fitting the Time Series

timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)

fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)

c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,2])

"

, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),

[Months to Forecast] )

Alex Macharia

ReplyDeleteCreate a shift date calculated field for future dates by

dateadd('month', [Months to Forecast],[Order Date])

How can you alter the code to only forecast by year? I keep getting an error: Error in dat.orig[1:len.new, ] : subscript out of bounds. Thank you very much for any help

ReplyDeleteHello,

ReplyDeleteThis blog is really helpful. I tried using the forecasting, but whenever I used the Sales(forecast) field it shows me the following error: Error in library(forecast) : there is no package called 'forecast'.

I have installed the forecast package in R.

Can you please help me solve this error.

Thank you.

After adding the blank lines, Tableau retuns: "An error occurred while communication with the Rserve service", "Error in hw(p, beta, gamma) : NA/NaN/Inf in foreign function call (arg 1)"

ReplyDeleteAny ideas on a fix?