Today, we will talk about creating time series analyses using Tableau 8.1's new R functionality. In our previous demonstrations, we used varying regression techniques to predict values. These techniques took into account the relationships between the variables, but not across time. Time Series Analyses approach from the opposite direction. They attempt to predict new values by looking at previous (or future) values in time, but only considering one variable. For this demonstration, we will use the Superstore Sales sample data set from Tableau.
As always, the first step is to look at the data we want to forecast. Let's look at Total Sales per Month.
Sales by Month |
Months to Forecast |
Install Packages |
This code is long and somewhat complex. So, we'll only touch on the important parts. The code for Sales (Forecast) is as follows:
SCRIPT_REAL("
library(forecast)
## Creating vectors
hold.orig <- .arg4
len.orig <- length( hold.orig )
len.new <- len.orig - hold.orig[1]
year.orig <- .arg1
month.orig <- .arg2
sales.orig <- .arg3
## Sorting the Data
date.orig <- year.orig + month.orig / 12
dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]
dat.new <- dat.orig[1:len.new,]
## Fitting the Time Series
timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1],
dat.new[len.new,2]), frequency = 12)
fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$mean)
"
, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),
[Months to Forecast] )
First, we need to install the forecast library using library(forecast). Then, we need to sort the data by date, and hold out the last few months. How many months get held out depends on the value of our [Months to Forecast] parameter. Finally, we fit our time series as a single exponential model and output the results. Let's see what it looks like.
Sales by Month (with Forecast) |
Sales (Forecast) (Lower 80%)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,1])
Sales (Forecast) (Lower 95%)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,2])
Sales (Forecast) (Upper 80%)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,1])
Sales (Forecast) (Upper 95%)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,2])
Now, let's see what the intervals look like
Sales by Month (with Forecast and Bounds) |
Adding Empty Dates |
Sales by Month (with Future Forecast and Bounds) |
P.S.
If anybody can think of a better way to get future forecasts for dates that don't exist in the data, please let us know in the comments. Also, this method is not robust against missing data. If you're data has some missing months/week/etc., you may have to alter the method slightly.
Brad Llewellyn
Data Analytics Consultant
Mariner, LLC
llewellyn.wb@gmail.com
https://www.linkedin.com/in/bradllewellyn
Data Analytics Consultant
Mariner, LLC
llewellyn.wb@gmail.com
https://www.linkedin.com/in/bradllewellyn
APPENDIX:
The code for Sales (Forecast) (Lower 80%) is as follows:
SCRIPT_REAL("
library(forecast)
## Creating vectors
hold.orig <- .arg4
len.orig <- length( hold.orig )
len.new <- len.orig - hold.orig[1]
year.orig <- .arg1
month.orig <- .arg2
sales.orig <- .arg3
## Sorting the Data
date.orig <- year.orig + month.orig / 12
dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]
dat.new <- dat.orig[1:len.new,]
## Fitting the Time Series
timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)
fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,1])
"
, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),
[Months to Forecast] )
The code for Sales (Forecast) (Lower 95%) is as follows:
SCRIPT_REAL("
library(forecast)
## Creating vectors
hold.orig <- .arg4
len.orig <- length( hold.orig )
len.new <- len.orig - hold.orig[1]
year.orig <- .arg1
month.orig <- .arg2
sales.orig <- .arg3
## Sorting the Data
date.orig <- year.orig + month.orig / 12
dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]
dat.new <- dat.orig[1:len.new,]
## Fitting the Time Series
timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)
fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$lower[,2])
"
, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),
[Months to Forecast] )
The code for Sales (Forecast) (Upper 80%) is as follows:
SCRIPT_REAL("
library(forecast)
## Creating vectors
hold.orig <- .arg4
len.orig <- length( hold.orig )
len.new <- len.orig - hold.orig[1]
year.orig <- .arg1
month.orig <- .arg2
sales.orig <- .arg3
## Sorting the Data
date.orig <- year.orig + month.orig / 12
dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]
dat.new <- dat.orig[1:len.new,]
## Fitting the Time Series
timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)
fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,1])
"
, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),
[Months to Forecast] )
The code for Sales (Forecast) (Upper 95%) is as follows:
SCRIPT_REAL("
library(forecast)
## Creating vectors
hold.orig <- .arg4
len.orig <- length( hold.orig )
len.new <- len.orig - hold.orig[1]
year.orig <- .arg1
month.orig <- .arg2
sales.orig <- .arg3
## Sorting the Data
date.orig <- year.orig + month.orig / 12
dat.orig <- cbind(year.orig, month.orig, sales.orig)[sort(date.orig, index.return = TRUE)$ix,]
dat.new <- dat.orig[1:len.new,]
## Fitting the Time Series
timeser <- ts(dat.new[,3], start = c(dat.new[1,1], dat.new[1,2]), end = c(dat.new[len.new,1], dat.new[len.new,2]), frequency = 12)
fit <- HoltWinters(timeser, beta=FALSE, gamma=FALSE)
c(rep(NA,len.new), forecast(fit, hold.orig[1])$upper[,2])
"
, ATTR( YEAR( [Order Date] ) ), ATTR( MONTH( [Order Date] ) ), SUM( [Sales] ),
[Months to Forecast] )
Alex Macharia
ReplyDeleteCreate a shift date calculated field for future dates by
dateadd('month', [Months to Forecast],[Order Date])
How can you alter the code to only forecast by year? I keep getting an error: Error in dat.orig[1:len.new, ] : subscript out of bounds. Thank you very much for any help
ReplyDeleteHello,
ReplyDeleteThis blog is really helpful. I tried using the forecasting, but whenever I used the Sales(forecast) field it shows me the following error: Error in library(forecast) : there is no package called 'forecast'.
I have installed the forecast package in R.
Can you please help me solve this error.
Thank you.
After adding the blank lines, Tableau retuns: "An error occurred while communication with the Rserve service", "Error in hw(p, beta, gamma) : NA/NaN/Inf in foreign function call (arg 1)"
ReplyDeleteAny ideas on a fix?