First, let's look at our data.
DJIA vs. Foreign by Year |
DJIA vs. Foreign by Year (with Linear Trend) |
DJIA vs. Foreign by Year (with Quadratic Trend) |
DJIA vs. Foreign by Year (with Exponential Trend) |
DJIA (Quadratic) |
DJIA (Exponential) |
We see that all we have to do is make a new variable with whatever function we want and add it to the model. As far as R knows, these are two completely different variables. We could add all of the other variables to the mix as well if we wanted to. We could easily model Foreign as a quadratic while modeling Consumer as a logarithm. One important thing to note is that if you include a polynomial, you should include all lesser degrees. Simply put, if you have a x^2, then you must also have x. If you have x^3, you must also have x^2 and x. We won't go into detail about why you should do this at this time.
Looking back at the earlier trends, we're not happy with either of these trends. The quadratic trend has that troubling curvature at the left side while the exponential trend doesn't seem strong enough to capture the upward curvature. Now, let's make a new model that Tableau doesn't even have!
Looking back at the earlier trends, we're not happy with either of these trends. The quadratic trend has that troubling curvature at the left side while the exponential trend doesn't seem strong enough to capture the upward curvature. Now, let's make a new model that Tableau doesn't even have!
What if we believe that there are actually two trends here? Let's imagine that the relationship is linear for small values of Foreign (less than 135) and a different linear relationship for larger values of Foreign? No problem! We can combine our models!
SCRIPT_REAL("
djia <- .arg1
fore <- .arg2
th <- 135
fore2 <- fore^2
smallfore <- fore[fore<th]
smalldjia <- djia[fore<th]
largefore <- fore[fore>=th]
largedjia <- djia[fore>=th]
smallfit <- lm( smalldjia ~ smallfore )
largefit <- lm( largedjia ~ largefore )
c(smallfit$fitted, largefit$fitted)
",
SUM( [DJIA] ), SUM( [FOREIGN] ) )
Finally, let's see the results.
DJIA vs. Foreign by Year (with Piecewise Prediction) |
This model fits our data so much better! This method is called Piecewise Regression. The amazing thing about R is that there's a method for predicting anything you want. You can even create your own prediction method if you need to. Thanks for reading. We hope you found this informative.
P.S.
We're curious how Tableau keeps track of which value in the output vector corresponds to each input. This was a huge obstacle in Part 3 as well. If you have ideas, let us know in the comments.
P.S.
We're curious how Tableau keeps track of which value in the output vector corresponds to each input. This was a huge obstacle in Part 3 as well. If you have ideas, let us know in the comments.
Brad Llewellyn
Data Analytics Consultant
Mariner, LLC
llewellyn.wb@gmail.com
https://www.linkedin.com/in/bradllewellyn
Data Analytics Consultant
Mariner, LLC
llewellyn.wb@gmail.com
https://www.linkedin.com/in/bradllewellyn
Hi Brad, this is a great series! To answer your question about keeping track of the values, my understanding is that the ordering of what is sent to R from Tableau is based on the sort order of the addressing dimension(s). That creates a vector for each argument, and on the return from R to Tableau the assignment is in the same order. So the 1st address aka row in the partition gets the first value of the return vector, the 2nd address gets the 2nd value, and so on.
ReplyDeleteThanks for the input Jonathan. That's what I suspected was the case. That concerns me when dealing with Nulls and functions where Time Series are involved. Perhaps I will have to investigate this a little further.
DeleteThanks a lot Brad. The posts on Predictive Analytics are really helpful.
ReplyDeleteDaer Brad. This post are very interessting. Could you please help me to write the right syntax in Tableau for this R-syntax:
ReplyDeletereg.quad <- nls (av~a*uv^2^2+ b*uv^2+c, data=Datensatz, start=list (a=51.75045, b=77.30759 , c=45.07692))
I am so sorry because i am a tableau beginner.
Dear Brad,
ReplyDeletefirst of all: thank you for this series. I' m sorry my english isn' t very good.
I want to calculate different (linear/polynom/exponential) regressions.
For example I write this syntax for tablaeu:
reg.quad <- nls (y~a*x^2^2+ b*x^2+c, data=Datensatz, start=list (a=51.75045, b=77.30759 , c=45.07692))
I don' t know how to write a,b,c in tablaeu.
Could you help me?
Melanie