Comments on Breaking BI: Predictive Analytics in Tableau Part 2: Linear Regression with Multiple Regressors

Anonymous (2016-03-01):
When plotting residuals, check that your table calculation has the correct "Compute Using" setting. It may default to Table (Across) when it needs to be set to the key that identifies each row, which in the example above is YEAR.

Red (2015-12-11):
I keep getting a perfect model fit and my residuals are zero. What is wrong with my calculation? Thanks.

Anonymous (2015-11-19):
I keep getting a perfect model fit and my residuals are zero. I have a large data set and purposely duplicated all variables in one row with two different y values, and it STILL predicts perfectly. I'm obviously doing something wrong. Can you help?

Anonymous (2015-11-03):
How can I show the regression model's coefficients in Tableau? For example, in R we can use something like summary(fit)$coefficients[,1] to show the coefficient estimates, but how do we show them in Tableau? Thanks!

Anonymous (2014-09-17):
All right, thanks a lot.

If I make it work I will definitely post it here!

Gabriel

Breaking BI (2014-09-17):
My mistake. I misunderstood your data structure. So, you have a data set with X rows and columns Score, A, B, C, D, E, F, G, plus some more dimensions. The issue here is that you would traditionally place your ID field (a unique ID for each row/observation) onto the Detail shelf along with AVG([Score]), AVG([A]), etc. This would mean that your scatterplot has X points on the chart, not the 7 that you would want in order to plot a point for each regressor. To simplify the story substantially, what you're asking for is possible, yet extremely complex. It would require expert knowledge of data structure and table calculations as well as a decent amount of time to work through it. After that, it would likely not perform very well given any reasonable amount of data.

If you still wish to attempt this, I'll give you one piece of advice: "The R script is nothing more than a table calculation, and you can treat it as such."

Sorry I didn't have better news. If you do figure it out, please post it up here. It would be a great experience for myself and the other readers.

Cheers,

Brad Llewellyn

Anonymous (2014-09-16):
Good to know it's the most frustrating part; I thought maybe I was the only one facing this issue!

I am not sure what you mean by doing rep( P, 8 ). What I am trying to achieve is to get those values (the 8 coefficients and 8 p-values) in a single table in Tableau. The best way would be to get a matrix from R, but since we get only a single value, I am wondering how I can modify my code in order to get those values into Tableau.

The visualization I'm looking for is a bubble chart:
X axis = average Score
Y axis = Coefficient
The bubbles would be A, B, C, D, E, F, and G, placed according to their average score and coefficient.
And I would filter by p-value < 0.05 to show only the attributes that are statistically significant.

Tell me if I'm not being clear enough; I can send you an example of what I'm trying to achieve.

Thanks,

Gabriel

Breaking BI (2014-09-16):
Thanks for commenting! You've run into the most frustrating part of the Tableau/R integration. The data that R returns has to have the same length as the data you send to it. For instance, if you send in vector #1, which has 8 values, you must return 8 values. So, if you wanted to return the p-value, you have to do rep( P, 8 ). Does this make sense?

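The rep( P, 8 ) idea in the reply above can be sketched in plain R, outside Tableau. This is only an illustration under assumptions: the vectors score, a, and b are made-up stand-ins for what Tableau would pass as .arg1, .arg2, and .arg3, and the names coefs, coef_a, and pvalue_a are hypothetical. It pulls one coefficient and one p-value out of the model summary and repeats each to the input length.

```r
# Toy stand-ins for the vectors Tableau would pass as .arg1, .arg2, .arg3
# (hypothetical data, not taken from the post).
score <- c(10, 5, 7, 8, 6, 9, 4, 7)
a     <- c(9, 4, 6, 8, 5, 9, 3, 6)
b     <- c(9, 2, 7, 6, 6, 8, 5, 5)

fit <- lm(score ~ a + b)

# Coefficient table: one row per term, with columns that include
# "Estimate" and "Pr(>|t|)".
coefs <- summary(fit)$coefficients

# Pick one scalar at a time and repeat it to the input length,
# so the result has one value per input row.
n        <- length(score)
coef_a   <- rep(coefs["a", "Estimate"], n)
pvalue_a <- rep(coefs["a", "Pr(>|t|)"], n)
```

Inside a Tableau calculated field, the last expression of the script would be one of these repeated vectors, which means one calculated field per coefficient or p-value.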
Anonymous (2014-09-16):
Hi Brad,

Your blog is awesome, it's helping me a lot.

I'm having issues with the multiple regression procedure in Tableau.

I am using the following code:

    SCRIPT_REAL("
    Score <- .arg1
    A <- .arg2
    B <- .arg3
    C <- .arg4
    D <- .arg5
    E <- .arg6
    F <- .arg7
    G <- .arg8

    fit <- lm( Score ~ A + B + C + D + E + F + G )
    fit$fitted
    "
    , avg([Score]), avg([A]), avg([B]), avg([C]), avg([D]), avg([E]), avg([F]), avg([G]))

What I actually need is the coefficient and the p-value for each of those arguments (A to G).

So I tried changing fit$fitted to something like this:

    fit$coefficients

Also, the data frame I am using comes from a survey database. It looks like this:

    Respondent # | Score | A | B | C | D | E | F | G
    #1 | 10 | 9 | 9 | 8 | 7 | 6 | 7 | 5
    #2 | 5 | 4 | 2 | 3 | 5 | 7 | 8 | 6
    ...
    i

So what I am trying to achieve is to get the coefficients and p-values on a single sheet for all those arguments (A to G) in order to make a scatter plot.

Thanks a lot

Breaking BI (2014-06-19):
In a simple way, no. The data you get back from R has to be 1-1 with the data you send in. So, if you send in 20 rows of data (thinking of the data as a table), you will get 20 values back. You can get each value back individually if you'd like and display them on a chart. But you can't see the R output within Tableau.

Anonymous (2014-06-19):
Thank you Brad. Your blog was helpful. Is it possible to view a summary of the linear regression in Tableau? I am thinking of the R function "summary(lm)".

Breaking BI (2014-06-19):
Thanks for commenting! Your problem is likely due to the "Compute Using" setting. The R interface is actually just a table calculation. So, when you change your dimension from Year to Company, you will probably need to change your Compute Using as well. Hopefully this post will help you.

http://breaking-bi.blogspot.com/2013/07/introduction-to-table-calculations.html

Anonymous (2014-06-19):
Thanks very much for the informative session. I'm having a bit of a problem. If I would like to see each of the predicted y values, how would I be able to do that? It seems like Tableau is putting everything into the SUM, and if I want to see it by company or by field rather than by year, it stops working.

Breaking BI (2014-05-12):
Edward,

Thanks for commenting! As far as I can see, there are two issues with your code. First, in the ## Defining Variables section, you shouldn't use commas to end the lines. R treats a line break as the end of a statement, so no terminator is needed. Second, the order of the variables in the Defining Variables section needs to be the same as the order in the last line of the code, where you have all of the SUM() expressions. Otherwise, you will get incorrect results.

Thanks!

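The two fixes in the reply to Edward can be sketched as plain R. Here the .arg1 to .arg4 vectors are invented stand-ins for what Tableau would pass, in the same order as the call in the question: SUM([GVA]), SUM([CP]), SUM([Emp]), SUM([Surv]). The square brackets around the variable names are also dropped. That change is an assumption beyond the reply itself, based on the parse error quoted in the question, which points at the opening bracket.

```r
# Hypothetical stand-ins for the vectors Tableau would pass, in call order:
# SUM([GVA]), SUM([CP]), SUM([Emp]), SUM([Surv]).
.arg1 <- c(120, 135, 150, 160, 172, 190)  # GVA
.arg2 <- c(30, 34, 33, 40, 42, 47)        # CP
.arg3 <- c(50, 52, 57, 58, 63, 66)        # Emp
.arg4 <- c(7, 9, 8, 11, 10, 13)           # Surv

## Defining Variables
# No trailing commas, no square brackets, and the same order as the call.
GVA  <- .arg1
CP   <- .arg2
Emp  <- .arg3
Surv <- .arg4

## Fitting the Model
fit <- lm(GVA ~ Emp + CP + Surv)

# One fitted value per input row, which is what the script hands back.
fitted_values <- fit$fitted
fitted_values
```

Only the lines from "## Defining Variables" down to fit$fitted would go inside the SCRIPT_REAL string; the .arg assignments exist here just so the sketch runs on its own.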
Anonymous (2014-05-12):
Good day Sir,

I am new to the R language; I just started to educate myself with it this year. I was trying to fit a linear model with multiple variables in Tableau. My syntax is:

    SCRIPT_REAL( ”

    ## Defining Variables

    [GVA]<- .arg1,
    [Emp]<- .arg2,
    [Surv]<- .arg3,
    [CP]<- .arg4,

    ## Fitting the Model

    fit <- lm( GVA ~ Emp + CP + Surv)
    fit$fitted
    "
    ,SUM( [GVA]), SUM([CP]), SUM([Emp]), SUM([Surv]))

but I received an error saying:

    Error in base::parse(text = .cmd) : :5:5: unexpected ‘[‘
    4:
    5: [
    ^

What seems to be the problem? Thank you so much.

Breaking BI (2014-01-06):
That's a really good question. The predict function returns a table with the following columns: Prediction, Lower Bound, and Upper Bound. Therefore, since we need to pull out one column at a time, we give it a column number to pull out. We could have also used [,1], but that would give us the same values we get when we use fit$fitted.

Anonymous (2014-01-06):
Thanks for this series. I do have a question, however. When writing the calc for the prediction intervals, what do the [,2] and [,3] represent? It obviously gives the lower and upper interval, but what specifically does it mean?

Thanks,
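The column indexing described in the reply about predict() can be checked in plain R. This is a minimal sketch on made-up data, assuming the calculation in the post calls predict() with interval = "prediction"; the names d, pred, lower, and upper are hypothetical.

```r
# Toy data (hypothetical, not from the post).
d <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)
)

fit <- lm(y ~ x, data = d)

# With an interval requested, predict() returns a matrix with three columns
# named "fit", "lwr", and "upr", and one row per observation.
pred <- predict(fit, newdata = d, interval = "prediction")

point <- pred[, 1]  # point prediction, same values as fit$fitted
lower <- pred[, 2]  # lower bound of the interval
upper <- pred[, 3]  # upper bound of the interval
```

Each of the three vectors has one value per input row, so each can be returned from its own SCRIPT_REAL calculated field.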