Do I Really Need Tableau?

I made a post a couple of days ago in response to a comment about automated dashboards in Tableau, and why I have abandoned Tableau in favour of R and RStudio. It got me thinking about my reasons why. I abandoned Tableau about 18 months ago, and initially this was simply cost-based. Once upon a time, Tableau Desktop (and Perl) did everything I needed. My workflow wasn't great, but it worked, and it got me the visualisations I needed to help diagnose root-cause problems. I could install Tableau Desktop on a client's machine, apply the license, and it was great. I recommended it to clients for ongoing performance analysis and life was good.

Now, Tableau is a 'platform' with an ongoing monthly licensing fee and a dependency on a server. It's hard enough in corporate IT to get access to install anything, let alone have a server stood up as well. So, what did I do? I tried Power BI and it didn't do what I needed it to do. I looked at some open-source platforms, which looked like they would require more customization than I was capable of. So, finally, after discounting everything else, I settled on R. The reason this was the last option was simply a function of me not wanting to learn another language. I wanted drag and drop. I was wrong. I am now a fully paid-up convert of the "why you can't do data science in a GUI" club (and thanks to Sam Marshall, I even have the mug to prove it).

If you get time, have a watch of Hadley Wickham's talk: https://www.youtube.com/watch?v=cpbtcsGE0OA

Why would a (former) Tableau evangelist become an R fan? The initial driver was cost. R and RStudio are free; however, as with most free things, there is a learning curve, and what you save in licensing you usually pay for in time. Happily, with R, this wasn't the case. Creating a basic x-y dot plot of results in R is simple, and within an hour you can have it sorted and be doing some basic things with it. Of course, being code, once you've got it working you can save it, re-use it, and share it to your heart's content. This is the core of what I see as the language's biggest strength over a GUI-based tool like Tableau, and Hadley mentions it in his talk linked above. If I give you an R script that I have used to perform some analytics on some data, you can see what my thought processes are and how I have done things. It's all there, and it's reproducible. How great is the concept, for instance, that I can keep my analysis under source control?

R is also faster. I don't necessarily mean in terms of plotting a big set of results; what I am referring to is that once I have my analysis path coded, all I need to do is provide a new data set and run it again. Two very smart guys I work with, Sam Marshall and Malachi McIntosh, have next-levelled this by automating end-of-run reports in R Markdown, showing the current result set alongside a statistical comparison with the previous run, as well as auto-generating PowerPoint decks with summary results and plots for every execution in a project.
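If you have never touched R, that first dot plot really is only a few lines. Here is a minimal sketch; the results.csv file and its timestamp and response_time columns are hypothetical names of mine, invented for illustration:

    # Minimal x-y dot plot of a results file. The file name and the
    # column names here are assumptions, not from any real project.
    library(readr)    # read_csv()
    library(ggplot2)  # plotting

    results <- read_csv("results.csv")

    ggplot(results, aes(x = timestamp, y = response_time)) +
      geom_point(alpha = 0.4) +
      labs(x = "Time", y = "Response time (s)")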

This is only scratching the surface. The wealth of data science, statistical and data-tidying libraries available to extend base R is frankly overwhelming; the first time you use the tidyverse pivot_longer() and pivot_wider() functions will be a life-changing moment if you've ever needed to rotate a large data set. This is where R's real strength lies: it is designed for stats and data science. Its visualisations might not be as polished as those provided by Tableau in terms of anti-aliasing, but that's not really the point. The point is what a tool can do to help me get my data into a state where I can start getting insights from it.

Need to trace a GUID through a data set? No problem, group_by() has your back on that one. While you're at it, why not have summarize() calculate the difference between the start and end times for you, and, just for fun, calculate the percentiles per transaction at the same time? Let's really go nuts: instead of plotting the results as a dot plot, why not plot key events as a line-range plot and flip the axis to show the impact of concurrency on processing time? Why not create some buckets for each transaction (n=5) over the data set, throw six-sigma over it to see if there are any significant variances worth a closer look, and use those time periods as a filter over the system metrics?
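As a rough sketch of that group_by/summarize flow, assuming a hypothetical events data frame with guid, transaction and timestamp columns (all names invented for illustration):

    library(dplyr)
    library(ggplot2)

    # One row per logged event; a GUID ties together all the events
    # belonging to a single transaction.
    per_txn <- events %>%
      group_by(guid, transaction) %>%
      summarize(start    = min(timestamp),
                end      = max(timestamp),
                duration = as.numeric(end - start, units = "secs"),
                .groups  = "drop")

    # Percentiles of processing time per transaction type.
    per_txn %>%
      group_by(transaction) %>%
      summarize(p50 = quantile(duration, 0.50),
                p95 = quantile(duration, 0.95),
                .groups = "drop")

    # Key events as a line-range plot with the axis flipped, so that
    # overlapping ranges (concurrency) are easy to see.
    ggplot(per_txn, aes(x = guid, ymin = start, ymax = end)) +
      geom_linerange() +
      coord_flip()

The same per_txn frame is also a natural starting point for the bucketing and six-sigma check mentioned above: cut the run into time buckets and compare each bucket's distribution against the overall one.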

Would I go back to Tableau? It's not so much a question of would I; it's more a question of what the actual need is. With some experience in R, and a library of functions built up over the last 18 months, I seem to be at a point where I have faced no limitation in R that has made me think "I really need Tableau". The flip side is that I have done, and still do, things in R that have made me think "this would have been very hard in Tableau", simply because R has statistical and data science foundations.

So, back to the question: would I go back to Tableau? No. The licensing model makes it incompatible with consultancy, and even if it didn't, the flexibility of R goes far beyond what I had in Tableau.

It's quite good, that R.

I'll get there one day. It's on my list of things to experiment with. I'd love to move away from a commercial analysis tool but right now we have more pressing challenges.
