TechOpsGuys.com Diggin' technology every day

May 17, 2013

Big pop in Tableau IPO

Filed under: General, posted by Nate @ 9:35 am

I was first introduced to Tableau (and Vertica) a couple of years ago at a local event in Seattle. Both products really blew me away (and still do to this day), though it’s not an area I spend a lot of time in. My brain struggles with anything analytics related (even when using Tableau; the same goes for Splunk and SQL). I just can’t make the connections. When I come across crazy Splunk queries that people write, I just stare at them for a while in wonder (as in, I can’t possibly imagine how someone could have come up with such a query, even after working with Splunk for the past six years), then I copy+paste it and hope it works.

Sample Tableau reports pulled from Google Images

But that doesn’t stop me from seeing an awesome combination that is truly groundbreaking in both performance and ease of use.

I’ve seen people try to use Tableau with MySQL, for example, and fairly quickly give up in frustration at how slow it is. I remember being told that Tableau used to get a bunch of complaints from users years ago about how slow it seemed to be, but it really wasn’t Tableau’s fault; it was the slow back-end data store.

Vertica unlocks Tableau’s potential by providing a jet engine to run your queries against. Millions of rows? Hundreds of millions? No problem. Billions? It’ll take a bit longer but shouldn’t be an issue either. Try that with most other back ends and, well, you’ll be waiting there for days if not weeks.

Tableau is a new generation of data visualization technology that is really targeted at the Excel crowd. It can read in data from practically anything (Excel files included), and it provides a seamless way to analyze your data and produce fancy charts, graphs, tables and maps.

It’s not really for the hardcore power users who want to write custom queries, though I still think it is useful for those folks. A great use case for Tableau is for the business users to play around with it and come up with the reports that they find useful; then the data warehouse people can take those results and optimize the warehouse for those types of queries (if required). It’s a lot simpler and faster than the alternative.

I remember two years ago I was working with a data warehouse guy at a company, and we were actually testing Tableau with MySQL at the time (small tables). Just playing around, he created some basic graphs and drilled down into them. In all we spent about five minutes on this task, and we found some interesting information. He said that if he had to do that with MySQL queries himself it would have taken him roughly two days, running query after query and then building new queries based on the results. From two days to roughly five minutes, for a very experienced SQL/data warehouse person.

Tableau has a server component as well, to which you can publish your reports for others to see with a web browser or mobile device. The server can also, of course, link directly to your data to get updates as frequently as you want them.

You can have profiles and policies. One example Tableau gave me last year was a big customer that enforces certain color codes across their organization, so no matter what they are looking at they know blue means X and orange means Y. This is enforced at the server level, so it’s not something people have to worry about remembering. They can also enforce policies around reporting so that the term “XYZ” is always the result of “this+that”, so people get consistent results every time, rather than one person interpreting something one way and another person another way. Again, this is enforced at the server level, reducing the need for double checking and additional training.

They also have APIs, and users are able to embed Tableau reports directly into their applications and web sites (through the server component). I know one organization where almost all of their customer reporting is presented with Tableau; I’m sure it saved them a ton of time compared with trying to replicate the behavior in their own code. I’ve seen folks try to write reporting UIs at past companies, and usually what comes out is significantly sub par because it’s a complicated thing to get right. Tableau makes it easy, and it is probably very cost effective relative to full-time developers taking months or years to build it in house.

It’s one of the few products out there that I am really excited about, and I’ve seen some amazing stuff done with the software in a very minimal amount of time.

Tableau has a 15-day evaluation period if you want to try it out (it really should be more, but whatever). Vertica has a community edition which you can use as a sort of long-term evaluation; it’s limited to 1TB of data and 3 cluster nodes. You can get a full-fledged enterprise evaluation from Vertica as well if you want to test all of the features.

I wrote some scripts at my current company to refresh/import about 150GB of data from our MySQL systems to Vertica each night. It is interesting to see MySQL struggle to read the data out while Vertica is practically idle as it ingests it (I’d otherwise normally think that writing the data would be more intensive than reading it). To improve performance I compiled a few custom MySQL binaries that allowed me to run MySQL queries and pipe the results directly into Vertica (instead of writing 10s of GBs to disk only to read it back again). The custom binaries were needed because MySQL by default only supports tab-delimited results, which was not sufficient for this data set (I actually compiled 3-4 different binaries with different delimiters depending on the tables, and managed to get ~99.99999999% of the rows in without further effort). I also wrote a quick Perl script to fix some of the invalid data, like invalid time stamps, which MySQL happily allows but Vertica does not.

Sample command:

# Stream one table from MySQL through the data-fix filter straight into Vertica
$MYSQL --raw --batch --quick --skip-column-names \
    -u "$DB_USERNAME" --password="${DB_PASSWORD}" --host="${DB_SERVER}" \
    "$SOURCE_DBNAME" -e "select * from $MY_TABLE" \
  | $DATA_FIX \
  | vsql -w "$VERTICA_PASSWORD" -c "COPY ${VERTICA_SCHEMA}.${MY_TABLE} FROM STDIN DELIMITER '|' RECORD TERMINATOR '##' NULL AS 'NULL' DIRECT"
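
For illustration only, here is a minimal sketch of what a filter in the $DATA_FIX position might look like. It is written in Python rather than Perl and is not the actual script. It assumes the stream uses the same '|' field delimiter, '##' record terminator and 'NULL' token as the COPY statement above, that no field contains those separators, and that any value shaped like a date or timestamp but not a real calendar date (for example MySQL's zero date) should simply become NULL. The names fix_field, fix_stream and main are hypothetical.

```python
import re
import sys
from datetime import datetime

# Assumed to match the COPY options in the sample command.
FIELD_SEP = "|"
RECORD_SEP = "##"
NULL_TOKEN = "NULL"

# Values shaped like YYYY-MM-DD or YYYY-MM-DD HH:MM:SS.
_TS_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}( \d{2}:\d{2}:\d{2})?")


def fix_field(value):
    """Return NULL_TOKEN for date-shaped values that are not real dates."""
    if not _TS_SHAPE.fullmatch(value):
        return value  # not a date or timestamp, leave untouched
    fmt = "%Y-%m-%d %H:%M:%S" if " " in value else "%Y-%m-%d"
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        # e.g. '0000-00-00 00:00:00' or '2013-02-30'
        return NULL_TOKEN
    return value


def fix_stream(text):
    """Apply fix_field to every field of every record in text."""
    return RECORD_SEP.join(
        FIELD_SEP.join(fix_field(f) for f in record.split(FIELD_SEP))
        for record in text.split(RECORD_SEP)
    )


def main():
    # Reads all of stdin at once for brevity; a real filter for
    # tens of GBs would process the stream in chunks instead.
    sys.stdout.write(fix_stream(sys.stdin.read()))
```

To use it as a pipeline stage, call main() under the usual `if __name__ == "__main__":` guard and point $DATA_FIX at the script. Mapping bad values to NULL is just one possible policy; clamping them to a sentinel date would work the same way.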

 

Oh and back to the topic of the post – Tableau IPO’d today (ticker is DATA) – as of last check it is up 55%.

So, congrats Tableau on a great IPO!

 

