# DataViz Using Tableau: Another Way of Looking at Graduation Rates

Readers:

Jon Boeckenstedt (photo, right), who works in Enrollment Management for DePaul University, created this data bursting visualization using Tableau.

Jon’s thought process and his reasons for creating this visualization are noted below.

What do you think of this visualization? And, as Jon asks, what do you see in the data?

Best Regards,

Michael

### Another Way of Looking at Graduation Rates

Jon saw an article in his Facebook feed about college ROI, although it was called the 50 Best Private Colleges for Earning Your Degree on Time. As is often the case, there was nothing really wrong with the facts of that article: You see a nice little table showing the 50 Colleges with the highest graduation rate.

But it got Jon thinking: What if high graduation rate wasn’t enough?  What if a considerable portion of your freshman class that graduates takes longer than four years to do so? Is that a good deal?  He then created some hypotheticals:

College A: 1000 freshmen, 800 who graduate within four years, 900 who graduate in five, and 950 who graduate in six.  So the four-, five-, and six-year graduation rates are 80%, 90%, and 95%.  But of the 950 who eventually graduate, only 84.2% do so in four years.

College B: 1000 freshmen, 750 who graduate within four years, 775 who graduate in five, and 800 who graduate in six.  So the four-, five-, and six-year graduation rates are 75%, 77.5%, and 80%. Thus, of the 800 who eventually graduate, almost 94% do so in four years.

College C: 1000 freshmen, 550 who graduate within four years, 600 who graduate in five, and 625 who graduate in six.  So the four-, five-, and six-year graduation rates are 55%, 60%, and 62.5%. Of the 625 who eventually graduate, 88% do so in four years.

If you were choosing among these three colleges, which might you choose?  The easy money says you go with College A, the one with the highest graduation rate. College B would be your second choice, and C would be your third.  But what if you are absolutely, positively certain you’ll graduate from the college you choose? College B is first, then College C, then College A.
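The arithmetic behind these two rankings is easy to check. Here is a minimal Python sketch using only the hypothetical numbers above; the variable and function names are mine, not Jon's.

```python
# Hypothetical colleges from the text:
# (freshmen, graduates within four years, within five, within six)
colleges = {
    "A": (1000, 800, 900, 950),
    "B": (1000, 750, 775, 800),
    "C": (1000, 550, 600, 625),
}

def rates(freshmen, four, five, six):
    """Return the six-year graduation rate and the share of eventual
    graduates (six-year graduates) who finished within four years."""
    return {
        "six_year_rate": six / freshmen,
        "four_year_share_of_grads": four / six,
    }

summary = {name: rates(*counts) for name, counts in colleges.items()}

# Ranking 1: highest overall (six-year) graduation rate first.
by_grad_rate = sorted(
    summary, key=lambda n: summary[n]["six_year_rate"], reverse=True
)

# Ranking 2: highest share of graduates finishing in four years first.
by_on_time_share = sorted(
    summary, key=lambda n: summary[n]["four_year_share_of_grads"], reverse=True
)

for name, s in summary.items():
    print(f"College {name}: six-year rate {s['six_year_rate']:.1%}, "
          f"four-year share of graduates {s['four_year_share_of_grads']:.1%}")
print("By graduation rate:", by_grad_rate)
print("By on-time share:  ", by_on_time_share)
```

The same three colleges come out in a different order depending on which denominator you pick: all freshmen, or only those who eventually graduate.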

Data can be tricky. Jon has noted many times in the past that things like graduation rates are really almost inputs, not outputs: If you choose wealthy, well-educated students, you’re going to have higher graduation rates.  It’s a classic case of making a silk purse out of, well, silk.

Jon tried to demonstrate this in the visualization he created below, and he likes the simplicity here.  Each dot is a college (hover over it for details).  They’re in boxes based on the average freshman ACT score across the top, and the percentage of students with Pell along the side.  The dots are colored by four-year graduation rates, and you should see right away the pattern that emerges.  Red dots (top right) tend to be selective colleges with fewer poor students.

But if you want to look at the chance a graduate will finish in four years, use the filter at the bottom right.  Find a number you like, pull the left slider up to it, and see who remains.  (Just a note: Jon is a little suspicious of any number of 100% on this scale, which would mean absolutely no students who graduate take longer than four years to do so.  It might be true, but it’s hard to believe, so he would set the right slider to 99% at the most.)  Jon also reminds readers that there is a lot of bad IPEDS data out there, so don’t place any bar bets on what you see here.

What do you see? Click on the image below and find out.

# DataViz Using Tableau: A History of Crayola Colors

Readers:

Here is a great dataviz from Tableau Public.

It was created by Stephen Wagner and was originally published on Analytics Wagner.

Stephen Wagner explores the evolution of Crayola colors, from 1903 until now.

Click on a crayon in the “Box of Colors” to learn the name of the color, how long it has been in production, and any additional facts.

Enjoy!

Michael

# Tapestry Conference 2015: Interesting Visualizations From Presentations (and more Odds and Ends) – Part 1

Readers:

More great information from the Tapestry Conference.

Enjoy!

Michael

### RJ Andrews, Info We Trust, Creative Routines

“We all have the same 24 hours that Beyoncé has” and its various iterations took the web by storm in late 2013 as the megastar became the figurehead of not only having it all, but being able to somehow do it all too.

How do creatives – composers, painters, writers, scientists, philosophers – find the time to produce their opus? Mason Currey investigated the rigid Daily Rituals that hundreds of creatives practiced in order to carve out time, every day, to work their craft. Some kept to the same disciplined regimen for decades while others locked in patterns only while working on specific works.

# Tapestry Conference 2015: Odds and Ends

Readers:

Had a great conference and want to share various odds and ends from the last two days.

Hope to post more in a day or two.

Enjoy!

Michael

### Sound Bites

Let us go forth and build double barreled cannons and deed trees to themselves. -Ellie Field

Revelation is based on prior knowledge. -Hannah Fairfield

Show what you know as well as what you don’t know. -Hannah Fairfield

All storytelling is manipulation. -Ken Burns

Do good with data. -Kim Rees

Data only has so much elasticity before it breaks down. -Kim Rees

# 2015 Gartner Magic Quadrant for Business Intelligence and Analytics Platforms – Tableau Wins Again

### Click on Image to Read the Report

Tableau’s intuitive, visual-based data discovery capabilities have transformed business users’ expectations about what they can discover in data and share without extensive skills or training with a BI platform. Tableau’s revenue growth during the past few years has very rapidly passed through the \$100 million, \$200 million and \$300 million revenue thresholds at an extraordinary rate compared with other software and technology companies.

Tableau has a strong position on the Ability to Execute axis of the Leaders quadrant, because of the company’s successful “land and expand” strategy that has driven much of its growth momentum. Many of Gartner’s BI and analytics clients are seeing Tableau usage expand in their organizations and have had to adapt their strategy. They have had to adjust to incorporate the requirements that new users/usage of Tableau bring into the existing deployment and information governance models and information infrastructures. Despite its exceptional growth, which can cause growing pains, Tableau has continued to deliver stellar customer experience and business value. We expect that Tableau will continue to rapidly expand its partner network and to improve international presence during the coming years.

# Bryan Brandow: Triggering Cubes & Extracts using Tableau or MicroStrategy

Bryan Brandow (photo, right), a Data Engineering Manager for a large social media company, is one of my favorite bloggers out there when it comes to thought leadership and digging deep into the technical aspects of Tableau and MicroStrategy. Bryan just wrote about triggering cubes and extracts on his blog. Here is a brief synopsis.

One of the functions that never seems to be included in BI tools is an easy way to kick off an application cache job once your ETL is finished.  MicroStrategy’s Cubes and Tableau’s Extracts both rely on manual or time based refresh schedules, but this leaves you in a position where your data will land in the database and you’ll either have a large gap before the dashboard is updated or you’ll be refreshing constantly and wasting lots of system resources.  They both come with command line tools for kicking off a refresh, but then it’s up to you to figure out how to link your ETL jobs to call these commands.  What follows is a solution that works in my environment and will probably work for yours as well.  There are of course a lot of ways for your ETL tool to tell your BI tool that it’s time to refresh a cache, but this is my take on it.  You won’t find a download-and-install software package here since everyone’s environment is different, but you will find ample blueprints and examples for how to build your own for your platform and for whatever BI tool you use (from what I’ve observed, this setup is fairly common).  Trigger was first demoed at the Tableau Conference 2014.  You can jump to the Trigger demo here.
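To make the general idea concrete, here is a minimal Python sketch of "refresh only after the ETL succeeds." This is not Bryan's Trigger tool, and the function name `run_then_refresh` is my own. Both commands are placeholders: in a real setup you would substitute your ETL job and whatever refresh command your BI tool's command line utility provides.

```python
import subprocess
import sys

def run_then_refresh(etl_cmd, refresh_cmd):
    """Run the ETL command; only if it exits cleanly, run the refresh command.

    Both arguments are lists of strings, as accepted by subprocess.run.
    Returns True only when both steps exit with status 0.
    """
    etl = subprocess.run(etl_cmd)
    if etl.returncode != 0:
        # ETL failed: leave the existing cube/extract alone.
        return False
    refresh = subprocess.run(refresh_cmd)
    return refresh.returncode == 0

# Placeholder commands (assumptions for illustration only). Replace them
# with your real ETL job and your BI tool's refresh command.
fake_etl = [sys.executable, "-c", "print('ETL finished')"]
fake_refresh = [sys.executable, "-c", "print('refresh kicked off')"]

if __name__ == "__main__":
    print("Refreshed:", run_then_refresh(fake_etl, fake_refresh))
```

The point of the design is simply that the refresh is driven by the ETL finishing rather than by the clock, which avoids both the long gap and the constant polling described above.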

I recommend you click on the link above and give his blog post a full read. It is well worth it.

Best regards,

Michael

# DataViz: Squaring the Pie Chart (Waffle Charts)

Readers:

In the past, I would have highly condemned pie charts without giving you much explanation why. However, Dr. Robert Kosara (photo, left) posted a great thought study of pie charts on his wonderful blog, EagerEyes.org, that I want to share with you.

Dr. Kosara is a Visual Analytics Researcher at Tableau Software, with a special interest in the communication of, and storytelling with, data. He has a Ph.D. in Computer Science from Vienna University of Technology.

Also, as part of his blog post, Robert offers an alternative way to create pie charts: using waffle charts or square pie charts.

Dr. Kosara is also one of the great minds behind Tableau’s new storytelling feature. I hope you enjoy his creative thoughts as much as I do.

Best Regards,

Michael

### The Pie Chart

Dr. Kosara contends that pie charts are perhaps the most ubiquitous chart type; they can be found in newspapers, business reports, and many other places. But few people actually understand the function of the pie chart and how to use it properly. In addition to issues stemming from using too many categories, the biggest problem is getting the basic premise: that the pie slices sum up to a meaningful whole.

Robert points out that the circle (the “pie”) represents some kind of whole, which is made up of the slices. What this means is that the pie chart first and foremost represents the size relationship between the parts and the entire thing. If a company has five divisions, and the pie chart shows profits per division, the sum of all the slices/divisions is the total profits of the company.

If the parts do not sum up to a meaningful whole, they cannot be represented in a pie chart, period. It makes no sense to show five different occupations in a pie chart, because there are obviously many missing. The total of such a subsample is not meaningful, and neither is the comparison of each individual value to the artificial whole.

Slices have to be mutually exclusive; by definition, they cannot overlap. The data therefore must not only sum up to a meaningful whole, but the values need to be categorized in such a way that they are not counted several times. A good indicator of something being wrong is when the percentages do not sum up to 100%, like in the infamous Fox News pie chart.

### The Infamous Fox News Pie Chart

In the pie chart above, people were asked which potential candidates they viewed favorably, but they could name more than one. The categories are thus not mutually exclusive, and the chart makes no sense. At the very least, they would need to show the amount of overlap between any two (and also all three) candidates. Though given the size of the numbers and the margin of error in this data, the chart is entirely meaningless.

### When to Use Pie Charts

Dr. Kosara points out that there are some simple criteria that you can use to determine whether a pie chart is the right choice for your data.

• Do the parts make up a meaningful whole? If not, use a different chart. Only use a pie chart if you can define the entire set in a way that makes sense to the viewer.
• Are the parts mutually exclusive? If there is overlap between the parts, use a different chart.
• Do you want to compare the parts to each other or the parts to the whole? If the main purpose is to compare between the parts, use a different chart. The main purpose of the pie chart is to show part-whole relationships.
• How many parts do you have? If there are more than five to seven, use a different chart. Pie charts with lots of slices (or slices of very different size) are hard to read.

In all other cases, do not use a pie chart. The pie chart is the wrong chart type to use as a default; the bar chart is a much better choice for that. Using a pie chart requires a lot more thought, care, and awareness of its limitations than most other charts.
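The criteria above can be written down as a simple checklist. Here is a minimal Python sketch; the function name, the parameters, and the example numbers are all my own illustration, not something from Dr. Kosara's post. Whether the parts form a meaningful whole, whether they overlap, and what you want to compare are judgment calls, so they are passed in as plain True/False answers.

```python
def pie_chart_problems(percentages, meaningful_whole, mutually_exclusive,
                       part_to_whole_focus, max_slices=7, tolerance=0.5):
    """Return a list of reasons NOT to use a pie chart (empty list = OK)."""
    problems = []
    if not meaningful_whole:
        problems.append("parts do not make up a meaningful whole")
    if not mutually_exclusive:
        problems.append("parts overlap")
    # Percentages that do not add up to 100 are a warning sign.
    if abs(sum(percentages) - 100) > tolerance:
        problems.append("percentages do not sum to 100%")
    if not part_to_whole_focus:
        problems.append("main goal is comparing parts to each other")
    if len(percentages) > max_slices:
        problems.append("too many slices")
    return problems

# Made-up example 1: profit share of five company divisions.
divisions = pie_chart_problems([40, 25, 15, 12, 8], True, True, True)

# Made-up example 2: a survey where respondents could pick several options.
survey = pie_chart_problems([65, 60, 55], True, False, True)

print("Divisions:", divisions or "pie chart is fine")
print("Survey:   ", survey)
```

If the list comes back with anything in it, reach for a bar chart instead.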

### Alternative: Squaring the Pie

A little-known alternative to the round pie chart is the square pie or waffle chart. It consists of a square that is divided into 10×10 cells, making it possible to read values precisely down to a single percent. Depending on how the areas are laid out (as square as possible seems to be the best idea), it is very easy to compare parts to the whole. The example below is from a redesign Dr. Kosara did a while ago about women and girls in IT and computing-related fields.
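To show how the 10×10 grid works, here is a minimal Python sketch that turns a set of shares into exactly 100 cells. The category names and values are made up, and the largest-remainder rounding is my own choice so that the cells always add up to 100. Note that it fills the grid row by row, which is the simplest layout but not the "as square as possible" layout mentioned above.

```python
import math

def waffle_cells(shares, cells=100):
    """Allocate `cells` grid cells to categories in proportion to their
    shares, using largest-remainder rounding so the counts sum to `cells`."""
    total = sum(shares.values())
    exact = {k: v * cells / total for k, v in shares.items()}
    counts = {k: math.floor(x) for k, x in exact.items()}
    leftover = cells - sum(counts.values())
    # Hand the remaining cells to the categories with the largest fractions.
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts

def waffle_grid(counts, side=10):
    """Lay the cells out row by row in a side x side grid of labels."""
    flat = [label for label, n in counts.items() for _ in range(n)]
    return [flat[r * side:(r + 1) * side] for r in range(side)]

# Made-up example data: three categories, given as percentages.
example = {"X": 62, "Y": 27, "Z": 11}
counts = waffle_cells(example)
grid = waffle_grid(counts)

for row in grid:
    print(" ".join(row))
```

Each printed character is one percent of the whole, so the values can be read off by counting cells.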

### Links to Examples of Waffle Charts

I did a little Googling and found a few great examples of Waffle Charts. I have provided links to examples in Tableau, jQuery, R, and Excel.


# Jock Mackinlay and Tableau’s Research Team is Building Tomorrow’s UX for Data

Readers:

I thought I would present some interesting information visualization research being conducted at Tableau Software by Jock Mackinlay (photo, right) and his research team.

Mr. Mackinlay is an information visualization expert and Vice President of Visual Analysis at Tableau Software. With Stuart K. Card, George G. Robertson and others he invented a number of Information Visualization techniques. [1] Mr. Mackinlay joined Tableau in 2004 after 18 years specializing in data visualization at Xerox PARC.

Tableau Software was born of academic research, and as the company continues to grow, it is building an R&D division to help build a pipeline of innovation. Jock, who heads up the research team, explains how it works and what his team is working on.

I cite references (most of this blog post is based on Derrick Harris’ interview with Mr. Mackinlay in Gigaom) after this blog post for those of you who want to delve deeper into what Jock’s team is doing.

Best regards,

Michael

### Tableau Software and Their Research Culture

Tableau Software is many things: a fast-growing thorn in the side of legacy analytics vendors, stock-market gold and the poster child for the next generation of user-friendly data analysis, among them. It’s also a company with a deeply rooted and growing research culture that’s responsible for nearly everything users see when they open its popular visualization application. [2]

Tableau itself is the product of a Stanford Ph.D. dissertation by co-founder and Chief Development Officer Chris Stolte, in conjunction with his then-professor and eventual co-founder Pat Hanrahan. Their project, called Polaris, combined a structured query language with a declarative language for describing data visualization. When they commercialized the research by founding Tableau, that combination – which came together into a technology called VizQL – became the defining feature of the drag-and-drop Tableau experience.

However, the true value of what Stolte and Hanrahan created wasn’t just that it let mainstream users query data visually and generate graphs, said Mackinlay. There had been a lot of research around ideal ways to visualize data, including his own, but it often focused on customized views of a single problem or type of analysis. “The real power [of Tableau] was to go through a bunch of different views to answer one question,” Mackinlay said. “All you have to be an expert at is your data and the questions you want to ask of it.”

The new research division within Tableau (technically, it was really created about a year and a half ago) is trying to imagine and create the next set of technologies that change the way data analysis is done. The five-person team, which Mackinlay heads, consists of four visualization experts (including Mackinlay), a couple of whom also specialize in statistics and one of whom specializes in high-performance computing. The fifth member specializes in natural-language processing and computer graphics.

Like most research divisions, the team writes academic papers and works on some projects that might not be applicable for years, but Mackinlay made it pretty clear that the researchers expect everything they’re doing could be commercialized. If there was one thing that separated the famous Bell Labs from Xerox PARC or even Microsoft Research, it’s that Bell was really good at doing really good research that made its way into products, he said. Good research labs need to find the middle ground between nearsighted product upgrades and pie-in-the-sky ideas and, he explained, “You have to have absolutely no gap between the research scientists … and the people who are actually doing the work.”

### Research Leads to Tableau Story Points Feature

It’s at a much, much smaller scale than Bell Labs, but Mackinlay thinks Tableau is heading down the right path. For example, he said, the Story Points feature in the latest release of the company’s software, which allows users to create data slideshows, was the result of tight work between the product team and researcher Robert Kosara (photo, right), who had been doing research into this area for years. As data volumes, dataset complexity and user sophistication all increase, Mackinlay said systems-level research into data processing (including how to optimize for increased client-side computing power) has helped and will continue to help deliver a smooth user experience.

He’s understandably less forthcoming about what, specifically, we can expect to see from Tableau in the near term, but Mackinlay did discuss a few areas of interest. One is making it easier to use aesthetically pleasing icons rather than text labels in charts, an area where he and colleague Vidya Setlur (the aforementioned NLP and graphics specialist) recently published a paper. He’s also interested in text analysis and NLP, and generally adding new types of visualizations, some of which those types of analysis will help enable. For example, “node-link diagrams” (aka graphs) will happen, he said, although he can’t put an exact date on when.

Mackinlay also suggested that Tableau might expand beyond its current product lineup, which is essentially the same software delivered via the desktop (free and paid), server or cloud. “We can make our existing products easy to use,” Mackinlay said. “We can also make new products that are easy to use — perhaps radically easier than our existing products.”

Although the word “easy” is kind of a misnomer, it’s one that’s used to describe Tableau and other user-friendly software quite often. “Easy” connotes shallowness, Mackinlay said, making an analogy to the evolution of the telephone. Phones have evolved a great deal from those where users just rang the operator, to rotary phones, and now to modern smartphones. With every iteration, manufacturers had to strike the right balance between maintaining a recognizable experience and adding more capabilities.

“We use the two words ‘simple’ and ‘useful,’” he said. “… If you don’t make sure you’re useful, people just aren’t going to stick with you.”

—————————————————————————–

References

[1]  Jock D. Mackinlay, Wikipedia.com, http://en.wikipedia.org/wiki/Jock_D._Mackinlay.

[2] Derrick Harris, A tiny research team at Tableau is building tomorrow’s UX for data, Gigaom, July 7, 2014, http://gigaom.com/2014/07/07/a-tiny-research-team-at-tableau-is-building-tomorrows-ux-for-data/.

# Small Multiples, Tableau and Ben Jones

Readers:

My BI world is changing a bit as I move more towards using Cognos and Tableau at work. In particular, I have a lot of status reports and dashboards to create for my leadership and I have been doing these mostly in Tableau.

I had a situation recently where I wanted to create a small multiples chart to replace a 3D Bar Chart that already existed. I have created small multiples charts fairly easily in MicroStrategy in my previous work, but had never created one in Tableau. I reached out to Ben Jones (photo, right) at Tableau. I have been a big fan of Ben’s DataRemixed blog for quite some time and have blogged about Ben many times in the past. Ben was gracious enough to create a simple example small multiples chart for me to use to accomplish what I wanted to visualize. I was really impressed that Ben and Tableau did not put me through any red tape for him to help me. He saw I had a need and he helped me.

Many thanks to Ben for his help, and I hope this example is useful to you.

Best Regards,

Michael

### Small Multiples

A small multiple (sometimes called trellis chart, lattice chart, grid chart, or panel chart) is a series or grid of small similar graphics or charts, allowing them to be easily compared. The term was popularized by data visualization pioneer, Edward Tufte.

According to Tufte (Envisioning Information, p. 67):

At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives. For a wide range of problems in data presentation, small multiples are the best design solution.

### A Small Multiples Example by Andrew Gelman

One of the most well-known examples of the use of small multiples is Andrew Gelman’s analysis of public support for vouchers, broken down by religion/ethnicity, income, and state (see image below).

Mr. Gelman is a professor of statistics and political science and director of the Applied Statistics Center at Columbia University. His books include Bayesian Data Analysis (with John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin), Teaching Statistics: A Bag of Tricks (with Deb Nolan), Data Analysis Using Regression and Multilevel/Hierarchical Models (with Jennifer Hill), Red State, Blue State, Rich State, Poor State: Why Americans Vote the Way They Do (with David Park, Boris Shor, and Jeronimo Cortina), and A Quantitative Tour of the Social Sciences (co-edited with Jeronimo Cortina).

Andrew has done research on a wide range of topics, including: why it is rational to vote; why campaign polls are so variable when elections are so predictable; why redistricting is good for democracy; reversals of death sentences; police stops in New York City; the statistical challenges of estimating small effects; the probability that your vote will be decisive; seats and votes in Congress; social network structure; arsenic in Bangladesh; radon in your basement; toxicology; medical imaging; and methods in surveys, experimental design, statistical inference, computation, and graphics.

[Click on Image to Enlarge]

### My Small Multiples Chart

Since I cannot show you what I used the small multiples chart for related to my job, I made an illustrative, simple example related to home sales in different regions for the past six months. Below is an example of my chart, which I created using Tableau.

[Click on Image to Enlarge]

### Adding Trend Lines

One of the key features I wanted to use in my chart was to be able to show trend lines for each small multiple.

However, when I clicked on Trend Lines -> Show Trend Lines, I kept getting the following error message:

Ben pointed out that on the Columns shelf of my original chart, Month needed to be a Continuous field (green pill) rather than a Discrete field (blue pill).  If you click on the Month pill, you should be able to select “Change to Continuous,” and then you should be able to add a trend line. This is because a trend line can only be calculated when two axes are involved. The way I had it set up, the Columns were just different categories or attributes, rather than continuous values along an axis.
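To see why a continuous axis matters, here is a minimal Python sketch that fits a straight-line trend to each panel of a small multiples chart. It uses ordinary least squares, which is my assumption for illustration and not necessarily exactly what Tableau computes, and the regions and sales figures are made up. The slope formula needs a numeric x value for every point; a list of category labels has no such values, which is the same problem I ran into with the discrete Month pill.

```python
def linear_trend(xs, ys):
    """Ordinary least-squares line through (x, y) points.
    Returns (slope, intercept). Requires numeric x values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Months as continuous numbers (1 = first month, 6 = sixth month).
months = [1, 2, 3, 4, 5, 6]

# Made-up home sales per region, one list per small multiple.
sales_by_region = {
    "East": [10, 12, 14, 16, 18, 20],
    "West": [30, 28, 27, 25, 24, 22],
}

# One trend line per panel.
trends = {region: linear_trend(months, sales)
          for region, sales in sales_by_region.items()}

for region, (slope, intercept) in trends.items():
    print(f"{region}: sales = {slope:.2f} * month + {intercept:.2f}")
```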

I thought this would be a nice tip to pass along.

I hope to be able to share more Tableau tips as I become more proficient with the tool.