1. Violin graph is like box plot, but better
Box-and-whisker plots are great. They show medians, ranges and variability effectively. They allow comparing groups of different sizes. They are super simple to create and read, so naturally, they are all over the place.
But box plots can be misleading. They are not affected by the shape of the data’s distribution. When the data “morph” but manage to keep the same summary statistics (medians, quartiles and ranges), their box plots stay the same.
Box plots can be misleading (Source: autodeskresearch.com)
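To make the trap concrete, here is a minimal sketch using only Python's standard library. The two samples and the helper name five_number are made up for illustration and are not taken from the figure above. Both samples share the same minimum, quartiles, median and maximum, so their box plots would look identical, yet one is spread out and the other is piled up in two clumps.

```python
from collections import Counter
from statistics import quantiles

def five_number(data):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))

# Illustrative samples (assumed for this sketch, not real measurements).
spread_out = [0, 2, 4, 5, 6, 7, 8, 10, 12]   # values spread fairly evenly
two_clumps = [0, 4, 4, 4, 6, 8, 8, 8, 12]    # values piled up at 4 and 8

print(five_number(spread_out))
print(five_number(two_clumps))

# A crude text histogram exposes the difference the box plot hides.
for name, sample in [("spread_out", spread_out), ("two_clumps", two_clumps)]:
    counts = Counter(sample)
    print(name)
    for value in sorted(counts):
        print(f"{value:>4} {'*' * counts[value]}")
```

Both calls to five_number return the same tuple, while the text histograms look nothing alike. That gap is exactly what a violin plot fills in.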
This is when violin graphs, or violin plots, come to the rescue. A violin plot carries all the information that a box plot would (it literally has a box plot inside the violin) but doesn’t fall into the distribution trap. How?
2. Violin graph is like density plot, but waaaaay better
The “violin” shape of a violin plot comes from the data’s density plot. You just turn that density plot sideways and put it on both sides of the box plot, mirroring each other.
Reading the violin shape is exactly how you read a density plot: the thicker part means the values in that section of the violin have higher frequency, and the thinner part implies lower frequency.
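The "turn it sideways and mirror it" idea can be sketched in a few lines of plain Python. This is a hand-rolled Gaussian kernel density estimate, not what any particular plotting package does internally. The function names, the sample values and the bandwidth of 1.0 are all assumptions chosen for illustration.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function that estimates the density of `data` at a point."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )
    return density

def violin_outline(data, bandwidth=1.0, steps=13):
    """Return (value, left_edge, right_edge) rows: the density, mirrored."""
    density = gaussian_kde(data, bandwidth)
    lo, hi = min(data), max(data)
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return [(y, -density(y), density(y)) for y in grid]

# Illustrative sample with two clumps, at 4 and at 8 (assumed data).
sample = [0, 4, 4, 4, 6, 8, 8, 8, 12]
outline = violin_outline(sample)

# Draw the violin as text: the value axis runs vertically, and the same
# density bar is printed on both sides of the center line.
for y, left, right in reversed(outline):
    bar = "#" * round(right * 100)
    print(f"{y:5.1f} {bar:>20}|{bar}")
```

The printed shape bulges around 4 and 8 and narrows toward the ends and around 6, which is the thick-means-frequent reading described above.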
Why don’t we just use a density plot instead of a violin plot? When there are too many groups (more than 3), their overlapping density plots become difficult to read. This doesn’t happen with violin plots, because…
3. Violin graph is visually intuitive and attractive
To compare different sets, their violin plots are placed side by side. They don’t sit on top of one another and don’t create the “uhm, what color is for what set again?” puzzle.
Violin plots are easy to read. The dot in the middle is the median. The box shows the interquartile range. The whiskers typically extend to the most extreme values within 1.5 times the interquartile range from the box, though the exact convention depends on the plotting tool. The shape of the violin displays the frequencies of values. Different violins are different sets. You don’t even have to fill the chart with colors and patterns to distinguish sets, which actually makes it less distracting and easier to read. Even xkcd agrees.
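For a side-by-side comparison like the one described here, a minimal sketch with matplotlib (assuming it is installed) could look like the following. The three groups are randomly generated stand-ins, and the file name violins.png is arbitrary. Note that matplotlib's violinplot draws the median and the extremes as lines rather than an inner box, so the result is a plainer violin than the one described above.

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

# Made-up groups for illustration: two roughly normal, one with two clumps.
rng = random.Random(0)
groups = {
    "A": [rng.gauss(5, 1) for _ in range(200)],
    "B": [rng.gauss(7, 2) for _ in range(200)],
    "C": [rng.gauss(4, 1) for _ in range(100)]
         + [rng.gauss(9, 1) for _ in range(100)],
}

fig, ax = plt.subplots()
# One violin per group, placed side by side at x = 1, 2, 3.
parts = ax.violinplot(list(groups.values()), showmedians=True, showextrema=True)
ax.set_xticks(range(1, len(groups) + 1))
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel("value")
fig.savefig("violins.png")
```

Each set gets its own violin at its own position, so no legend or color coding is needed to tell the sets apart.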
4. Violin graph is non-parametric
Unlike bar graphs with means and error bars, violin plots reflect the distribution of all data points. This makes them an excellent tool to visualize samples of small sizes. Violin plots are perfectly appropriate even if your data do not conform to a normal distribution. They work well for comparing a quantitative variable across qualitative (categorical) groups.
5. There are many ways to use violin graphs
Honestly, who doesn’t want to wow their audiences with these plots?
Violin plot made easy with BioVinci
There is a way to make the oh-so-pretty violin plots without wrestling with the ggplot2 package. BioVinci is software that lets you create violin plots (or any other kind of plot) in just a few minutes. Check out this tutorial:
Questions about violin plots or BioVinci? Leave a comment, we’ll get back to you right away.