Plot multi-column data with ggplot

ggplot is a great visualization tool for R. It draws beautiful plots, but it works differently from R's native plotting system and takes some time to get used to.

Here are two examples of how to plot data stored in multiple columns.
The original data have three columns: one x-variable and two y-variables. The data look like this.
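The original data set is not reproduced here. As a stand-in, the following sketch builds a small data frame with the same shape. The column names `x`, `y1`, `y2` and the values are assumptions made up for illustration.

```r
# Hypothetical wide-format data: one x column and two y columns.
# Column names and values are illustrative, not the original data.
df <- data.frame(
  x  = 1:10,
  y1 = (1:10)^2,
  y2 = 10 * (1:10)
)

# Show the first few rows
head(df, 3)
```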

First, I can use a separate geom for each column.
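A minimal sketch of this approach, assuming a wide data frame with hypothetical columns `x`, `y1` and `y2`. Mapping `colour` to a constant string inside `aes()` is one common way to get a legend entry per geom; the colors and the legend title are arbitrary choices, not necessarily what the original plot used.

```r
library(ggplot2)

# Hypothetical wide-format data (assumed column names)
df <- data.frame(x = 1:10, y1 = (1:10)^2, y2 = 10 * (1:10))

# One geom per y column; the string given to colour inside aes()
# becomes the legend key for that line.
p <- ggplot(df, aes(x = x)) +
  geom_line(aes(y = y1, colour = "y1")) +
  geom_line(aes(y = y2, colour = "y2")) +
  # Set the legend title and pick a color for each key
  scale_colour_manual(name = "variable",
                      values = c(y1 = "red", y2 = "blue")) +
  ylab("value")

print(p)
```

Without `scale_colour_manual()`, ggplot would still draw a legend but would choose its own colors and use "colour" as the legend title.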

This was the first approach and the result is below.

Plotting was easy, but I had to spend quite a bit of time figuring out how to change the colors and add the legend. Check Hadley’s answer on this in "How to change the legend title".

Then I found there is another way to do it. It involves reshaping the data using melt(), which comes with the reshape package.

After reshaping by melt, the data look like this.
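The reshaping step might look like the sketch below. It uses `melt()` from reshape2, the successor to the reshape package mentioned above, which is an assumption about what is installed; the data frame and its column names are again hypothetical.

```r
library(reshape2)

# Hypothetical wide-format data (assumed column names)
df <- data.frame(x = 1:10, y1 = (1:10)^2, y2 = 10 * (1:10))

# Keep x as the identifier; stack y1 and y2 into two new columns:
# "variable" (the original column name) and "value" (the number).
df_long <- melt(df, id.vars = "x")

head(df_long, 3)
```

The long table has one row per (x, variable) pair, so 10 rows with two y columns become 20 rows.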

Then the data can be plotted with one geom.
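As a sketch, with the same hypothetical data and reshape2 assumed in place of reshape, a single `geom_line()` is enough once `colour` is mapped to the `variable` column:

```r
library(ggplot2)
library(reshape2)

# Hypothetical wide-format data (assumed column names)
df <- data.frame(x = 1:10, y1 = (1:10)^2, y2 = 10 * (1:10))

# Reshape to long format: columns x, variable, value
df_long <- melt(df, id.vars = "x")

# One geom draws both series; colour separates them and
# produces the legend automatically.
p <- ggplot(df_long, aes(x = x, y = value, colour = variable)) +
  geom_line()

print(p)
```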

With this method, ggplot took care of colors and legend automatically. Cool! Here is the result.

As you can see, the two results are almost identical except for the labels in the legend. The labels follow the column names of the data. I have a feeling, also from several comments online, that ggplot prefers long tables to wide tables, that is, a single column for the y variable. If you take this approach, melt() will be an invaluable tool, and ggplot takes care of many formatting jobs so that the user can save a lot of time.

Update: I found a way to change the labels for the legend.
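The original snippet for this update is not shown here. One way to do it, which may differ from what the update used, is to pass `name` and `labels` to a colour scale. The data and the label text below are hypothetical.

```r
library(ggplot2)
library(reshape2)

# Hypothetical wide-format data (assumed column names)
df <- data.frame(x = 1:10, y1 = (1:10)^2, y2 = 10 * (1:10))
df_long <- melt(df, id.vars = "x")

p <- ggplot(df_long, aes(x = x, y = value, colour = variable)) +
  geom_line() +
  # Override the legend title and the per-series labels;
  # labels are matched to the factor levels y1, y2 in order.
  scale_colour_discrete(name = "Series",
                        labels = c("First series", "Second series"))

print(p)
```

Another option is to rename the levels of the `variable` column before plotting, which changes the labels at the data level instead of in the scale.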
