Adding a Secondary Axis in a ggplot2 Plot was Easier than I Thought!
It was a piece of cake, and made me a happy camper!

This post contains affiliate links, and I will be compensated if you make a purchase after clicking through my links.

My colleague, fintech analyst Josie Haywood of Wiseiye, Inc., and I decided we’d try to attract the attention of the Massachusetts Gaming Commission (MGC) and the three Massachusetts casinos (Plainridge Park Casino, MGM Springfield, and Encore Boston Harbor) in the hope that they would contract with us for big data analytics.

Yes, we dream of casino data! Don’t you?

If you want to read our white paper, just connect with me on LinkedIn and send me your e-mail address, and I’ll e-mail it to you. I summarized the highlights on my blog. What I wanted to talk about here is one of the plots I made using R’s ggplot2 package that we included in our white paper. 

If you want to see how I made all six plots in the white paper, join me today on my YouTube channel at 1:00 pm ET and I’ll go through all my code live with you as my audience!

Make sure you subscribe to my channel so you get e-mails when I start livestreaming.

I picked this plot to show you because it has a secondary axis on it. Let’s start by looking at the data I used.

First, here is a snippet of the data I was plotting:

[Image: table showing a few rows of the dataset, including the StateAB, GGR2018, and Pop2018 columns]

GGR2018 is the gross gaming revenue in 2018 reported by each state. I got the information from a report by the American Gaming Association. Pop2018 is each state's 2018 population, which I got from the census. There are actually 24 states in the dataset, including Massachusetts.

I realized I needed five colors for the plot. I started at the Coolors app to pick hex colors. Then I mapped the hex colors to names that I could recognize.

# Named colors, so the ggplot code can refer to them in plain English
hex_grey <- "black"
default_grey <- "grey50"
hex_blue <- "#0079AD"
hex_red <- "#CA4318"
hex_green <- "#689700"

# One palette for the bars and one for the dots
bar_color_vals <- c(hex_grey, hex_blue, hex_red)
dot_color_vals <- c(default_grey, hex_blue, hex_red)

As you can see, I first assigned each color to a variable with a name I could recognize. I learned that grey50 is a grey that ggplot2 itself uses as a default (for example, for missing values). I thought I might use hex_grey, so I created that variable along with default_grey. Now, I could speak in English in ggplot code about my colors.

As you know, this post is about a secondary axis, so that means that I need to be plotting two things. I decided to plot a set of bars and a set of dots on the same plot, so I created two vectors, bar_color_vals and dot_color_vals, to be used for these two sets of data.
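A quick way to sanity-check a palette like this before plotting is to convert the color names and hex codes to RGB values with base R's col2rgb(). This is only a minimal sketch and was not part of the white paper code; palette_check is a name made up for this illustration.

```r
# Same dot palette as in the post (c() is not needed for a single value)
default_grey <- "grey50"
hex_blue <- "#0079AD"
hex_red <- "#CA4318"

dot_color_vals <- c(default_grey, hex_blue, hex_red)

# col2rgb() stops with an error on an invalid color name,
# so this doubles as a validity check.
# palette_check is a hypothetical name used only for this illustration.
palette_check <- col2rgb(dot_color_vals)
print(palette_check)
```

The result is a matrix with one row each for red, green, and blue, and one column per color, so a typo in a hex code shows up right away.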

Creating the Plot with the Secondary Axis

So here’s the code for the plot:

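library(ggplot2)
library(scales)  # provides dollar for the axis labels

# Bars: GGR in millions of dollars, states sorted from highest to lowest GGR
p <- ggplot(data=Bar_plot, aes(x=reorder(StateAB, -GGR2018), y=GGR2018/1000000, fill=GGR_Level)) +
  geom_bar(stat="identity", position="dodge") +
  # Dots: population in thousands, drawn against the left (primary) axis
  geom_point(data=Bar_plot, aes(x=reorder(StateAB, -GGR2018),
                                y=Pop2018/1000, fill=GGR_Level, color=GGR_Level)) +
  # Mass_Pop must be defined before this runs (its definition is not shown here)
  geom_hline(yintercept = Mass_Pop, color = hex_red) +
  scale_color_manual(values = dot_color_vals) +
  scale_fill_manual(values = bar_color_vals) +
  # The right-hand axis shows the left-axis values divided by 1,000
  scale_y_continuous(labels = dollar, sec.axis = sec_axis(~./1000, name = "2018 Population (in millions)")) +
  ylab("2018 Total GGR (in millions)") +
  xlab("State")

# Theme first, then suppress the point shapes in the legend
p + theme_classic() +
  guides(colour = guide_legend(override.aes = list(shape = NA))) +
  theme(axis.text.x = element_text(size = 12),
        axis.text.y = element_text(size = 12),
        legend.text = element_text(size = 12))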

...and here's the resulting plot:

[Image: the resulting plot, with bars for 2018 GGR on the left y-axis and dots for 2018 population on the secondary (right) y-axis]

Decoding the ggplot2 Code for a Secondary Axis

Here’s what you need to know to decode how the code matches the plot:

  • Bar_plot: This is the dataset being plotted.
  • StateAB: This is the abbreviation for the state (e.g., NV, PA, NJ) as shown in the table above. As you can see, these labels end up on the x-axis.
  • GGR2018: This is the gross gaming revenue (GGR) in dollars from 2018 as shown in the table above. This is what the bars show on the y-axis. Notice the reorder() function with the negative sign on GGR2018, which sorts the states from highest to lowest GGR. Also notice y=GGR2018/1000000, which scales the values to millions; the left y-axis is labeled accordingly. In ggplot2, the y aesthetic defines the primary axis, which is drawn on the left by default.
  • GGR_Level: This is the variable that is in the legend, which describes the categories. The fill option tells R to fill the bars according to this factor variable.
  • Pop2018: This is the 2018 population of the state as shown in the table. You will notice in the geom_point code line, the 2018 population is divided by 1,000. This has to do with the secondary axis (see below).
  • dot_color_vals and bar_color_vals: These are vectors of hex colors set up earlier.
  • scale_y_continuous: This is where we declare the secondary axis. Notice that the labels argument, which formats the axis as “dollar”, applies to the left (primary) y-axis. All the secondary axis code is in this piece: sec.axis = sec_axis(~./1000, name = "2018 Population (in millions)"). The ~ starts a formula in which the dot stands for the values on the left y-axis, so ~./1000 means each label on the right axis is the matching left-axis value divided by 1,000. In other words, the right axis is only a relabeling of the left one, and it does not move the dots. The dots are positioned by y=Pop2018/1000 in the geom_point line, on the left axis's scale, which puts the population in thousands; dividing by another 1,000 in sec_axis makes the right axis read in millions. The two divisions have to be chosen together so the labels stay truthful. Trust me, I experimented with this. If the population is scaled down too far, the dots hug the bottom of the plot, and if it is not scaled down enough, they tower over the bars. Picking that scaling is actually the tricky part of the secondary axis, not the actual command.
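Instead of finding the denominator by trial and error, one option is to compute a single scale factor from the data, multiply the dots by it, and undo it in sec_axis(). The sketch below is only an illustration of that idea, not the code from the white paper: the toy data frame, its made-up placeholder values, and the names ggr_millions, pop_millions, scale_factor, and p_toy are all assumptions introduced here.

```r
library(ggplot2)

# Made-up placeholder values for illustration only; these are NOT the
# figures from the white paper, and AA/BB/CC are not real states.
toy <- data.frame(
  StateAB = c("AA", "BB", "CC"),
  GGR2018 = c(9.0e9, 3.0e9, 1.5e9),   # dollars
  Pop2018 = c(3.0e6, 12.0e6, 6.0e6)   # people
)

# Put both series in display units first
toy$ggr_millions <- toy$GGR2018 / 1e6
toy$pop_millions <- toy$Pop2018 / 1e6

# One factor that maps population (millions) onto the GGR (millions) scale,
# so the highest dot lines up with the tallest bar
scale_factor <- max(toy$ggr_millions) / max(toy$pop_millions)

p_toy <- ggplot(toy, aes(x = reorder(StateAB, -GGR2018))) +
  geom_col(aes(y = ggr_millions)) +
  # Dots are drawn against the primary axis, so scale them up here...
  geom_point(aes(y = pop_millions * scale_factor)) +
  scale_y_continuous(
    name = "2018 Total GGR (in millions)",
    # ...and undo the same factor so the right axis reads in millions of people
    sec.axis = sec_axis(~ . / scale_factor, name = "2018 Population (in millions)")
  ) +
  xlab("State")

print(p_toy)
```

Because the same scale_factor appears in both geom_point and sec_axis, the right-hand labels stay consistent with the dots no matter what factor is chosen.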

As Easy as This Was, it Was Still a Little Kludgy

I just want to make one more point. Being a plot snob, I wanted the legend title to say GGR Level instead of GGR_Level (which it takes from the variable name in the data). But the reason I did not change this is that I struggled with finding a way to have R print just one legend (for the bars, not the points). I achieved this with the guides(colour = guide_legend(override.aes = list(shape = NA))) command, applied after theme_classic(). But when I tried to add a legend title to this, it duplicated the legend again and did not do the override. That’s why I just gave up and left the underscore in.
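One approach that may avoid the duplicated legend: ggplot2 merges the fill and colour legends only when they share the same title and labels, so renaming just one of them splits them apart. Giving both aesthetics the same title, for example with labs(), should keep them as one legend. Here is a minimal sketch with made-up placeholder data (not the white paper figures); the Low/Medium/High level names, the column names, and p_legend are assumptions for illustration.

```r
library(ggplot2)

# Made-up placeholder values and hypothetical level names, for illustration only
toy <- data.frame(
  StateAB = c("AA", "BB", "CC"),
  ggr_millions = c(9000, 3000, 1500),
  pop_scaled = c(2250, 9000, 4500),
  GGR_Level = factor(c("High", "Medium", "Low"),
                     levels = c("Low", "Medium", "High"))
)

p_legend <- ggplot(toy, aes(x = reorder(StateAB, -ggr_millions), fill = GGR_Level)) +
  geom_col(aes(y = ggr_millions)) +
  geom_point(aes(y = pop_scaled, colour = GGR_Level)) +
  scale_fill_manual(values = c("black", "#0079AD", "#CA4318")) +
  scale_colour_manual(values = c("grey50", "#0079AD", "#CA4318")) +
  # Same title on both aesthetics, so the two legends can stay merged
  labs(fill = "GGR Level", colour = "GGR Level",
       x = "State", y = "GGR (in millions)") +
  # Hide the point shapes in the legend, as in the original plot
  guides(colour = guide_legend(override.aes = list(shape = NA))) +
  theme_classic()

print(p_legend)
```

If the title is set on only one of fill or colour, the legend is expected to split in two again, which matches the behavior described above.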

So that's your data science makeover for today! Don't forget to visit my blog and join me for my live stream at 1:00 pm ET today!

Monika Wahi is an epidemiologist and data scientist. Check out her LinkedIn Learning courses on R, SAS, and study design with big data!


Hello! Is there any way to add a secondary axis that is independent of the left one? Is it possible? Could anyone help me? Thank you.
