Making creative barplots with ggpubr
Barplots are highly versatile and can be used to show comparisons, proportions, distributions, and trends among many other things.
Creating visually appealing barplots is critical to communicating findings effectively. The ggplot2 package provides an excellent platform for making barplots, and luckily, there are a few other packages that come with exquisite recipes for people who like to explore new tastes all the time. In this blog post, we will try the ggpubr package, which delivers both aesthetics and functionality in creating barplots. The feature I like the most in ggpubr is the ease of making barplots that are ordered within groups.
Let’s first create a fictitious dataset on state-wise population in India:
# states in India
states <- c("Andhra Pradesh", "Arunachal Pradesh", "Assam", "Bihar", "Chhattisgarh",
            "Goa", "Gujarat", "Haryana", "Himachal Pradesh", "Jharkhand",
            "Karnataka", "Kerala", "Madhya Pradesh", "Maharashtra", "Manipur",
            "Meghalaya", "Mizoram", "Nagaland", "Odisha", "Punjab", "Rajasthan",
            "Sikkim", "Tamil Nadu", "Telangana", "Tripura", "Uttar Pradesh",
            "Uttarakhand", "West Bengal")
# population of each state
population <- c(53903393, 1229964, 35607039, 119461013, 29436231, 1540000,
                63872399, 28204692, 7560770, 38593948, 67562686, 35699443,
                85358965, 123144223, 3091545, 3366710, 1239244, 2249695,
                46356334, 30141373, 81032689, 690251, 77841267, 39362732,
                4169794, 237882725, 11082791, 91276115)
# ecological zone of each state (assigned at random, so your values will differ between runs)
climate <- sample(c("Mountains", "Wetlands", "Plainland"), size = length(states), replace = TRUE)
# combine the 3 columns into a single data frame
india <- data.frame(states, population, climate)
Here are the first six rows of the data frame:
head(india)
             states population   climate
1    Andhra Pradesh   53903393 Plainland
2 Arunachal Pradesh    1229964 Mountains
3             Assam   35607039 Mountains
4             Bihar  119461013  Wetlands
5      Chhattisgarh   29436231 Mountains
6               Goa    1540000  Wetlands
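Because sample() draws at random, the climate column will differ from run to run, and your rows may not match the output above. If you want the same assignment every time, one option is to fix the random seed before sampling. The sketch below is a minimal illustration: draw_zones() is a hypothetical helper, and the seed value 42 is arbitrary (it is not the seed behind the output shown in this post).

```r
# Ecological zones to draw from (same labels as in the post)
zones <- c("Mountains", "Wetlands", "Plainland")

# Hypothetical helper: draw n zones reproducibly.
# The default seed (42) is an arbitrary assumption.
draw_zones <- function(n, seed = 42) {
  set.seed(seed)                           # fix the random number generator state
  sample(zones, size = n, replace = TRUE)  # same call as in the post
}

first  <- draw_zones(28)  # 28 = number of states in the dataset
second <- draw_zones(28)

# Both draws use the same seed, so they are identical
identical(first, second)
```

In the post's workflow, calling set.seed() once right before the sample() line would have the same effect.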
Let’s now make an ordered bar chart for population by states within the three ecological regions:
# load ggpubr (plotting functions) and dplyr (pipe and mutate)
library(ggpubr)
library(dplyr)

india %>%
  ggbarplot(x = "states", y = "population",
            fill = "climate",
            color = "white",
            palette = "jco",
            sort.val = "asc",
            sort.by.groups = TRUE,
            x.text.angle = 60,
            ylab = "Population",
            xlab = FALSE
  )
Where:
x = "states": The x-axis of the plot shows the states in India.
y = "population": The y-axis of the plot shows the population of each state.
fill = "climate": The bars are filled with colors corresponding to the ecological zone (the climate column) of each state.
color = "white": The borders of the bars are white.
palette = "jco": The fill colors are taken from the "jco" color palette.
sort.val = "asc": The bars are sorted in ascending order of their height.
sort.by.groups = TRUE: The bars are sorted within each climate group rather than across the whole dataset.
x.text.angle = 60: The text on the x-axis is tilted at a 60-degree angle.
ylab = "Population": The y-axis label is "Population".
xlab = FALSE: The x-axis label is removed.
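To build some intuition for what sort.by.groups changes, the ordering can be mimicked in base R on a toy data frame. This is only a sketch of the idea (sort by group first, then by value), not a description of how ggpubr implements it internally. The toy data frame and its values are made up for illustration, and the order of the groups themselves in an actual plot may differ.

```r
# Toy data (hypothetical values, not from the india dataset)
toy <- data.frame(
  states     = c("A", "B", "C", "D", "E"),
  population = c(50, 10, 30, 20, 40),
  climate    = c("Wetlands", "Mountains", "Wetlands", "Mountains", "Plainland")
)

# Idea behind sort.by.groups = TRUE: order by group, then by value within group
grouped <- toy[order(toy$climate, toy$population), ]
grouped$states

# Idea behind sort.by.groups = FALSE: order by value only
global <- toy[order(toy$population), ]
global$states
```

In the grouped version, each zone forms a contiguous block of bars that climbs from its smallest to its largest state; in the global version, zones are interleaved and the bars rise across the whole plot.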
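Our next goal is to make a barplot that separates states with populations above and below the mean. To do that, we need an extra step: standardize the population into z-scores, then create a grouping factor that labels each state as "low" (below the mean) or "high" (at or above the mean).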
india <- india %>%
  mutate(population_z = (population - mean(population)) / sd(population),
         pop_grp = factor(ifelse(population_z < 0, "low", "high"),
                          levels = c("low", "high")))
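The standardization step can be checked on a small vector using base R only. The numbers below are made up for illustration; the point is that z-scores have mean 0 and standard deviation 1, so the sign of the z-score tells us on which side of the mean a value falls.

```r
# Toy populations (hypothetical values, not from the india dataset)
pop <- c(10, 20, 30, 40, 100)

# Same formulas as in the mutate() call above
pop_z   <- (pop - mean(pop)) / sd(pop)
pop_grp <- factor(ifelse(pop_z < 0, "low", "high"),
                  levels = c("low", "high"))

# z-scores are centered on 0 with unit standard deviation
all.equal(mean(pop_z), 0)
all.equal(sd(pop_z), 1)

# Count how many values fall in each group
table(pop_grp)
```

Here the mean is 40, so the value 40 gets a z-score of exactly 0 and lands in the "high" group, because the condition tests for strictly negative z-scores.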
Here is the code for the barplot, with the bars colored by population group:
ggbarplot(india, x = "states", y = "population_z",
          fill = "pop_grp",
          color = "white",
          palette = "jco",
          sort.val = "asc",
          sort.by.groups = FALSE,
          x.text.angle = 90,
          ylab = "Population z-score",
          xlab = FALSE,
          legend.title = "Population Group"
)
We can additionally rotate the plot to get horizontal bars by specifying rotate = TRUE:
ggbarplot(india, x = "states", y = "population_z",
          fill = "pop_grp",
          color = "white",
          palette = "jco",
          sort.val = "asc",
          sort.by.groups = FALSE,
          x.text.angle = 90,
          ylab = "Population z-score",
          xlab = FALSE,
          legend.title = "Population Group",
          rotate = TRUE
)
Time for some dessert: Lollipop charts
ggdotchart(india, x = "states", y = "population",
           color = "climate",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "ascending",
           add = "segments",
           ggtheme = theme_pubr()
)
Lastly, we will make the chart more informative by plotting the z-scores, ordering the lollipops within the ecological zones, and enlarging the dots so that each state's value can be printed inside them. A dashed line at zero marks the mean.
ggdotchart(india, x = "states", y = "population_z",
           color = "climate",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending",
           add = "segments",
           add.params = list(color = "lightgray", size = 2),
           group = "climate",
           dot.size = 6,
           label = round(india$population_z, 1),
           font.label = list(color = "white", size = 9, vjust = 0.5),
           ggtheme = theme_pubr()
) +
  geom_hline(yintercept = 0, linetype = 2, color = "lightgray")
Happy plotting!