Quick forest plots in R
I have an unexplained fascination with the word forest, and with anything named after it!
And if it’s about plotting, it better be nice and quick!
Let’s get to work:
Step 1: getting the data ready
id = c(1, 2, 3, 4, 5, 6)
studies = c("Shelley et al", "Milton et al", "Snow et al", "Neruda et al", "Giovanni et al", "Li et al")
rr = c(1.45, 3.41, 2.23, 2.18, 4.12, 1.93)
lower = c(1.05, 2.41, 1.23, 1.18, 2.12, 0.93)
upper = c(2.45, 4.63, 3.05, 3.21, 5.12, 2.97)
ci = c("1.05, 2.45", "2.41, 4.63", "1.23, 3.05", "1.18, 3.21", "2.12, 5.12", "0.93, 2.97")
df = data.frame(id, studies, rr, lower, upper, ci)
# check the df
> df
  id        studies   rr lower upper         ci
1  1  Shelley et al 1.45  1.05  2.45 1.05, 2.45
2  2   Milton et al 3.41  2.41  4.63 2.41, 4.63
3  3     Snow et al 2.23  1.23  3.05 1.23, 3.05
4  4   Neruda et al 2.18  1.18  3.21 1.18, 3.21
5  5 Giovanni et al 4.12  2.12  5.12 2.12, 5.12
6  6       Li et al 1.93  0.93  2.97 0.93, 2.97
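Typing the ci labels by hand invites typos, so here is a minimal sketch of building them from the bounds instead. It assumes two decimals are enough; sprintf() is base R, and the lower and upper vectors are simply copied from step 1.

```r
# Bounds copied from step 1
lower <- c(1.05, 2.41, 1.23, 1.18, 2.12, 0.93)
upper <- c(2.45, 4.63, 3.05, 3.21, 5.12, 2.97)

# Format each interval as "lower, upper" with two decimals
ci <- sprintf("%.2f, %.2f", lower, upper)
ci
```

The result can go straight into data.frame() in place of the hand-typed ci vector.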
Step 2: good old ggplot:
library(ggplot2)

plot <- ggplot(df, aes(y = id, x = rr)) +
  geom_point(shape = 18, size = 5) +  # shape 18 is a filled diamond
  geom_errorbarh(aes(xmin = lower, xmax = upper), height = 0.25) +
  xlab('Risk ratios with 95% CI')
Getting rid of the unnecessary details:
plot = plot + theme(axis.text.y=element_blank()) + ylab("")
And time for some visual dessert:
# Constants go outside aes(); inside it, ggplot treats 'tomato' as a data value rather than a color
plot + geom_text(aes(label = studies), color = 'tomato', size = 3.5, hjust = -0.5, vjust = -1)
And an optional reference line at a risk ratio of 1 (no effect):
plot +
  geom_text(aes(label = studies), color = 'tomato', size = 3.5, hjust = -0.5, vjust = -1) +
  # linewidth needs ggplot2 >= 3.4.0; on older versions use size instead
  geom_vline(xintercept = 1, color = "red", linetype = "dashed", linewidth = 1, alpha = 0.5)
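Ratio measures are often drawn on a log scale, so that 0.5 and 2 sit at the same distance from 1. Here is a rough sketch of that variant, assuming the same six studies and ggplot2 3.4.0 or later. The axis breaks are an arbitrary choice for this data, and p_log is just a name picked for illustration. It also moves the study names onto the y-axis rather than floating them as text.

```r
library(ggplot2)

# Same illustrative data as in step 1
df <- data.frame(
  id      = 1:6,
  studies = c("Shelley et al", "Milton et al", "Snow et al",
              "Neruda et al", "Giovanni et al", "Li et al"),
  rr      = c(1.45, 3.41, 2.23, 2.18, 4.12, 1.93),
  lower   = c(1.05, 2.41, 1.23, 1.18, 2.12, 0.93),
  upper   = c(2.45, 4.63, 3.05, 3.21, 5.12, 2.97)
)

p_log <- ggplot(df, aes(y = id, x = rr)) +
  # Reference line first, so it sits behind the estimates
  geom_vline(xintercept = 1, color = "red", linetype = "dashed", alpha = 0.5) +
  geom_errorbarh(aes(xmin = lower, xmax = upper), height = 0.25) +
  geom_point(shape = 18, size = 5) +
  # Study names as y-axis labels, one per id
  scale_y_continuous(breaks = df$id, labels = df$studies) +
  # Log-scaled x-axis; these breaks are an assumption, adjust to your data
  scale_x_log10(breaks = c(0.5, 1, 2, 4, 8)) +
  xlab("Risk ratio with 95% CI (log scale)") +
  ylab("")

p_log
```

Whether the log scale is worth it depends on the audience: it is the more faithful picture of ratios, but the plain linear axis is quicker to read for a casual viewer.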
For those who want to use symbols other than a diamond, try changing the value of the shape option. I like the filled circle, which is shape 20:
ggplot(df, aes(y = id, x = rr)) +
  geom_point(shape = 20, size = 5, color = 'indianred') +
  geom_errorbarh(aes(xmin = lower, xmax = upper), height = 0.25) +
  xlab('Risk ratios with 95% CI')
Can't get quicker than that, can it?!
And finally, here is a slightly more sophisticated one:
df <- data.frame(
  variable = c("Age:<30", "30-39", "40-49", "50-59", "60-69", "70-79", "80+", "Sex: Male", "Female"),
  oddsratio = c(1, 1.08, 1.11, 1.10, 1.14, 1.06, 0.78, 1, 0.71), # Reference categories have OR = 1
  ci_lower = c(1, 1.01, 1.04, 1.03, 1.07, 0.99, 0.71, 1, 0.69),  # Reference rows: no interval, so both bounds are set to 1
  ci_upper = c(1, 1.16, 1.19, 1.17, 1.22, 1.13, 0.85, 1, 0.74)
)
# Keep the rows in the order given above (top to bottom) instead of alphabetical order
df$variable <- factor(df$variable, levels = rev(df$variable))
ggplot(df, aes(x = oddsratio, y = variable)) +
  # Reference rows have ci_lower == ci_upper; they get a hollow marker
  geom_point(aes(shape = ci_lower == ci_upper), color = '#FF0000', size = 3) +
  # No error bar for reference rows; linewidth sets the line thickness (ggplot2 >= 3.4.0)
  geom_errorbarh(data = subset(df, ci_lower < ci_upper),
                 aes(xmin = ci_lower, xmax = ci_upper),
                 color = "#756443", height = 0.1, linewidth = 0.2) +
  geom_vline(xintercept = 1, linetype = "dashed", color = "#345333", linewidth = 0.2) +
  scale_shape_manual(values = c("FALSE" = 16, "TRUE" = 1)) + # 16 = filled circle, 1 = hollow circle
  labs(title = "Forest Plot of Odds Ratios with 95% CI",
       x = "Odds Ratio",
       y = "Variable") +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        legend.position = "none")
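In practice the odds ratios above would come out of a fitted model rather than being typed in. Below is a minimal sketch of that step, assuming a logistic regression fitted with glm() and Wald intervals from confint.default(). The data is simulated purely for illustration and has nothing to do with the numbers plotted above; sim, fit and or_df are made-up names.

```r
# Simulated data purely for illustration; not the data behind the plot above
set.seed(1)
n <- 500
sim <- data.frame(
  sex = factor(sample(c("Male", "Female"), n, replace = TRUE),
               levels = c("Male", "Female")),
  age = factor(sample(c("<30", "30-39", "40+"), n, replace = TRUE),
               levels = c("<30", "30-39", "40+"))
)
sim$outcome <- rbinom(n, size = 1, prob = 0.3)

# First factor level of each variable acts as the reference category
fit <- glm(outcome ~ age + sex, data = sim, family = binomial)

# Wald intervals on the log-odds scale, exponentiated to odds ratios
est <- exp(coef(fit))
ci  <- exp(confint.default(fit))

or_df <- data.frame(
  variable  = names(est),
  oddsratio = unname(est),
  ci_lower  = ci[, 1],
  ci_upper  = ci[, 2],
  row.names = NULL
)
# The intercept is a baseline odds, not an odds ratio, so drop it
or_df <- or_df[or_df$variable != "(Intercept)", ]
or_df
```

The resulting or_df has the same column names as the df used for the plot, so it can be passed to the same ggplot() call once the reference rows (OR = 1) are added by hand.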
Happy foresting!