A quick guide to the mutate functions in dplyr
Data wrangling is an essential part of the data analysis process. In R, the dplyr package provides a set of tools to make this task more efficient. Among these tools is the mutate family of functions, which offer versatile ways to add or modify columns in a data frame.
In this post, we will look at five functions of the mutate family: mutate_all, mutate_at, mutate_each, mutate_each_, and mutate_if.
We will use the mtcars and iris datasets to see each function in practice.
1. mutate_all
The mutate_all function applies an operation to every column in a data frame. This is useful when you want to transform all columns uniformly.
library(dplyr)
mtcars %>%
  mutate_all(~ . * 2) # Multiply every column by 2 (all mtcars columns are numeric)
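In this example, we use mutate_all to double the values of every column in the mtcars dataset. This works because all mtcars columns are numeric; mutate_all does not skip non-numeric columns, so the same call on a data frame with a character column would fail.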
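Since dplyr 1.0.0, scoped verbs such as mutate_all are superseded by across(). As a minimal sketch of the same doubling in that style, assuming dplyr 1.0.0 or later is installed (the name doubled is only illustrative):

```r
library(dplyr)

# Double every column of mtcars; everything() selects all columns.
doubled <- mtcars %>%
  mutate(across(everything(), ~ .x * 2))

# Compare the first few values of one column with the original.
head(doubled$mpg)
head(mtcars$mpg)
```

The column names and the number of columns stay the same; only the values change.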
2. mutate_at
mutate_at transforms only the columns you name, which you pass through vars(). This is handy when a function should apply to specific columns.
mtcars %>%
  mutate_at(vars(mpg, hp), ~ . * 2) # Multiply 'mpg' and 'hp' columns by 2
Here, we use mutate_at to double the values in the 'mpg' and 'hp' columns of the mtcars dataset. All other columns are left unchanged.
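If the original values should be kept alongside the transformed ones, one option is across() with its .names argument. This is a sketch assuming dplyr 1.0.0 or later; the "_x2" suffix and the name with_doubles are made up for illustration:

```r
library(dplyr)

# Keep mpg and hp as they are and add doubled copies named mpg_x2 and hp_x2.
with_doubles <- mtcars %>%
  mutate(across(c(mpg, hp), ~ .x * 2, .names = "{.col}_x2"))

# Show the original and the new columns side by side.
head(with_doubles[, c("mpg", "mpg_x2", "hp", "hp_x2")])
```

Without .names, across() overwrites the selected columns in place, which matches what the mutate_at call above does.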
3. mutate_each
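Unlike mutate_all, mutate_each accepts a column selection after the function, and the selection can be negative, so you can transform every column except the ones you name. Note that mutate_each has been deprecated since dplyr 0.7.0 in favor of mutate_all, mutate_at, and mutate_if, so you will mostly meet it in older code.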
iris %>%
  mutate_each(funs(log(.)), -Species) # Apply log transformation to every column except 'Species'
This example uses mutate_each to apply a logarithmic transformation to the numeric columns in the iris dataset, excluding the 'Species' column.
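Because mutate_each is deprecated, a current dplyr may warn on the call above. A sketch of the same transformation with across(), assuming dplyr 1.0.0 or later (the name iris_logged is illustrative):

```r
library(dplyr)

# Log-transform every column except Species, which is left untouched.
iris_logged <- iris %>%
  mutate(across(-Species, log))

head(iris_logged)
```

The negative selection -Species plays the same role here as it does in the mutate_each call.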
4. mutate_each_
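mutate_each_ is the standard evaluation version of mutate_each: it takes the column names as a character vector rather than as bare names, which enables dynamic column selection through external variables. Like mutate_each, it is deprecated.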
columns_to_transform <- c("Sepal.Length", "Petal.Length")
iris %>%
  mutate_each_(funs(log(.)), columns_to_transform)
In this illustration, we use mutate_each_ to apply a logarithmic transformation to the columns named in the columns_to_transform vector.
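In current dplyr, a character vector of column names is usually passed through all_of() inside across(). A sketch of the same idea, assuming dplyr 1.0.0 or later (iris_subset_logged is an illustrative name):

```r
library(dplyr)

# Column names held in an ordinary character vector.
columns_to_transform <- c("Sepal.Length", "Petal.Length")

# all_of() selects exactly the columns named in the vector
# and raises an error if any of them is missing from the data.
iris_subset_logged <- iris %>%
  mutate(across(all_of(columns_to_transform), log))

head(iris_subset_logged)
```

Only the two named columns change; Sepal.Width, Petal.Width, and Species keep their original values.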
5. mutate_if
mutate_if applies a function only to the columns for which a predicate function, such as is.numeric, returns TRUE.
mtcars %>%
  mutate_if(is.numeric, ~ . * 2) # Multiply all numeric columns by 2
Here, we use mutate_if to double the values of the numeric columns in the mtcars dataset. Because every mtcars column is numeric, the result matches the mutate_all example.
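The predicate matters more on a data frame with mixed column types. As a sketch, the same call on iris skips the factor column Species; the second call shows an across() equivalent, assuming dplyr 1.0.0 or later (both result names are illustrative):

```r
library(dplyr)

# Species is a factor, so the is.numeric predicate skips it.
iris_doubled <- iris %>%
  mutate_if(is.numeric, ~ . * 2)

# Equivalent selection with across() and where().
iris_doubled_across <- iris %>%
  mutate(across(where(is.numeric), ~ .x * 2))

head(iris_doubled)
```

Both calls double the four measurement columns and leave Species unchanged.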