Converting numeric variables with a certain number of unique values to factors

infoart.ca
1 min read · Jan 28, 2023


When we import data into R using packages like foreign or haven, factor columns often get read in as numeric, for better or worse. Converting numeric columns to factors is fairly simple, as long as we have a clear picture of the values in the target columns.

However, this can get challenging when there are too many columns to inspect one by one. The situation can be eased to some extent by batch-converting every column that has only a small number of unique values, since such columns are unlikely to hold truly continuous measurements.

Let’s see an example using the mtcars data. Let’s first check the classes of the columns in the raw data:

sapply(mtcars, class)

col_types <- sapply(mtcars, class)
table(col_types)

col_types
numeric
11

As we can see, all the columns are currently marked as numeric. Now, let’s say we want to convert the columns with ≤5 unique values to factors.
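Before settling on a cutoff, it can help to look at how many distinct values each column actually holds. The snippet below is a minimal sketch of that check; the name n_unique is only illustrative, and datasets::mtcars is used so the result does not depend on any earlier changes to mtcars in the session.

```r
# Count the distinct values in each column of the built-in mtcars data
n_unique <- vapply(datasets::mtcars, function(x) length(unique(x)), integer(1))

# Sorted counts make the low-cardinality columns easy to spot
sort(n_unique)

# Columns at or below the cutoff of 5 unique values
names(n_unique)[n_unique <= 5]
# "cyl" "vs" "am" "gear"
```

A cutoff of 5 is a judgment call for this dataset; carb, for instance, has 6 distinct values and would only be picked up with a slightly higher threshold.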

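# Flag the columns that have 5 or fewer unique values
toFactor <- sapply(mtcars, function(x) length(unique(x)) <= 5)

# Convert the flagged columns to factors
# (list-style indexing keeps a data frame even if only one column is flagged)
mtcars[toFactor] <- lapply(mtcars[toFactor], as.factor)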

And let’s check the classes one more time:

sapply(mtcars, class)

col_types <- sapply(mtcars, class)
table(col_types)

factor numeric
4 7

It looks like there were 4 columns with ≤5 unique values (cyl, vs, am and gear), which have now been converted to factors.
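If this conversion comes up often, the same steps can be wrapped in a small function. What follows is only a sketch: factorize_low_cardinality and its max_unique argument are hypothetical names, not part of any package, and the is.numeric() check is an added assumption so that character or already-converted columns are left alone.

```r
# Hypothetical helper: convert numeric columns with at most `max_unique`
# distinct values to factors; every other column is returned unchanged.
factorize_low_cardinality <- function(df, max_unique = 5) {
  to_factor <- vapply(
    df,
    function(x) is.numeric(x) && length(unique(x)) <= max_unique,
    logical(1)
  )
  df[to_factor] <- lapply(df[to_factor], as.factor)
  df
}

# Works on a copy, so the original data frame is not modified
cars2 <- factorize_low_cardinality(datasets::mtcars)
table(vapply(cars2, class, character(1)))
```

Returning a new data frame rather than overwriting the input makes it easier to compare the result with the raw data and to try a different cutoff, for example max_unique = 6.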

A pretty handy method for busy days!

Written by infoart.ca

Center for Social Capital & Environmental Research | Posts by Bishwajit Ghose, BI consultant and lecturer at the University of Ottawa
