Recoding position-specific values in a df in R

infoart.ca
3 min read · Jun 16, 2022


Sometimes a dataset contains a value that is safer to recode by restricting the operation to one particular position (cell).

The advantage of this method is that it leaves the rest of the data frame untouched, which matters especially when the same value appears in several rows or columns.

Let’s have a look at the following dataset where the value 5.1 appears 3 times at positions (4,1), (5,1) and (15,3):

> iris
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1           5.0         3.5          1.3         0.3     setosa
2           4.4         3.0          1.3         0.2     setosa
3           4.8         3.4          1.9         0.2     setosa
4           5.1         3.5          1.4         0.3     setosa
5           5.1         3.8          1.9         0.4     setosa
6           6.3         2.3          4.4         1.3 versicolor
7           6.1         2.9          4.7         1.4 versicolor
8           5.5         2.3          4.0         1.3 versicolor
9           6.6         3.0          4.4         1.4 versicolor
10          5.7         3.0          4.2         1.2 versicolor
11          6.7         3.0          5.2         2.3  virginica
12          6.3         3.4          5.6         2.4  virginica
13          7.4         2.8          6.1         1.9  virginica
14          7.7         3.0          6.1         2.3  virginica
15          6.9         3.1          5.1         2.3  virginica

# check frequency of 5.1
> length(which(iris == 5.1))
[1] 3
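To see where those matches sit, not just how many there are, `which()` can return row/column pairs through its `arr.ind` argument. The sketch below rebuilds the numeric part of the sample under the hypothetical name `dat` (the Species and Petal.Width columns are left out for brevity), so it is an illustration rather than the exact object used in this post.

```r
# Rebuild three numeric columns of the 15-row sample shown above.
# `dat` is a hypothetical name used only for this illustration.
dat <- data.frame(
  Sepal.Length = c(5.0, 4.4, 4.8, 5.1, 5.1, 6.3, 6.1, 5.5, 6.6, 5.7,
                   6.7, 6.3, 7.4, 7.7, 6.9),
  Sepal.Width  = c(3.5, 3.0, 3.4, 3.5, 3.8, 2.3, 2.9, 2.3, 3.0, 3.0,
                   3.0, 3.4, 2.8, 3.0, 3.1),
  Petal.Length = c(1.3, 1.3, 1.9, 1.4, 1.9, 4.4, 4.7, 4.0, 4.4, 4.2,
                   5.2, 5.6, 6.1, 6.1, 5.1)
)

# dat == 5.1 is a logical matrix; arr.ind = TRUE turns each TRUE
# into a (row, col) pair instead of a single linear index.
pos <- which(dat == 5.1, arr.ind = TRUE)
print(pos)  # matches at (4,1), (5,1) and (15,3)
```

Knowing the exact positions up front makes it easier to decide whether a value-based or a position-based replacement is the safer choice.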

Suppose we want to change only the value at position (5,1) from 5.1 to 5.5.

The usual approach is to replace by value, which is safe only when the value occurs exactly once in the specified column:

iris$Sepal.Length[iris$Sepal.Length==5.1] <- 5.5

The problem is that this replaces every occurrence of 5.1 in that column, so rows 4 and 5 both become 5.5:

> iris
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1           5.0         3.5          1.3         0.3     setosa
2           4.4         3.0          1.3         0.2     setosa
3           4.8         3.4          1.9         0.2     setosa
4           5.5         3.5          1.4         0.3     setosa
5           5.5         3.8          1.9         0.4     setosa
6           6.3         2.3          4.4         1.3 versicolor
7           6.1         2.9          4.7         1.4 versicolor
8           5.5         2.3          4.0         1.3 versicolor
9           6.6         3.0          4.4         1.4 versicolor
10          5.7         3.0          4.2         1.2 versicolor
11          6.7         3.0          5.2         2.3  virginica
12          6.3         3.4          5.6         2.4  virginica
13          7.4         2.8          6.1         1.9  virginica
14          7.7         3.0          6.1         2.3  virginica
15          6.9         3.1          5.1         2.3  virginica
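One way to avoid this surprise is to count the matches before replacing by value. The following is only a sketch: it uses a plain vector named `sepal_length` (a hypothetical name) holding the first five Sepal.Length values from the sample, and it refuses to replace when the value is not unique.

```r
# First five Sepal.Length values from the sample above
sepal_length <- c(5.0, 4.4, 4.8, 5.1, 5.1)

# Row positions where the target value occurs
hits <- which(sepal_length == 5.1)

if (length(hits) == 1) {
  # Unique match: replacing by value touches exactly one cell
  sepal_length[hits] <- 5.5
} else {
  # Zero or several matches: leave the data alone and report the rows
  message("5.1 found at rows: ", paste(hits, collapse = ", "))
}
```

Here the value occurs twice, so nothing is changed and the matching rows are reported instead.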

A neat way to change only the cell at position (5, 1) is to index it by row number and column name:

> iris[5, "Sepal.Length"] <- 5.5
> iris
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1           5.0         3.5          1.3         0.3     setosa
2           4.4         3.0          1.3         0.2     setosa
3           4.8         3.4          1.9         0.2     setosa
4           5.1         3.5          1.4         0.3     setosa
5           5.5         3.8          1.9         0.4     setosa
6           6.3         2.3          4.4         1.3 versicolor
7           6.1         2.9          4.7         1.4 versicolor
8           5.5         2.3          4.0         1.3 versicolor
9           6.6         3.0          4.4         1.4 versicolor
10          5.7         3.0          4.2         1.2 versicolor
11          6.7         3.0          5.2         2.3  virginica
12          6.3         3.4          5.6         2.4  virginica
13          7.4         2.8          6.1         1.9  virginica
14          7.7         3.0          6.1         2.3  virginica
15          6.9         3.1          5.1         2.3  virginica

Or by column index:

> iris[5, 1] <- 5.5
> iris
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1           5.0         3.5          1.3         0.3     setosa
2           4.4         3.0          1.3         0.2     setosa
3           4.8         3.4          1.9         0.2     setosa
4           5.1         3.5          1.4         0.3     setosa
5           5.5         3.8          1.9         0.4     setosa
6           6.3         2.3          4.4         1.3 versicolor
7           6.1         2.9          4.7         1.4 versicolor
8           5.5         2.3          4.0         1.3 versicolor
9           6.6         3.0          4.4         1.4 versicolor
10          5.7         3.0          4.2         1.2 versicolor
11          6.7         3.0          5.2         2.3  virginica
12          6.3         3.4          5.6         2.4  virginica
13          7.4         2.8          6.1         1.9  virginica
14          7.7         3.0          6.1         2.3  virginica
15          6.9         3.1          5.1         2.3  virginica
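A quick way to confirm that a position-based edit touched only one cell is to compare a copy taken before the change with the result. The sketch below uses the hypothetical names `before` and `after` and only three numeric columns of the sample, so treat it as an illustration of the check rather than part of the original workflow.

```r
# Three numeric columns of the 15-row sample; `before` is a hypothetical name
before <- data.frame(
  Sepal.Length = c(5.0, 4.4, 4.8, 5.1, 5.1, 6.3, 6.1, 5.5, 6.6, 5.7,
                   6.7, 6.3, 7.4, 7.7, 6.9),
  Sepal.Width  = c(3.5, 3.0, 3.4, 3.5, 3.8, 2.3, 2.9, 2.3, 3.0, 3.0,
                   3.0, 3.4, 2.8, 3.0, 3.1),
  Petal.Length = c(1.3, 1.3, 1.9, 1.4, 1.9, 4.4, 4.7, 4.0, 4.4, 4.2,
                   5.2, 5.6, 6.1, 6.1, 5.1)
)

# Work on a copy so the original stays available for comparison
after <- before
after[5, "Sepal.Length"] <- 5.5

# Cells that differ between the two versions, as (row, col) pairs
changed <- which(before != after, arr.ind = TRUE)
print(changed)  # a single pair: row 5, column 1
```

The other two occurrences of 5.1, at (4,1) and (15,3), are left as they were.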

For multiple rows:

iris[c(6, 7), "Sepal.Length"] <- 10
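The same indexing also accepts one replacement value per row, matched in order. In this sketch the data frame name `dat` and the new values 6.5 and 6.0 are made up for illustration; only the Sepal.Length values for rows 6 to 8 come from the sample.

```r
# Rows 6 to 8 of the sample's Sepal.Length column; `dat` is a hypothetical name
dat <- data.frame(Sepal.Length = c(6.3, 6.1, 5.5))

# One value per selected row: the first row gets 6.5, the second gets 6.0
# (both replacement values are invented for this example)
dat[c(1, 2), "Sepal.Length"] <- c(6.5, 6.0)

print(dat)
```

The third row is not listed in the row index, so it keeps its original value.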

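For strings, wrapping a quoted value in as.character() has no effect, because it is already a character string. What matters is the type of the target column: if it is a factor (as Species is in the built-in iris data), assigning a value that is not an existing level produces NA with a warning. Convert the column to character first:

iris$Species <- as.character(iris$Species)
iris[2, 5] <- 'violet'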
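If the column should stay a factor, one option is to add the new level before assigning. This is a minimal sketch that assumes Species is stored as a factor, as it is in the built-in iris data; the small data frame `dat` is a hypothetical stand-in.

```r
# A small stand-in with a factor column; `dat` is a hypothetical name
dat <- data.frame(Species = factor(c("setosa", "setosa", "versicolor")))

# Without this step, assigning "violet" would produce NA with a warning,
# because "violet" is not among the existing levels
levels(dat$Species) <- c(levels(dat$Species), "violet")

# Now the position-specific assignment works and the column stays a factor
dat[2, "Species"] <- "violet"

print(dat)
```

This keeps the factor structure intact, which can matter if later code relies on the levels.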


Written by infoart.ca

Center for Social Capital & Environmental Research | Posts by Bishwajit Ghose, BI consultant and lecturer at the University of Ottawa
