I have two data frames: df_a with two columns, colA and colB, and df_b with one column, colA.
df_a <- data.frame(colA = sample(1:10, 10), colB = sample(LETTERS[1:20],10))
> df_a
   colA colB
1     2    F
2     8    J
3     5    G
4     9    A
5    10    R
6     4    N
7     7    D
8     1    B
9     3    Q
10    6    H
df_b <- data.frame(colA = sample(1:10, 10))
> df_b
   colA
1     9
2     5
3     3
4     7
5     1
6     8
7     2
8     4
9    10
10    6
I need to create a new column colB in df_b by matching each colA value in df_b against colA in df_a and taking the corresponding colB value from df_a. This is what I tried:
> df_b$colB <- df_a[df_a$colA %in% df_b$colA,'colB']
> df_b
   colA colB
1     9    F
2     5    J
3     3    G
4     7    A
5     1    R
6     8    N
7     2    D
8     4    B
9    10    Q
10    6    H
The corresponding values in the two data frames do not match. For example, in df_a the colA value 9 has A in colB, whereas in df_b the colA value 9 has F in colB. Is this because the data frames are not sorted the same way?
Note: I couldn't find a similar question, though this may well be a duplicate. I would like to understand the root cause of the error.
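To make the mismatch reproducible, here is a minimal sketch that hard-codes the values printed above instead of calling sample(). It suggests that `%in%` only yields one TRUE/FALSE per row of df_a, in df_a's order, so the subset comes back in df_a's order rather than df_b's. `match()` is included for comparison as one possible position-based lookup; that it gives the intended result is an assumption checked only on this example.

```r
# Values copied from the printed output above, fixed so the result is reproducible
df_a <- data.frame(colA = c(2, 8, 5, 9, 10, 4, 7, 1, 3, 6),
                   colB = c("F", "J", "G", "A", "R", "N", "D", "B", "Q", "H"),
                   stringsAsFactors = FALSE)
df_b <- data.frame(colA = c(9, 5, 3, 7, 1, 8, 2, 4, 10, 6))

# %in% answers "is each df_a$colA present in df_b$colA?":
# one logical value per row of df_a, in df_a's row order.
# Every value is present here, so the vector is all TRUE.
keep <- df_a$colA %in% df_b$colA

# Subsetting with an all-TRUE vector returns colB unchanged, in df_a's order
in_result <- df_a[keep, "colB"]

# match() returns, for each df_b$colA, the row position of that value in df_a
idx <- match(df_b$colA, df_a$colA)
match_result <- df_a$colB[idx]
```

In this example in_result is simply df_a$colB in its original order, which is what ended up in df_b, while match_result follows df_b's row order.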
The original task was to populate values to replace the NA entries in df_b:
df_a <- data.frame(colA = sample(1:10, 10), colB = sample(LETTERS[1:10],10))
df_b <- data.frame(colA = sample(1:10, 10), colB = sample(c(LETTERS[1:10], NA), 10))
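A minimal sketch of that original task, assuming the missing entries in df_b$colB are real NA values rather than the string 'NA', and that every colA value in df_b also occurs in df_a. The small data frames below are made up for illustration and are not the sampled data.

```r
# Hypothetical fixed example data (not the sampled data frames)
df_a <- data.frame(colA = 1:4, colB = c("A", "B", "C", "D"),
                   stringsAsFactors = FALSE)
df_b <- data.frame(colA = c(3, 1, 4, 2), colB = c("C", NA, NA, "B"),
                   stringsAsFactors = FALSE)

# Rows of df_b whose colB is missing
missing_rows <- is.na(df_b$colB)

# Look up the replacement in df_a by colA, for the missing rows only;
# existing non-NA values in df_b$colB are left untouched
df_b$colB[missing_rows] <- df_a$colB[match(df_b$colA[missing_rows], df_a$colA)]
```

If some colA value in df_b had no counterpart in df_a, match() would return NA for it and that entry would stay NA.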