My question is similar to this one. Let's say I have two dataframes as shown below:
set.seed(5)
df1 <- data.frame(date=as.Date(c('2001-01-01', '2001-02-01', '2001-03-01', '2001-04-01', '2001-05-01')),
val1=rnorm(5), val2=c(rnorm(3), NA, NA), val3=c(rnorm(3), NA, NA))
df2 <- data.frame(date=as.Date(c('2001-03-01', '2001-04-01', '2001-05-01')),
val2=rnorm(3), val3=rnorm(3))
df1
date val1 val2 val3
1 2001-01-01 -0.84085548 -0.6029080 -0.2857736
2 2001-02-01 1.38435934 -0.4721664 0.1381082
3 2001-03-01 -1.25549186 -0.6353713 1.2276303
4 2001-04-01 0.07014277 NA NA
5 2001-05-01 1.71144087 NA NA
df2
date val2 val3
1 2001-03-01 -0.8017795 -1.0717600
2 2001-04-01 -1.0803926 -0.1389861
3 2001-05-01 -0.1575344 -0.5973131
df1 has some missing values in the columns val2 and val3. I want to replace only the missing values in those columns with correct match, match according to date from df2. From the question that I linked to, here's a potential solution using data.table:
library(data.table)
setDT(df1)
setDT(df2)
df1[i = df2, val2 := i.val2, on = .(date)]
df1
date val1 val2 val3
1: 2001-01-01 -0.84085548 -0.6029080 -0.2857736
2: 2001-02-01 1.38435934 -0.4721664 0.1381082
3: 2001-03-01 -1.25549186 -0.8017795 1.2276303
4: 2001-04-01 0.07014277 -1.0803926 NA
5: 2001-05-01 1.71144087 -0.1575344 NA
The problem here is that it has replaced the 3rd value of val2 in df1 with the matching value in df2, even though the 3rd value in df1 was not originally missing. Also, how can I fill all missing values in df1 with their matching values in df2 instead of doing it one column at a time?