I'm trying to aggregate data from multiple sources.
In the example below, I'd like to create a new data frame called RESULT. I need to replace the values from the cells in dt2 with the values from SCORE in dt1 based the ID and TASK variables.
dt1 <- read.table(text = "
ID TASK SCORE
1 task1 0
1 task2 0
2 task1 1
2 task2 0
3 task1 1
3 task2 0
4 task1 0
4 task2 1
5 task1 1
5 task2 0
6 task1 0
6 task2 1
", header = TRUE, stringsAsFactors = FALSE)
dt2 <- read.table(text = "
ID THING1 THING2 THING3 THING4
1 NA task2 NA task1
2 NA NA NA NA
3 task1 NA task2 NA
4 NA task1 NA task2
5 task2 NA task1 NA
6 NA task1 task2 NA
", header = TRUE, stringsAsFactors = FALSE)
RESULT <- read.table(text = "
ID THING1 THING2 THING3 THING4
1 NA 0 NA 0
2 NA NA NA NA
3 1 NA 0 NA
4 NA 0 NA 1
5 0 NA 1 NA
6 NA 0 1 NA
", header = TRUE, stringsAsFactors = FALSE)
For example, in dt1, we can see that ID = 1 received a score of 0 on task1.
From dt2, we can see that ID = 1 did THING4 for their first TASK.
As shown in the RESULT dataframe, ID = 1 now has a value of 0 for THING2.
I am not quite sure where to start. I've read through several threads but none of them (that I can find) solve this exact problem (for example here and here) and I'm not quite sure how to modify these examples to get the result I need because this kind of programming is very new to me. So I'd really appreciate any suggestions!