I have a data.frame ids with an index column of person ID number.
ID
PA1
PA2
PA3
PA4
PA5 etc
I want to extract the age and sex of each person from 2 other data.frames, df1 and df2which have the details (with different headers), each contains only part of the total sample population, though there maybe some overlap, but also ids contains more rows than both df1 and df2 combines:
df1
ID Sex Age Mode
PA1 M 34 HD
PA3 F 45 PD
and df2
ID Gender Age
PA4 M 67
PA3 F 45
PA2 M 65
In total there are about 1700 rows of data. I want to get the Sex or Gender and Age from df1 and df2 and put them with the correct ID in new columns in ids data.frame, so I get this:
ID Age Sex
PA1 34 M
PA2 65 M
PA3 etc...
PA4
PA5
df1 and df2 both contain other columns of data which I don't need in ids so I don't think I want to use left_join or merge.
I was considering doing something like this with dlplyr:
ids = ids%>%
mutate("Sex" = case_when(df1$Gender != "" ~df1$Gender, TRUE ~"" ))
and if blank to look in df2
but realise that won't match the values with the correct IDs. The IDs are not currently in the same order in the different data.frames