I have a dataframe like the following:
ID COL01_A COL01_B COL02_A COL02_B ... COL12_A COL12_B
1 01 19990101 03 20000101 ... FF ""
2 03 20170810 FA 20120303 ... "" ""
3 GG 19940508 DD 20000101 ... 03 20060808
4 03 20180101 09 20000101 ... "" ""
5 GF 20171212 03 19990101 ... 02 20190101
The values in the columns type A dictate wether the value in column type B is the one i'm looking for. In this case the interest is for values "03". There are twelve pairs of this columns. As seen in the example from COL01_A/COL01_B to COL12_A/COL12_B
I was looking for a way to generate a new column (lets call it COL_X) where the value of column type B is reflected only if it's twin column of type A has the "03" value. For the example presented above, the desired result would be something like this.
ID COL01_A COL01_B COL02_A COL02_B ... COL12_A COL12_B COL_X
1 01 19990101 03 20000101 ... FF "" 20000101
2 03 20170810 FA 20120303 ... "" "" 20170810
3 GG 19940508 DD 20000101 ... 03 20060808 20060808
4 03 20180101 09 20000101 ... "" "" 20180101
5 GF 20171212 03 19990101 ... 02 20190101 19990101
Right now i've solved my problem using a brutally long nested ifelse statement, which is not exactly readable nor is it a good practice (in my opinion). In terms of efficiency, it's fast, but i suppose it's only because the data is not too massive. I also found another solution using do.call(pmax(...)), But this solution requires me to clean the data frame (using ifelse statements) and creating an auxiliary dataframe with all the other information per row.
Is there a way to accomplish this in the least lines of code possible and/or not using auxiliary structures? If the solution uses data.table or dplyr it would be great.
Basic reproducible example:
ID <- c(1,2,3,4,5)
DATA <- c('xxx', 'yyy', 'zzz','xyz','zxy')
COL01_A<- c('01','03','GG','03','GF')
COL01_B<- c('19990101','20170810','19940508','20180101','20171212')
COL02_A<- c('03','FA','DD','09','03')
COL02_B<- c('20000101','20120303','20000101','20000101','19990101')
COL03_A<- c('FF','','03','','02')
COL03_B<- c('','','20060808','','20190101')
df <- data.frame(ID, DATA, COL01_A,COL01_B,COL02_A,COL02_B,COL03_A,COL03_B)
if there are several "03" values the COL_X should have ""