I have two data frames: df1 and df2. They both have four columns; three with the same names ID, Year and Week and one that are different from each other.
>df1
ID Year Oxygen Week
---- ------ ------- -------
1 2004 18 1
1 2005 17 1
2 2006 17 1
2 2007 18 1
3 2008 19 1
3 2010 20 1
3 2010 20 1
4 2012 16 1
5 2013 18 1
6 2014 18 1
>df2
ID Year Kg Week
---- ------ ----- -------
1 2004 20 1
1 2005 35 2
2 2006 30 2
3 2007 15 1
3 2008 70 2
4 2009 40 1
5 2013 55 1
6 2012 40 1
6 2014 10 2
7 2013 15 1
I would like to produce a new data frame which contains the rows from df1 only when the combination of ID and Year in df1 also are present in df2. The Week might be the same or not for that row, but I don't want to take the column Week into account. So the first row in df1 has 1 for ID and 2004 for Year which also occurs in df2. The combination of ID and Year for the second row in df1 does also occur in df2but have different value for Week.
I know how to do it if it only depends on one column:
df3 <- subset(df1, ID %in% df2$ID)
There was a solution for this when I didn't have the column Week which was:
df3 <- df1 %>% inner_join(df2)
But I don't know how to make it depend on both the ID and Year at the same time without it also takes Week into account.
I should end up with the following data frame, which only contain the columns from df1:
>df3
ID Year Oxygen Week
---- ------ ------- -------
1 2004 18 1
1 2005 17 1
2 2006 17 1
3 2008 19 1
4 2012 16 1
5 2013 18 1
6 2014 18 1