I have two DataFrames:
df_1
   A1  B1  C1  D1
0   4  64  57  51
1   7   1  14  69
2  47  21  56  47
3  49  52  36  87
4  74  39   7  54
5  96  16  32  44
df_2
   A2  B2  C2  D2  E2
0   5  64  87   5  14
1  56  47  68  67  16
2   7   1  14  21  98
3  47  21  56  23  45
I want to compare the values in columns A2 and B2 of df_2 with columns A1 and B1 of df_1. Where a row of df_2 matches a row of df_1 on both columns, I want to add a new column D1 to df_2 holding the D1 value from the matching row of df_1. If a row of df_2 does not match any row in df_1, the new column should contain NaN.
Here, the values of A2 and B2 in rows 2 and 3 of df_2 correspond to the values of A1 and B1 in rows 1 and 2 of df_1.
Expected Output:
df_2
   A2  B2  C2  D2  E2   D1
0   5  64  87   5  14  NaN
1  56  47  68  67  16  NaN
2   7   1  14  21  98   69
3  47  21  56  23  45   47
I'm working with a dataset of 600,000 rows. Is there a way to do this other than looping over df_2 and querying df_1 for each row?
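One vectorized option that might fit is a left merge on the two key columns. The sketch below rebuilds the sample data from above and assumes that each (A1, B1) pair should contribute at most one D1 value. The names `lookup` and `result` are only illustrative, and `drop_duplicates` is there to enforce that assumption by keeping the first occurrence of each pair.

```python
import pandas as pd

# Sample data copied from the tables above
df_1 = pd.DataFrame({
    "A1": [4, 7, 47, 49, 74, 96],
    "B1": [64, 1, 21, 52, 39, 16],
    "C1": [57, 14, 56, 36, 7, 32],
    "D1": [51, 69, 47, 87, 54, 44],
})
df_2 = pd.DataFrame({
    "A2": [5, 56, 7, 47],
    "B2": [64, 47, 1, 21],
    "C2": [87, 68, 14, 56],
    "D2": [5, 67, 21, 23],
    "E2": [14, 16, 98, 45],
})

# Keep only the key columns and the value to bring over.
# Assumption: if a (A1, B1) pair is repeated, the first row wins;
# without this step a left merge would duplicate rows of df_2.
lookup = df_1[["A1", "B1", "D1"]].drop_duplicates(subset=["A1", "B1"])

# Left merge keeps every row of df_2; unmatched rows get NaN in D1.
result = (
    df_2.merge(lookup, how="left", left_on=["A2", "B2"], right_on=["A1", "B1"])
        .drop(columns=["A1", "B1"])  # drop the helper key columns from df_1
)

print(result)
```

Because the unmatched rows hold NaN, the new D1 column comes back as float rather than integer. The merge also returns a fresh 0..n-1 index, which matches df_2 here only because df_2 uses the default index.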