Is it possible to pd.merge() a df with a csv when the df column holds a list (which may contain more than one value) and the csv column holds only a single value per row?
df
   GV2015_VAL  polName
0  605000.0    [LENTEGEUR]
1  NaN         [DURBANVILLE]
2  NaN         [DURBANVILLE]
3  730000.0    [BISHOP LAVIS, GUGULETHU, MANENBERG]
4  625000.0    [LENTEGEUR]
csv
   name          m     p       j
0  LENTEGEUR     17.0  501.0   518.0
1  DURBANVILLE   10.0  495.0   505.0
2  BELHAR        9.0   352.0   361.0
3  MANENBERG     29.0  1013.0  1042.0
4  GUGULETHU     1.0   192.0   193.0
5  BISHOP LAVIS  10.0  495.0   505.0
The name column of csv should match the entries in polName.
Furthermore, the j column of csv should be aggregated (mean) when a list contains more than one name, so that the output for df would be:
df
   GV2015_VAL  polName                               merge_j
0  605000.0    [LENTEGEUR]                           518.0
1  NaN         [DURBANVILLE]                         505.0
2  NaN         [DURBANVILLE]                         505.0
3  730000.0    [BISHOP LAVIS, GUGULETHU, MANENBERG]  580.0
4  625000.0    [LENTEGEUR]                           518.0
Can the built-in .merge() handle this on its own, or will looping / a list comprehension be necessary?
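
For reference, here is a minimal sketch of one way this might be done without an explicit loop. It assumes polName holds real Python lists (not their string representation), that the csv has already been read into a DataFrame called csv (rebuilt by hand here, in practice via pd.read_csv), and that pandas is 0.25 or newer so that Series.explode() is available. The intermediate names exploded and merged are only illustrative. The idea is to explode the lists to one row per name, merge on that, and then average j per original row:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the tables above.
df = pd.DataFrame({
    "GV2015_VAL": [605000.0, np.nan, np.nan, 730000.0, 625000.0],
    "polName": [
        ["LENTEGEUR"],
        ["DURBANVILLE"],
        ["DURBANVILLE"],
        ["BISHOP LAVIS", "GUGULETHU", "MANENBERG"],
        ["LENTEGEUR"],
    ],
})
csv = pd.DataFrame({
    "name": ["LENTEGEUR", "DURBANVILLE", "BELHAR",
             "MANENBERG", "GUGULETHU", "BISHOP LAVIS"],
    "m": [17.0, 10.0, 9.0, 29.0, 1.0, 10.0],
    "p": [501.0, 495.0, 352.0, 1013.0, 192.0, 495.0],
    "j": [518.0, 505.0, 361.0, 1042.0, 193.0, 505.0],
})

# One row per (original row, name); the original row label ends up
# in a column called "index" after reset_index().
exploded = df["polName"].explode().rename("name").reset_index()

# An ordinary left merge on the now-scalar name column.
merged = exploded.merge(csv[["name", "j"]], on="name", how="left")

# Average j per original row and align the result back onto df by index.
df["merge_j"] = merged.groupby("index")["j"].mean()

print(df)
```

With how="left", a name that is missing from csv contributes NaN, which mean() skips by default; whether that is the desired behaviour depends on the data.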