PySpark Pivoting a dataframe

Question

I have an input DataFrame like the one below

which I want to transform into the output below.

Can anyone please help me with how to implement this using PySpark DataFrames?

I tried several approaches but could not find an optimal way to do this.

https://idownvotedbecau.se/noresearch/ – cruzlorite, Aug 29 '23 at 20:16

score 0 · Answer 1 · answered Aug 29 '23 at 16:25


Group by the common columns and collect the values of the column that differs within each group into a list.

import pyspark.sql.functions as F

# collect_list must be referenced through the F alias, since only the module is imported
ans_df = df.groupBy(F.col('HCP ID'), F.col('TERR ID')).agg(F.collect_list(F.col('PRODUCT')).alias("LINEUP"))


user238607
