This is an example of my data
| ID | purchaseDate | numOfItemsPurchased |
|---|---|---|
| 12 | 12-10-2023 | 2 |
| 12 | 12-01-2023 | 34 |
| 56 | 24-03-2020 | 12 |
| 23 | 23-12-2012 | 1 |
| 23 | 23-05-2012 | 3 |
| 23 | 23-06-2012 | 4 |
| 24 | 12-10-2023 | 24 |
| 38 | 23-02-2012 | 21 |
| 16 | 12-10-2023 | 34 |
| 54 | 02-09-2020 | |
| 54 | 24-03-2020 | 19 |
I want to extract the highest and lowest purchaseNo for each 'ID' and the dates they were made. the expected result should look like:
| ID | maxPurchaseDate | maxPurchaseAmount | leastPurchaseDate | leastPurchaseAmount |
|---|---|---|---|---|
| 12 | 12-01-2023 | 34 | 12-10-2023 | 2 |
| 56 | 24-03-2020 | 12 | 24-03-2020 | 12 |
| 23 | 23-06-2012 | 4 | 23-12-2012 | 1 |
| 24 | 12-10-2023 | 24 | 12-10-2023 | 24 |
| 38 | 23-02-2012 | 21 | 23-02-2012 | 21 |
| 16 | 12-10-2023 | 34 | 12-10-2023 | 34 |
| 54 | 24-03-2020 | 19 | 02-09-2020 | 11 |
Please how do I get this done?
I tried this code:
details = data.groupby('ID').min()['purchaseDate']
But I get a KeyError: 'purchaseDate'. I have double checked to be sure the column was not misspelled. and I have made sure to fillna(0)
I also tried doing it this way:
details = data.groupby('ID').min()['numOfItemsPurchased']
It doesn't throw an error but I don't know how to add more columns to it
I have tried this code:
data.groupby(['ID']).agg({'purchaseDate': [np.min,np.max],
'numOfItemsPurchased' : [np.min,np.max]})
then I get a TypeError: '<=' not supported between instances of 'int' and 'str'