I have the following data frame:
| user_id | value |
|---|---|
| 1 | 5 |
| 1 | 7 |
| 1 | 11 |
| 1 | 15 |
| 1 | 35 |
| 2 | 8 |
| 2 | 9 |
| 2 | 14 |
I want to drop every row that does not hold the maximum value for its user_id,
resulting in a two-row data frame:
| user_id | value |
|---|---|
| 1 | 35 |
| 2 | 14 |
How can I do that?
You can group by user_id and then take the maximum of each group with the groupby max method (pandas' DataFrameGroupBy.max).
Assuming your original dataframe is named df, try the code below:
out = df.groupby('user_id', as_index=False)['value'].max()
>>> print(out)
If you want to group by more than one column (for example a second column named sex), use this:
out = df.groupby(['user_id', 'sex'], as_index=False, sort=False)['value'].max()
>>> print(out)
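
For reference, here is a minimal, self-contained sketch of the approach on the question's data. The frame is rebuilt by hand from the table in the question. The second part, using idxmax, is an alternative that is not part of the answer above: it assumes you want to keep the original rows (with any other columns they carry) rather than an aggregated result.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 1, 2, 2, 2],
    "value": [5, 7, 11, 15, 35, 8, 9, 14],
})

# Aggregate: one row per user_id holding that user's maximum value.
out = df.groupby("user_id", as_index=False)["value"].max()
print(out)

# Alternative (an assumption about what may be wanted, not from the answer):
# idxmax returns the index label of each group's maximum, and df.loc then
# selects those original rows with all of their columns intact.
rows = df.loc[df.groupby("user_id")["value"].idxmax()]
print(rows)
```

Both variants give one row per user_id. The difference is that out is a fresh aggregated frame with a new index, while rows keeps the original index labels. If a user has several rows tied for the maximum, idxmax keeps only the first of them.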