I have a data frame called df that looks like this:
c1 c2
A 1
A 2
A 3
B 1
I want to find all rows where c1 has duplicate values, and keep only the row with the highest c2 value.
The result would look like this:
c1 c2
A 3
B 1
Since you only have the two columns in your data frame, the function aggregate can do this pretty easily:
aggregate(c2 ~ c1, data = df, FUN = max)
aggregate applies the function FUN to the variable on the left-hand side of the formula (here c2), separately within each group defined by the variable on the right-hand side (here c1).
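For anyone who wants to try this, here is a minimal self-contained sketch. It rebuilds the example data from the question and runs the same aggregate call. The names result, ordered and top_rows are just illustrative. The second half is an assumption on my part about what you might need if df had more columns: aggregate would only return the grouping column and the maximum, so sorting and dropping duplicates is one way to keep the whole row instead.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  c1 = c("A", "A", "A", "B"),
  c2 = c(1, 2, 3, 1)
)

# Maximum of c2 within each c1 group
result <- aggregate(c2 ~ c1, data = df, FUN = max)
print(result)
#   c1 c2
# 1  A  3
# 2  B  1

# Alternative sketch (not from the answer above): keep the entire row
# with the highest c2 per c1, which matters once df has extra columns.
# Sort by c1, then by c2 descending, and keep the first row of each group.
ordered <- df[order(df$c1, -df$c2), ]
top_rows <- ordered[!duplicated(ordered$c1), ]
```

Note that if two rows in a group tie for the highest c2, the second approach keeps only the first of them after sorting.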