I am new to this forum, so apologies for any mistakes. I have a data frame (a classification of substances into classes) in the following format:
|   | A | B | C | D |
|---|---|---|---|---|
| 1 | Organic compounds | Benzenoids | Benzene | NA |
| 2 | Organic compounds | Benzenoids | Benzene | NA |
| 3 | Organic compounds | Organic oxygen compounds | NA | NA |
| 4 | NA | NA | NA | NA |
| 5 | Organic compounds | Benzenoids | NA | NA |
In the end I need a data frame with two columns. The result should look something like this:
| class | count |
|---|---|
| Organic compounds; Benzenoids; Benzene | 2 |
| Organic compounds; Organic oxygen compounds | 1 |
| Organic compounds; Benzenoids | 1 |
What should my first step be? I tried to create a new column by pasting together the contents of all the other columns, like this:
df$class <- paste(df$A, df$B, df$C, df$D, sep = "; ")
But the result is:
| class |
|---|
| Organic compounds; Benzenoids; Benzene; NA |
| Organic compounds; Benzenoids; Benzene; NA |
| Organic compounds; Organic oxygen compounds; NA; NA |
| NA; NA; NA; NA |
| Organic compounds; Benzenoids; NA; NA |
What would be a reasonable approach to get from here to the final result?
Thanks a lot!
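
One possible approach, as a rough base R sketch rather than a definitive answer: `paste()` converts `NA` to the string "NA", so the idea is to drop the `NA` values row by row before pasting, and then count the resulting strings with `table()`. The example data is rebuilt by hand from the table above. It assumes that columns A to D are character columns, that the `NA` entries are real missing values rather than the text "NA", and that rows containing only `NA` should be left out of the counts.

```r
# Rebuild the example data from the question (columns A to D, all character)
df <- data.frame(
  A = c("Organic compounds", "Organic compounds", "Organic compounds",
        NA, "Organic compounds"),
  B = c("Benzenoids", "Benzenoids", "Organic oxygen compounds",
        NA, "Benzenoids"),
  C = c("Benzene", "Benzene", NA, NA, NA),
  D = NA_character_,
  stringsAsFactors = FALSE
)

# For each row, paste only the non-NA values together.
# Rows that are entirely NA get NA as their class.
cols <- c("A", "B", "C", "D")
df$class <- apply(df[cols], 1, function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_character_ else paste(x, collapse = "; ")
})

# Count how often each class occurs; table() ignores NA by default,
# so the all-NA row does not appear in the result.
counts <- as.data.frame(table(class = df$class), stringsAsFactors = FALSE)
names(counts) <- c("class", "count")

# Most frequent classes first
counts <- counts[order(-counts$count), ]
rownames(counts) <- NULL

print(counts)
```

If the `NA` entries turn out to be the literal text "NA", they would first need to be converted, for example with `df[df == "NA"] <- NA`.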