I have a data frame in which some cells hold several values joined by commas. It looks something like this:
df <- data.frame(
  year = c(2012, 2012, 2012, 2013, 2013, 2013, 2014, 2014, 2014),
  type = c("a,b,c", "d,e,f", "a,c,d,c", "a,a,a", "b", "c,a,d", "g", "a,b,e", "g,h,i")
)
I want to reshape it into something close to what dcast produces: one column for year and one for each of a, b, c, and so on, with each cell holding the number of times that value occurs for that year. I first tried colsplit on df followed by dcast, but that seems to work only if I aggregate on one of the split columns rather than all of them.
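library(reshape2)  # provides colsplit() and dcast()
df2 <- data.frame(df$year, colsplit(df$type, ',', c('v1', 'v2', 'v3', 'v4', 'v5')))
# Casts on v1 only, so values in v2..v5 are not counted
df3 <- dcast(df2, df.year ~ v1)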
This gives counts only for the first split column (v1) rather than for all of them. Am I close to a solution, or should I be using a different approach entirely?
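One possible direction, offered as a minimal sketch and not a definitive answer: split the strings with base R's strsplit, stack the pieces into long form with one row per (year, value) pair, and only then cast. The names parts, long and wide are made up for illustration, and the sketch assumes "frequency" means the count of each value within each year.

```r
library(reshape2)

# Same example data as in the question.
df <- data.frame(
  year = c(2012, 2012, 2012, 2013, 2013, 2013, 2014, 2014, 2014),
  type = c("a,b,c", "d,e,f", "a,c,d,c", "a,a,a", "b", "c,a,d", "g", "a,b,e", "g,h,i"),
  stringsAsFactors = FALSE
)

# Split each comma-joined string into its individual values (a list of vectors).
parts <- strsplit(df$type, ",", fixed = TRUE)

# Long form: repeat each year once per value found in its row.
long <- data.frame(
  year = rep(df$year, lengths(parts)),
  type = unlist(parts),
  stringsAsFactors = FALSE
)

# Wide form: one column per distinct value, cells hold counts per year.
wide <- dcast(long, year ~ type, value.var = "type", fun.aggregate = length)
```

Because every value ends up in a single type column before casting, dcast counts all of them instead of only the first split column. If reshape2 is not required, table(long$year, long$type) in base R should give the same counts as a contingency table.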