I have a data.table DT and I want to run model.matrix on it. Each row has a string ID, stored in the ID column of DT, and my formula for model.matrix excludes that column. The problem is that model.matrix drops some rows because of NAs. If I set the rownames of DT to the ID column before calling model.matrix, the final model matrix has rownames and I'm all set. Otherwise, I can't figure out which rows I end up with. I'm setting the rownames with rownames(DT) = DT$ID. However, when I then try to add a new column to DT, I get a complaint about
"Invalid .internal.selfref detected . . . At an earlier point, this data.table has been copied by R."
So I'm wondering
- Is there a better way to set rownames for a data.table?
- Is there a better approach to solving this problem?
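
For reference, here is a minimal sketch of the setup described above. The toy data and the column names x, y and z are made up for illustration and are not from my real table; only DT, ID, model.matrix and the rownames assignment come from the description.

```r
library(data.table)

# Hypothetical toy data: x, y (and z below) are illustrative names only
DT <- data.table(ID = c("a", "b", "c"),
                 x  = c(1, NA, 3),
                 y  = c(2, 4, 6))

# Set the rownames from the ID column, as described in the question
rownames(DT) = DT$ID

# The formula excludes ID; the row with an NA in x is dropped
# under the default na.action (na.omit)
mm <- model.matrix(~ x + y, data = DT)
rownames(mm)  # used to see which rows survived
nrow(mm)      # one row fewer than nrow(DT)

# Adding a column by reference afterwards is the step where
# the ".internal.selfref" complaint appears for me
DT[, z := x + y]
```

The rownames of mm are what I rely on to map the surviving rows back to their IDs.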