I have a data table that looks something like this:
library(data.table)
set.seed(1)
# Number of rows in the data table
obs <- 10^2
# Generate representative data
DT <- data.table(
  V1 = sample(x = 1:10, size = obs, replace = TRUE),
  V2 = sample(x = 11:20, size = obs, replace = TRUE),
  V3 = sample(x = 21:30, size = obs, replace = TRUE)
)
I also have a vectorized function, fn_calibrate, that calculates an output variable V4 from an input variable opt:
fn_calibrate <- function(opt) {
  # Calculate some new value V4 that's dependent on opt
  DT[, V4 := V1 * sqrt(V2) / opt]
  # Calculate the residual sum of squares (RSS) between V4 and a target value V3
  DT[, rss := abs(V3 - V4)^2]
  # Return the RSS
  return(DT[, rss])
}
Now, I would like to perform a rowwise optimization using the optimize function, i.e. find the value of opt that minimizes the RSS for each row.
I was hoping to achieve that with the data.table by = syntax, such as:
# Run the optimizer rowwise
DT[, opt := optimize(f = fn_calibrate, interval = c(0.1, 1), tol = .0015)$minimum, by = seq_len(nrow(DT))]
This fails with the error invalid function value in 'optimize'. As currently written, fn_calibrate operates on the whole table (DT[, ...]) regardless of the by grouping, so it returns a vector of rss values of length nrow(DT) rather than the single scalar that optimize expects for one row at a time.
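To illustrate the cause, here is a minimal check. It repeats the definitions of DT and fn_calibrate from above so that the snippet is self-contained; the name out is only there for illustration. It shows that a single call to fn_calibrate returns one value per row of DT rather than one value in total:

```r
library(data.table)
set.seed(1)

# Same representative data as above
obs <- 10^2
DT <- data.table(
  V1 = sample(x = 1:10, size = obs, replace = TRUE),
  V2 = sample(x = 11:20, size = obs, replace = TRUE),
  V3 = sample(x = 21:30, size = obs, replace = TRUE)
)

# Same objective as above: it always touches the whole table
fn_calibrate <- function(opt) {
  DT[, V4 := V1 * sqrt(V2) / opt]
  DT[, rss := abs(V3 - V4)^2]
  return(DT[, rss])
}

# Evaluate the objective once at an arbitrary value of opt
out <- fn_calibrate(0.5)

# The result has as many elements as DT has rows, not a single element
length(out) == nrow(DT)
```

Since optimize needs its function to return a single numeric value, a result of length nrow(DT) is rejected.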
My question is: is there a way to have fn_calibrate return rowwise results to the optimizer as well?
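For reference, the kind of thing I have in mind might look like the sketch below. It is not a verified solution: fn_row is a hypothetical helper that I made up for illustration, and it assumes the row's values can be handed to the objective through the ... argument of optimize instead of being looked up in DT. I have not checked how this behaves on a large table.

```r
library(data.table)
set.seed(1)

# Same representative data as above
obs <- 10^2
DT <- data.table(
  V1 = sample(x = 1:10, size = obs, replace = TRUE),
  V2 = sample(x = 11:20, size = obs, replace = TRUE),
  V3 = sample(x = 21:30, size = obs, replace = TRUE)
)

# Hypothetical scalar objective: squared residual for ONE row.
# v1, v2, v3 are that row's values, supplied by the caller.
fn_row <- function(opt, v1, v2, v3) {
  (v3 - v1 * sqrt(v2) / opt)^2
}

# One group per row; within each group V1, V2, V3 have length 1,
# and optimize forwards them to fn_row through its ... argument
DT[, opt := optimize(
  f = fn_row, interval = c(0.1, 1), tol = .0015,
  v1 = V1, v2 = V2, v3 = V3
)$minimum, by = seq_len(nrow(DT))]
```

Here the objective no longer modifies DT, so each call returns a single number for the row being optimized.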
Edit
I realize a related question was asked and answered here in the context of a data frame, but the accepted answer uses a for loop, whereas I would rather use the efficient data.table by syntax if possible. The reprex above is small (100 rows), but my actual data table is larger (250K rows).