I'm writing a function read_list_if whose inputs are:
- a list files_list of files to read
- a function read_func to read each file
- and optionally a function select_func to skip files which don't satisfy a certain boolean condition.
The full code is
read_func <- function(...) {
  read_csv(...,
           col_types = cols(.default = col_integer()),
           col_names = TRUE)
}

read_list_if <- function(files_list, read_func, select_func = NULL, ...) {
  if (is.null(select_func)) {
    read_and_assign <- function(dataset, read_func, ...) {
      dataset_name <- as.name(dataset)
      dataset_name <- read_func(dataset, ...)
      return(dataset_name)
    }
  } else {
    read_and_assign <- function(dataset, read_func, select_func, ...) {
      dataset_name <- as.name(dataset)
      dataset_name <- read_func(dataset, ...)
      if (select_func(dataset_name)) {
        return(dataset_name)
      } else {
        return(NULL)
      }
    }
  }
  # invisible is used to suppress the unneeded output
  output <- invisible(
    sapply(files_list,
           read_and_assign, read_func = read_func,
           select_func = select_func, ...,
           simplify = FALSE, USE.NAMES = TRUE))
}

library(readr)
files <- list.files(pattern = "*.csv")
datasets <- read_list_if(files, read_func)
Save the code in a script (e.g., test.R) in the same directory with at least one .csv file (even an empty one, created with touch foo.csv, will work). If you now source("test.R"), you get the error:
Error in read_csv(..., col_types = cols(.default = col_integer()), col_names = TRUE) :
unused argument (select_func = NULL)
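For what it's worth, the error does not seem to depend on readr. Below is a minimal sketch that uses base R only. The names toy_reader, toy_read_func and toy_read_and_assign are hypothetical stand-ins I made up for read_csv, read_func and the NULL-branch read_and_assign; toy_reader is assumed to accept a path and nothing else.

```r
# Hypothetical stand-in for read_csv: takes a path and nothing else.
toy_reader <- function(path) {
  toupper(path)
}

# Same shape as read_func above: forwards everything in `...`.
toy_read_func <- function(...) {
  toy_reader(...)
}

# Same shape as the NULL branch of read_list_if: there is no
# `select_func` parameter, so a `select_func = ...` argument
# ends up inside `...` and is forwarded to the reader.
toy_read_and_assign <- function(dataset, read_func, ...) {
  read_func(dataset, ...)
}

# Catch the error and keep only its message so the script keeps running.
result <- tryCatch(
  sapply(c("a.csv", "b.csv"), toy_read_and_assign,
         read_func = toy_read_func, select_func = NULL,
         simplify = FALSE, USE.NAMES = TRUE),
  error = function(e) conditionMessage(e)
)
print(result)
```

Here result holds an error message rather than a list, which suggests that select_func = NULL travels through the dots of the inner function down to the reader.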
The weird thing is that if there is no .csv file in the directory, no error is produced. I guess this happens because, when the first argument to sapply (i.e. files_list) is empty, the remaining arguments are never evaluated (R's lazy evaluation).
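A quick check of that guess, again assuming base R only: counting_fun and calls are hypothetical names for a function that counts its own invocations and would fail loudly if it were ever called. With an empty input, sapply appears never to call the function at all, so a mismatched argument has no chance to surface.

```r
# Counter that records how many times the function is actually called.
calls <- 0

# Would raise an error on any call, after bumping the counter.
counting_fun <- function(x, ...) {
  calls <<- calls + 1
  stop("never reached for empty input")
}

# Same call shape as in read_list_if, but with an empty character
# vector, which is what list.files() returns when nothing matches.
out <- sapply(character(0), counting_fun, select_func = NULL,
              simplify = FALSE, USE.NAMES = TRUE)

print(length(out))
print(calls)
```

Both values stay at zero, so the function body is never entered and the extra select_func argument is never matched against anything.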