My question is about how to specify the class for various columns when reading in data that come from many files. More specifically, I am uploading 1000s of .xlsx files at a time and converting them to .csv files using the read.xls() function in the gdata package.
My approach is as follows:
Myfiles<-list.files() # lists all files in working directory (which contains data files)
library(gdata)
Mylist <- lapply(Myfiles, read.xls, header=T,
perl="C:/Users/A/PERL/perl/bin/perl.exe",
sheet=1,
method="csv",
skip=1,
as.is=1)
I apologize for not providing a workable example. I'm not sure how to do so for this problem.
All the .xlsx files have identical headers and set-up, but the classes of corresponding columns in the data frames within Mylist are not all the same. Is there a way to specify the classes within the lapply() approach I am using? I know you can extend functions of read.table() to read.xls() but I haven't figured out how to specify the column classes properly within the lapply call.