A *.xlsx file is a ZIP archive that stores the workbook's data as a directory structure of XML files.
For example, there are:
/xl/workbook.xml, which describes the basic workbook structure;
/xl/worksheets/sheet1.xml, /xl/worksheets/sheet2.xml, ...
/xl/worksheets/sheetN.xml, which hold the sheet data. These contain
the rows and the cells, but not all cell data is stored there
directly, and neither are the cell styles;
/xl/styles.xml, which contains the cell styles;
/xl/sharedStrings.xml, which contains the string content of the cells
in all sheets. This avoids storing the same string many times when
it is used in multiple cells.
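To see this layout for yourself, you can list the parts of the archive with nothing but the JDK. The following is a minimal sketch, not anything apache poi provides: the class name XlsxEntries and the method listParts are made up for illustration. Note that inside the ZIP the entry names carry no leading slash (xl/workbook.xml rather than /xl/workbook.xml).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class XlsxEntries {

    // Hypothetical helper: returns the names of all ZIP entries (parts) in the archive.
    static List<String> listParts(String path) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(path)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: XlsxEntries <file.xlsx>");
            return;
        }
        // Prints one entry name per line, e.g. xl/workbook.xml
        for (String name : listParts(args[0])) {
            System.out.println(name);
        }
    }
}
```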
So if you want to read the *.xlsx ZIP archive, you need to unpack it and then parse at least the four XML files mentioned above to get the data for the XSSFWorkbook. This is what apache poi does in XSSFWorkbook wb = new XSSFWorkbook(fileinputstream);.
So if you really need an XSSFWorkbook as the result, there is no way around this process. And unless you suspect that apache poi has programmed explicit delay routines, there is no way to reduce the time this process takes.
Your approach of reading fewer rows than are stored in the sheet could possibly save time. But then the result would be an XSSFWorkbook containing all the styles and all the string contents but only part of the sheet data that refers to those styles and strings. That leads to a partially broken XSSFWorkbook, which is why nobody has seriously pursued this approach.
Only if the requirement is just to read the plain, unformatted data from one of the /xl/worksheets/sheetN.xml files, without creating an XSSFWorkbook, do you need only to unpack the ZIP archive and parse the needed /xl/worksheets/sheetN.xml plus /xl/sharedStrings.xml, from which the string content of the cells comes. This would take less time than the whole process described above.
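A rough sketch of that reduced process, using only the JDK (java.util.zip and the StAX parser) and not apache poi, could look like the following. The class PlainSheetReader and its methods are hypothetical names, and the sketch rests on several assumptions: the part name of the wanted sheet is already known, cells without a value are simply skipped (so column positions are not preserved), inline strings are ignored, and numbers and dates come back as the raw text stored in the file.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PlainSheetReader {

    // Opens a streaming XML reader with DTD processing switched off.
    static XMLStreamReader open(InputStream in) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        return factory.createXMLStreamReader(in);
    }

    // Reads xl/sharedStrings.xml: one string per <si>, concatenating all of its <t> elements.
    static List<String> readSharedStrings(ZipFile zip) throws Exception {
        List<String> strings = new ArrayList<>();
        ZipEntry entry = zip.getEntry("xl/sharedStrings.xml");
        if (entry == null) return strings; // assumed: no shared strings part, nothing to resolve
        try (InputStream in = zip.getInputStream(entry)) {
            XMLStreamReader xml = open(in);
            StringBuilder current = null;
            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if (name.equals("si")) current = new StringBuilder();
                    else if (name.equals("t") && current != null) current.append(xml.getElementText());
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && xml.getLocalName().equals("si") && current != null) {
                    strings.add(current.toString());
                    current = null;
                }
            }
        }
        return strings;
    }

    // Reads at most maxRows rows from one sheet part and stops parsing after that.
    static List<List<String>> readRows(ZipFile zip, String sheetPart,
                                       List<String> shared, int maxRows) throws Exception {
        List<List<String>> rows = new ArrayList<>();
        ZipEntry entry = zip.getEntry(sheetPart);
        if (entry == null) throw new IllegalArgumentException("No such part: " + sheetPart);
        try (InputStream in = zip.getInputStream(entry)) {
            XMLStreamReader xml = open(in);
            List<String> row = null;
            String cellType = null;
            while (xml.hasNext() && rows.size() < maxRows) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if (name.equals("row")) {
                        row = new ArrayList<>();
                    } else if (name.equals("c")) {
                        cellType = xml.getAttributeValue(null, "t"); // "s" marks a shared string
                    } else if (name.equals("v") && row != null) {
                        String raw = xml.getElementText();
                        // For shared strings, <v> holds an index into sharedStrings.xml.
                        row.add("s".equals(cellType) ? shared.get(Integer.parseInt(raw)) : raw);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && xml.getLocalName().equals("row") && row != null) {
                    rows.add(row);
                    row = null;
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: PlainSheetReader <file.xlsx> <maxRows>");
            return;
        }
        try (ZipFile zip = new ZipFile(args[0])) {
            List<String> shared = readSharedStrings(zip);
            // Assumption: the wanted sheet is stored as xl/worksheets/sheet1.xml.
            for (List<String> row : readRows(zip, "xl/worksheets/sheet1.xml",
                                             shared, Integer.parseInt(args[1]))) {
                System.out.println(row);
            }
        }
    }
}
```

Because the sheet is parsed as a stream, the loop can stop as soon as enough rows have been collected, which is where the time saving for a partial read would come from. The shared strings part still has to be read completely, since any cell may refer to any index in it.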