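I have a spreadsheet that I need to read with POI (I have it in both xls and xlsx formats), but in this case the problem is with the xls file. The spreadsheet has about 10,000 rows and 75 columns, and reading it can take several minutes (although Excel opens it in a few seconds). I am using event-based reading rather than loading the whole file into memory. The core of my code is below. It is a bit messy right now, but it is really just one long switch statement, mostly copied from the POI examples.
Is it normal for POI to be this slow with the event model? Is there anything I can do to speed it up? Several minutes is not acceptable for my application.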
POIFSFileSystem poifs = new POIFSFileSystem(fis);
InputStream din = poifs.createDocumentInputStream("Workbook");
try {
    HSSFRequest req = new HSSFRequest();
    listener = new FormatTrackingHSSFListener(new HSSFListener() {
        @Override
        public void processRecord(Record rec) {
            thisString = null;
            int sid = rec.getSid();
            switch (sid) {
            case SSTRecord.sid:
                // Shared string table, needed to resolve LabelSSTRecord values
                strTable = (SSTRecord) rec;
                break;
            case LabelSSTRecord.sid:
                LabelSSTRecord labelSstRec = (LabelSSTRecord) rec;
                thisString = strTable.getString(labelSstRec.getSSTIndex()).getString();
                row = labelSstRec.getRow();
                col = labelSstRec.getColumn();
                break;
            case RKRecord.sid:
                RKRecord rrk = (RKRecord) rec;
                thisString = "";
                row = rrk.getRow();
                col = rrk.getColumn();
                break;
            case LabelRecord.sid:
                LabelRecord lrec = (LabelRecord) rec;
                thisString = lrec.getValue();
                row = lrec.getRow();
                col = lrec.getColumn();
                break;
            case BlankRecord.sid:
                BlankRecord blrec = (BlankRecord) rec;
                thisString = "";
                row = blrec.getRow();
                col = blrec.getColumn();
                break;
            case BoolErrRecord.sid:
                BoolErrRecord berec = (BoolErrRecord) rec;
                row = berec.getRow();
                col = berec.getColumn();
                byte errVal = berec.getErrorValue();
                thisString = errVal == 0
                        ? Boolean.toString(berec.getBooleanValue())
                        : ErrorConstants.getText(errVal);
                break;
            case FormulaRecord.sid:
                FormulaRecord frec = (FormulaRecord) rec;
                switch (frec.getCachedResultType()) {
                case Cell.CELL_TYPE_NUMERIC:
                    double num = frec.getValue();
                    if (Double.isNaN(num)) {
                        // Formula result is a string
                        // This is stored in the next record
                        outputNextStringRecord = true;
                    } else {
                        thisString = formatNumericValue(frec, num);
                    }
                    break;
                case Cell.CELL_TYPE_BOOLEAN:
                    thisString = Boolean.toString(frec.getCachedBooleanValue());
                    break;
                case Cell.CELL_TYPE_ERROR:
                    thisString = HSSFErrorConstants.getText(frec.getCachedErrorValue());
                    break;
                case Cell.CELL_TYPE_STRING:
                    outputNextStringRecord = true;
                    break;
                }
                row = frec.getRow();
                col = frec.getColumn();
                break;
            case StringRecord.sid:
                if (outputNextStringRecord) {
                    // String for formula
                    StringRecord srec = (StringRecord) rec;
                    thisString = srec.getString();
                    outputNextStringRecord = false;
                }
                break;
            case NumberRecord.sid:
                NumberRecord numRec = (NumberRecord) rec;
                row = numRec.getRow();
                col = numRec.getColumn();
                thisString = formatNumericValue(numRec, numRec.getValue());
                break;
            case NoteRecord.sid:
                NoteRecord noteRec = (NoteRecord) rec;
                row = noteRec.getRow();
                col = noteRec.getColumn();
                thisString = "";
                break;
            case EOFRecord.sid:
                inSheet = false;
            }
            if (thisString != null) {
                // do something with the cell value
            }
        }
    });
    req.addListenerForAllRecords(listener);
    HSSFEventFactory factory = new HSSFEventFactory();
    factory.processEvents(req, din);
} finally {
    // Close the document stream even if parsing fails
    din.close();
}
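I have also processed thousands of large Excel files with POI, and in my experience it is very fast. Loading those same files in Excel itself also took about a minute. So I would first confirm whether the problem actually lies outside the POI code.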
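One way to check this is to time your own per-cell handling separately from the parsing. Below is a minimal sketch, assuming the "do something with the cell value" step can be expressed as a `Consumer`. `TimedConsumer` is a hypothetical helper written for this answer, not a POI class; it uses only the JDK.

```java
import java.util.function.Consumer;

// Hypothetical helper (not part of POI): wraps any per-cell callback and
// accumulates the wall-clock time spent inside it.
final class TimedConsumer<T> implements Consumer<T> {
    private final Consumer<T> delegate;
    private long totalNanos;
    private long calls;

    TimedConsumer(Consumer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void accept(T value) {
        long start = System.nanoTime();
        try {
            delegate.accept(value);
        } finally {
            // Count the call even if the delegate throws
            totalNanos += System.nanoTime() - start;
            calls++;
        }
    }

    long calls() { return calls; }

    long totalNanos() { return totalNanos; }

    double totalMillis() { return totalNanos / 1_000_000.0; }
}
```

Wrap your cell handler in a `TimedConsumer<String>`, call `accept(thisString)` where the listener currently has the "do something with the cell value" comment, and compare `totalMillis()` with the total time around `factory.processEvents(req, din)`. If most of the time is inside the handler, the bottleneck is in your code rather than in POI.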