Apache POI - FileInputStream works, File object fails (NullPointerException)


I am trying to copy all worksheets from one workbook to another. It works when I open the workbooks via FileInputStream, but it fails when I open them from File objects.

Consider the following method:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;

import org.apache.poi.EncryptedDocumentException;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DateUtil;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.xssf.usermodel.XSSFSheet;


public void copyAllSheetsAcrossWorkbook(String oldWorkbook, String newWorkbook)
        throws EncryptedDocumentException, InvalidFormatException, IOException {
    FileInputStream fisOld = null;
    FileInputStream fisNew = null;
    Workbook oldWB = null;
    Workbook newWB = null;
    FileOutputStream fileOut = null;

    System.out.println("oldWorkbook: " + oldWorkbook);
    System.out.println("newWorkbook: " + newWorkbook);
    fisOld = new FileInputStream(oldWorkbook);
    fisNew = new FileInputStream(newWorkbook);

    // THIS WORKS
    // oldWB = WorkbookFactory.create(fisOld);
    // newWB = WorkbookFactory.create(fisNew);

    // THIS DOES NOT WORK
    oldWB = WorkbookFactory.create(new File(oldWorkbook));
    newWB = WorkbookFactory.create(new File(newWorkbook));

    if (newWB == null) {
        System.out.println("newWB is null");
    }
    // CellStyle newStyle = newWB.createCellStyle();
    Row row;
    Cell cell;
    copiedSheets = new ArrayList<String>();
    for (int i = 0; i < oldWB.getNumberOfSheets(); i++) {
        XSSFSheet sheetFromOldWB = (XSSFSheet) oldWB.getSheetAt(i);
        String sheetNameFromOldWB = sheetFromOldWB.getSheetName();
        XSSFSheet sheetForNewWB = (XSSFSheet) newWB.getSheet(sheetNameFromOldWB);
        if (sheetForNewWB != null) {
            int sheetIndex = newWB.getSheetIndex(sheetNameFromOldWB);
            newWB.removeSheetAt(sheetIndex);
        }
        LOGGER.info("Copying to new Workbook: " + sheetNameFromOldWB);
        sheetForNewWB = (XSSFSheet) newWB.createSheet(sheetFromOldWB.getSheetName());
        for (int rowIndex = 0; rowIndex < sheetFromOldWB.getPhysicalNumberOfRows(); rowIndex++) {
            row = sheetForNewWB.createRow(rowIndex);
            for (int colIndex = 0; colIndex < sheetFromOldWB.getRow(rowIndex).getPhysicalNumberOfCells(); colIndex++) {
                cell = row.createCell(colIndex);
                // get cell from old WB's sheet and when cell is null, return as blank cells.
                Cell c = sheetFromOldWB.getRow(rowIndex).getCell(colIndex, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);

                // Below is where all the copying is happening.
                // CellStyle origStyle = c.getCellStyle();
                // newStyle.cloneStyleFrom(origStyle);
                // cell.setCellStyle(newStyle);
                switch (c.getCellTypeEnum()) {
                case STRING:
                    cell.setCellValue(c.getRichStringCellValue().getString());
                    break;
                case NUMERIC:
                    if (DateUtil.isCellDateFormatted(c)) { // check the source cell; the new cell has no style yet
                        cell.setCellValue(c.getDateCellValue());
                    } else {
                        cell.setCellValue(c.getNumericCellValue());
                    }
                    break;
                case BOOLEAN:
                    cell.setCellValue(c.getBooleanCellValue());
                    break;
                case FORMULA:
                    cell.setCellFormula(c.getCellFormula());
                    break;
                default:
                    break;
                }
            }
        }
        copiedSheets.add(oldWB.getSheetName(i));

    }
    fileOut = new FileOutputStream(newWorkbook);
    newWB.write(fileOut); // <------ HERE I GET NULLPOINTEREXCEPTION
    fisOld.close();
    fisNew.close();
    oldWB.close();
    fileOut.close();
    newWB.close();
}

I get the following exception at newWB.write(fileOut):

Exception in thread "main" org.apache.poi.POIXMLException: java.lang.NullPointerException
at org.apache.poi.POIXMLDocument.getProperties(POIXMLDocument.java:168)
at org.apache.poi.POIXMLDocument.write(POIXMLDocument.java:246)
at com.capgemini.toolkit.App.copyAllSheetsAcrossWorkbook(App.java:263)
at com.capgemini.toolkit.App.main(App.java:58)

Caused by: java.lang.NullPointerException
at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.read(ZipSecureFile.java:210)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at org.apache.poi.util.DocumentHelper.readDocument(DocumentHelper.java:140)
at org.apache.poi.POIXMLTypeLoader.parse(POIXMLTypeLoader.java:143)
at org.openxmlformats.schemas.officeDocument.x2006.extendedProperties.PropertiesDocument$Factory.parse(Unknown Source)
at org.apache.poi.POIXMLProperties.<init>(POIXMLProperties.java:78)
at org.apache.poi.POIXMLDocument.getProperties(POIXMLDocument.java:166)
... 3 more

The POI documentation consistently recommends using a File object because of its lower memory consumption. That is why I am wondering why it does not work with a File object.

For testing, this is the only method which is running in the main method and I used 2 fresh Excel files (.xlsx) with some dummy data.

Does anyone see why it does not work with a File object? Am I doing something wrong?

FYI: I'm using POI 3.16.

2 Answers

#1


7  

Opening a Workbook from a File instead of a FileInputStream gives a lower memory footprint because, in the case of XSSF (*.xlsx), the ZipPackage is opened directly from the *.xlsx file instead of reading the whole ZIP content into memory.

But this also means that the ZipPackage keeps the file open until the Workbook is closed, so nothing else can write to that file in the meantime. Since the Workbook content cannot be written back to the same file it was opened from, using a File instead of a FileInputStream is fine if you only want to read from that Workbook. It does not work if you want to read from and write to the same file. In that case a FileInputStream and a FileOutputStream are needed.

So in your case you are trying to read the Workbook newWB from a File and then write the Workbook into the same file using

fileOut = new FileOutputStream(newWorkbook);
newWB.write(fileOut);

while the file is already open. This fails.

But:

   fisNew = new FileInputStream(newWorkbook);
   oldWB = WorkbookFactory.create(new File(oldWorkbook));
   newWB = WorkbookFactory.create(fisNew);
...
   fileOut = new FileOutputStream(newWorkbook);
   newWB.write(fileOut);

   oldWB.close();
   newWB.close();

should work.
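
To illustrate why the stream variant can write back to the same file, here is a minimal JDK-only sketch (no POI involved). The helper `openDetached` and the class name `DetachedStreams` are hypothetical and not part of POI. The helper reads the whole file into memory and returns an in-memory stream, so no handle on the file stays open afterwards. This mirrors, in simplified form, the idea that a workbook created from a stream no longer depends on the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class DetachedStreams {

    private DetachedStreams() {
    }

    // Hypothetical helper (not a POI API): reads the whole file into memory and
    // returns a stream over that copy. The file itself is closed again before
    // this method returns, so it can be overwritten while the stream is in use.
    static InputStream openDetached(Path file) throws IOException {
        byte[] content = Files.readAllBytes(file);
        return new ByteArrayInputStream(content);
    }
}
```

Assuming this helper, the stream returned by `DetachedStreams.openDetached(Paths.get(newWorkbook))` could be passed to `WorkbookFactory.create(InputStream)`, and the result later written to a FileOutputStream on the same path. The price is the higher memory use that the File variant was meant to avoid.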

By the way: if you open a file via a File object, you should not also open a FileInputStream for the same file. So don't use fisOld.

Another disadvantage of opening a Workbook from a File instead of a FileInputStream is that closing the Workbook, and thereby implicitly closing the underlying file system (POIFSFileSystem in the case of HSSF, ZipPackage in the case of XSSF), gives the file an updated last-modified date. No changes are made to the file's content, but the file was opened and written anew to the file system. That is why the last-modified date is updated.


Edit Sep 21 2017: The disadvantage of using a File seems to be greater than first thought. OPCPackage.close also saves all changes into the underlying OPCPackage. So if you open an XSSFWorkbook from a file and then write the changes into another file using write(java.io.OutputStream stream), the source file will also be changed when the OPCPackage is closed. The problem only occurs if write(java.io.OutputStream stream) of XSSFWorkbook is used, since then POIXMLDocument.write is called, which calls POIXMLDocumentPart.onSave, which "Saves changes in the underlying OOXML package." So the OPCPackage is updated with all changes before closing.

Short example:

import org.apache.poi.ss.usermodel.*;

import java.io.File;
import java.io.FileOutputStream;

class ReadAndWriteExcelWorkbook {

 public static void main(String[] args) throws Exception {

  Workbook workbook  = WorkbookFactory.create(new File("file.xlsx"));

  Sheet sheet = workbook.getSheetAt(0);
  Row row = sheet.getRow(0);
  if (row == null) row = sheet.createRow(0);
  Cell cell = row.getCell(0);
  if (cell == null) cell = row.createCell(0);
  cell.setCellValue("changed");

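  // Close the output stream reliably once the workbook has been written.
  try (FileOutputStream out = new FileOutputStream("fileNew.xlsx")) {
   workbook.write(out);
  }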
  workbook.close();

 }
}

After running this code, both fileNew.xlsx and file.xlsx are changed.
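
One way to check these effects on your own files is to snapshot the source file before opening the workbook and compare it after closing. The following is a minimal JDK-only sketch. `FileSnapshot` and its methods are hypothetical names introduced for illustration, not part of POI. It records the last-modified time and a SHA-256 digest of the content.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class FileSnapshot {

    private final FileTime lastModified;
    private final byte[] sha256;

    private FileSnapshot(FileTime lastModified, byte[] sha256) {
        this.lastModified = lastModified;
        this.sha256 = sha256;
    }

    // Captures the current last-modified time and a digest of the file content.
    static FileSnapshot of(Path file) throws IOException {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(Files.readAllBytes(file));
            return new FileSnapshot(Files.getLastModifiedTime(file), hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // True if the file content was byte-for-byte identical at both snapshots.
    boolean sameContentAs(FileSnapshot other) {
        return Arrays.equals(sha256, other.sha256);
    }

    // True if the last-modified time was identical at both snapshots.
    boolean sameTimestampAs(FileSnapshot other) {
        return lastModified.equals(other.lastModified);
    }
}
```

Take one snapshot of file.xlsx before `WorkbookFactory.create(...)` and another after `workbook.close()`. If `sameContentAs` returns false, the source file was rewritten. If only `sameTimestampAs` returns false, only the last-modified date was touched.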

#2


1  

Just stumbled across a potential solution to this issue. I'm no expert, so feel free to suggest alternatives or modifications to my method.

I also encountered this issue, where the POI documentation advises using a File object rather than a FileInputStream, but fails to mention that the created Workbook cannot then be written to the original file to modify it.

However, by creating a temporary copy of the original file using the nio.channels.FileChannel.transferFrom function of the later JDKs (as shown in "Standard concise way to copy a file in Java?"), I was able to read my data from the duplicated file and then write to the original using the regular workbook.write function.

One caveat is that the 'temporary' copy still cannot be deleted while it is being accessed. However, it apparently can still have data transferred to it. Once the JVM instance ends, the file can be deleted, so I am treating it like the temporary or backup documents that are sometimes created, for instance when modifying a Word document.
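
The temporary-copy step could be sketched roughly as follows, using FileChannel.transferFrom as mentioned above. This is only an illustration under assumptions: the names `TempCopy` and `copyToTemp` and the temp-file prefix are made up here, and `deleteOnExit` is used because the copy may not be deletable while it is still open.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TempCopy {

    private TempCopy() {
    }

    // Hypothetical helper: copies the source file into a new temporary file
    // and returns the path of that copy.
    static Path copyToTemp(Path source) throws IOException {
        Path temp = Files.createTempFile("workbook-copy-", ".xlsx");
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(temp, StandardOpenOption.WRITE)) {
            long size = in.size();
            long position = 0;
            // transferFrom may move fewer bytes than requested, so loop until done.
            while (position < size) {
                long transferred = out.transferFrom(in, position, size - position);
                if (transferred <= 0) {
                    throw new IOException("Copy stopped early at byte " + position);
                }
                position += transferred;
            }
        }
        // Ask the JVM to remove the copy on exit, when it is no longer in use.
        temp.toFile().deleteOnExit();
        return temp;
    }
}
```

The workbook would then be opened from the returned temporary path and written to the original file with the regular workbook.write call.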

