On 10/10/2012 10:52, Jens Kapitza wrote:
On 10.10.2012 08:14, Sébastien Dailly wrote:
Hello,

Hello, thank you for your answer


We are encountering some errors with pdfbox on an AIX platform. We
don't know how to reproduce the problem, nor how to explain it…

Here is the stack trace:

java.io.IOException: No such file or directory
There is no such file; it would be better if you logged the file name of document.
at org.apache.pdfbox.io.RandomAccessFile.length(RandomAccessFile.java:83)
at org.apache.pdfbox.io.RandomAccessFileOutputStream.<init>(RandomAccessFileOutputStream.java:52)
at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:300)
at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:221)
at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:156)
at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:105)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:262)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
at com.gatetopost.util.pdf.GetPositions.processDocument(GetPositions.java:151)
at com.gatetopost.gestionnaires.[non pdfbox code]

The GetPositions class extends PDFTextStripper, and its processDocument
method is here:


public void processDocument(final File document, final int numPage,
final File tempDir) throws IOException {

Why don't you check whether document.exists()?

As always, this is just an extract of the whole code, and the check is made just before.

Maybe someone added a document very quickly and removed it before the
Java code could read the file.

Yes, we thought of that. We also thought that the temp directory might be cleaned during the process, but we have found three-month-old files…

    final PDDocument doc;
    try {
        final BufferedInputStream bis = new BufferedInputStream(
                new FileInputStream(document), 1024);
        try {
            final PDFParser parser = new PDFParser(bis);
            if ((tempDir != null) && tempDir.isDirectory()) {
                parser.setTempDirectory(tempDir);
            }
            parser.parse();
            doc = parser.getPDDocument();
        } finally {
            try {
                bis.close();
            } catch (final IOException e) {
                LOG.warn("", e);
            }
        }
    } catch (final IOException e) {
        throw e;
Error on this line?
    } catch (final Exception e) {
        // Space added so the file name is not glued to the message.
        LOG.error("Error reading " + document, e);
        return;
    }

    @SuppressWarnings("unchecked")
    final List<PDPage> allPages = doc.getDocumentCatalog().getAllPages();

    final PDPage page = allPages.get(numPage);
    processStream(page, page.findResources(), page.getContents()
            .getStream());

}

Can you help me understand what is happening?
You don't check your inputstream (BufferedInputStream)

I don't think that is the cause of the error: the exception is raised in processStream, so by then we have already loaded the document catalog, read the pages…

If the BufferedInputStream were invalid or null, the exception would have been raised earlier.

I couldn't reproduce the problem: if the file does not exist, we get a different exception and never reach the pdfbox parsing code…
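
To make the next occurrence easier to diagnose, the state of the input file could be written to the log together with its name. This is only a minimal sketch using plain java.io; FileState and describe are hypothetical names, not part of pdfbox or of our real code.

```java
import java.io.File;

// Hypothetical helper for log messages; not part of pdfbox.
final class FileState {
    private FileState() {
    }

    /** Builds a one-line description of a file, suitable for a log message. */
    static String describe(final File f) {
        if (f == null) {
            return "null";
        }
        // File.length() returns 0 when the file does not exist.
        return f.getAbsolutePath()
                + " [exists=" + f.exists()
                + ", isFile=" + f.isFile()
                + ", canRead=" + f.canRead()
                + ", length=" + f.length() + "]";
    }
}
```

It could be called from the catch block, for example LOG.error("Error reading " + FileState.describe(document), e), and once more for tempDir.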

It is as if pdfbox can open the file and create its own temp files, but fails when trying to read them… Maybe an environment problem? The same code works fine on other systems.
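
One way to test the environment hypothesis would be a small probe, run on the AIX machine, that creates a scratch file in the temp directory and asks for its length, which is the step where the stack trace fails. This sketch assumes that org.apache.pdfbox.io.RandomAccessFile delegates to java.io.RandomAccessFile, which I have not verified; TempDirProbe is a hypothetical name and uses only the JDK.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical diagnostic; not part of pdfbox.
final class TempDirProbe {
    private TempDirProbe() {
    }

    /**
     * Creates a scratch file in tempDir (or in the default temp directory
     * when tempDir is null), writes four bytes, reads the length back and
     * removes the file. Throws IOException if any step fails.
     */
    static long probe(final File tempDir) throws IOException {
        final File scratch = File.createTempFile("probe", ".tmp", tempDir);
        try {
            final RandomAccessFile raf = new RandomAccessFile(scratch, "rw");
            try {
                raf.write(new byte[] {1, 2, 3, 4});
                // Same kind of call as the one failing in the stack trace.
                return raf.length();
            } finally {
                raf.close();
            }
        } finally {
            // Fall back to deletion at JVM exit if the file cannot be removed now.
            if (!scratch.delete()) {
                scratch.deleteOnExit();
            }
        }
    }
}
```

If probe(tempDir) fails intermittently with the same IOException, the problem would be in the file system or temp directory rather than in pdfbox; if it never fails, that points elsewhere.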

Thanks
