Andi Vajda wrote:
> > After both these fixes, I was able to build wrappers for pdfbox:
> >
> > >>> from pdfbox import *
> > >>> initVM(CLASSPATH, vmargs='-Djava.awt.headless=true')
> > <jcc.JCCEnv object at 0x295c0>
> > >>>
> >
> > This is all checked into rev 751772.
> >
> > Please let me know if this works for you, I'd like to get a PyLucene
> > 2.4.1 release started now that Java Lucene 2.4.1 has been released. If I
> > broke something while doing these non-trivial fixes, now is the time to
> > find out.

Thanks Andi! I was able to build a pdfbox wrapper with your changes, too. The changes to setup.py make it much easier to get the script working. Good work!

As a JCC and Java newbie I didn't understand the difference between --jar, --include and --classpath at first. Could you please extend the README to explain the three options?

Today I started to play with subclassable Python wrappers. I couldn't get the appended example to work. I ran into several issues like "SystemError: NULL result without error in PyObject_Call". Could you have a look, please? The jar with PyPDFTextStripper was wrapped together with the pdfbox jar.

public class PyPDFTextStripper extends PDFTextStripper {

    private PDDocument document;
    private long pythonObject;

    public PyPDFTextStripper(String filename) throws IOException {
        System.out.println( "Loading: " + filename );
        document = PDDocument.load(filename);
        List allPages = document.getDocumentCatalog().getAllPages();
        startDocument(document);
        for( int i=0; i<allPages.size(); i++ ) {
            PDPage page = (PDPage)allPages.get( i );
            System.out.println( "Processing page: " + i );
            PDStream contents = page.getContents();
            if( contents != null ) {
                processStream(page, page.findResources(),
                              page.getContents().getStream());
            }
        }
    }

    public void pythonExtension(long pythonObject) {
        this.pythonObject = pythonObject;
    }

    public long pythonExtension() {
        return this.pythonObject;
    }

    public void finalize() throws Throwable {
        pythonDecRef();
    }

    public native void pythonDecRef();
    public native void processTextPosition( TextPosition text );
    public native void startDocument(PDDocument pdf);
    public native void startPage(PDPage page);
}


import pdfbox

pdfbox.initVM(classpath=pdfbox.CLASSPATH)

class Stripper(pdfbox.PyPDFTextStripper):
    """
    """
    def processTextPosition(self, text):
        print text

    def startDocument(self, doc):
        print doc

    def startArticle(self, isltr):
        print isltr