Hi Andi,

> ...
> And the very same could be done for java.util.ArrayList. It should be easy
> enough by following the JavaSet/PythonSet example.
>
> If you send in a patch that implements this, I'd be glad to integrate it !

I've now implemented the PythonList as suggested - sorry for the late reply, I was busy with other things and didn't actually need it myself, but I think it's useful for JCC/PyLucene anyway.

I needed to add a PythonListIterator "wrapper" as well, and needed to change the pythonObject reference in PythonIterator from private to

  protected long pythonObject;

Hope that doesn't break anything...

BTW, are there any tests for the Java/PythonXXX classes yet?

The build ran fine and I was able to instantiate both JavaSet and JavaList in a Python shell (see example below).

There is one problem/BUG left though: when creating a java.util.ArrayList from a JavaList that wraps a Python list of ints, a TypeError occurs. It does work for a list of str (see below). The toArray() method (implemented as in the JavaSet class) seems to be the cause of the problem:

>>> l=range(3)
[0, 1, 2]
>>> pl= collections.JavaList(l)
<JavaList: org.apache.pylucene.util.PythonList@12d96f2>
>>> pl.toArray()
[0, 1, 2]
>>> jl = lucene.ArrayList(pl)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
lucene.JavaError: org.apache.jcc.PythonException: ('while calling', 'toArray', [0, 1, 2])
TypeError: ('while calling', 'toArray', [0, 1, 2])

    Java stacktrace:
org.apache.jcc.PythonException: ('while calling', 'toArray', [0, 1, 2])
TypeError: ('while calling', 'toArray', [0, 1, 2])

        at org.apache.pylucene.util.PythonList.toArray(Native Method)
        at java.util.ArrayList.<init>(ArrayList.java:131)

I guess the ints need to be cast to Objects somehow. Interestingly, this already happens in the wrapped Java classes such as HashSet:

>>> s = set(l)
>>> ps = collections.JavaSet(s)
<JavaSet: org.apache.pylucene.util.PythonSet@af993e>
>>> ps.toArray()
[0, 1, 2]
>>> js = lucene.HashSet(ps)
<HashSet: [0, 1, 2]>
>>> js.toArray()
JArray<object>[<Object: 0>, <Object: 1>, <Object: 2>]

I'm not sure how to fix this and would welcome suggestions. Is there some helper method for type-safe 'Python2Java casting' that should be used?

Some further (minor) remarks:

I was wondering about "compatibility" with Java interfaces, i.e.

- do we need to implement this method?
    public native int hashCode();
  (currently not implemented by PythonList and PythonSet)

- do we need to replace Python exceptions with their Java counterparts?
  e.g. IndexError -> IndexOutOfBoundsException
  I've added some comments in the code where this could (should?) be done:
  e.g.
    # TODO: raise JavaError for IndexOutOfBoundsException!?
    # TODO: raise JavaError for NoSuchElementException - if the iteration has no next element!?

- how to handle/implement Java overloads with the same name and number of
  args (differing only in argument type) in Python?
  e.g.
    public native Object remove(int index);
    public native boolean remove(Object obj);
  In this particular case I've implemented one Python method and did a type
  check; not sure if that's optimal or needed at all (does JCC handle this
  already?).

TODO:
- resolve TypeError for JavaList when created from Python list of ints
- remove comments (or change Exceptions)
- write a test case (note: there was a bug/typo in JavaSet.retainAll() - fixed)
- merge with trunk

Attached is a patch against
http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_5/
(revision 1292224)

I wouldn't recommend integrating the code until the mentioned bug is resolved. Of course I'm willing to finalize this properly but need some help at this point.

kind regards
Thomas
--
OrbiTeam Software GmbH & Co. KG
53121 Bonn - Germany
http://www.orbiteam.de
--

P.S. And here is an example for those of you who ask "what are they talking about?"

>>> import lucene
>>> lucene.initVM()
<jcc.JCCEnv object at 0x01390AF0>
>>> l = ['a','b','c']
>>> import collections
>>> pl = collections.JavaList(l)
>>> pl.size()
3
>>> jl = lucene.ArrayList(pl)
>>> jl
<ArrayList: [a, b, c]>
>>>

Now we have created an instance of a java.util.ArrayList with a Python "native" list (l) wrapped by the collections.JavaList as constructor argument.

>>> s = set(l)
>>> ps = collections.JavaSet(s)
>>> ps
<JavaSet: org.apache.pylucene.util.PythonSet@160a26f>
>>> ps.size()
3
>>> js = lucene.HashSet(ps)
>>> js
<HashSet: [b, c, a]>

Now we have created an instance of a java.util.HashSet with a Python "native" set (s) wrapped by the collections.JavaSet as constructor argument.

(BTW, I found it difficult to understand why the class implemented in Java is called PythonSet whereas the one implemented in Python - which wraps the Java counterpart - is called JavaSet, but that's just a comment and depends on the point of view.)

> -----Original Message-----
> From: Andi Vajda [mailto:va...@apache.org]
> Sent: Wednesday, February 1, 2012 19:08
> To: pylucene-dev@lucene.apache.org
> Subject: Re: AW: Setting Stopword Set in PyLucene (or using Set in general)
>
>
> Hi Thomas,
>
> On Wed, 1 Feb 2012, Thomas Koch wrote:
>
> > OK, I found a solution (obviously not the best one...): lucene.Set is
> > representing a java.util *interface* Set<E> which of course cannot be
> > instantiated. HashSet is an implementing class, and can be
> > instantiated. You can add elements via the add() method to the set then.
> > Example:
> >
> > def get_lucene_set(python_list):
> >     """convert python list into lucene.Set (java.util.Set interface)
> >        using the HashSet class (java.util) wrapped in lucene.HashSet
> >     """
> >     hs = lucene.HashSet()
> >     for el in python_list:
> >         hs.add(el)
> >     return hs
> >
> > However I'm still looking for a more elegant constructor that would
> > allow to create a HashSet from a python set (or list). Is that
> > available/possible?
>
> In pylucene's python directory, there is a file called collections.py that has
> what you're looking for, I think.
>
> It's a Python class called JavaSet, that extends a PythonSet class which is an
> extension point for the java.util.Set interface. PythonSet implements all the
> java.util.Set methods by calling the corresponding python methods on the
> JavaSet python class. PythonSet itself is defined in
> java/org/apache/pylucene/util/PythonSet.java.
>
> With this pair of classes you have a Python-backed set object being
> integrated with Java via a java.util.Set implementation.
>
> > The same holds for lists like the ArrayList (from java.util too) which
> > implements the Collection interface:
>
> And the very same could be done for java.util.ArrayList. It should be easy
> enough by following the JavaSet/PythonSet example.
>
> If you send in a patch that implements this, I'd be glad to integrate it !
>
> Andi..
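
For reference, on the remove(int index) / remove(Object obj) question in the remarks above: the "one Python method plus a type check" approach can be sketched in plain Python, without JCC or lucene. This is only an illustration under assumptions, not the code from the attached patch: ListWrapper is a hypothetical stand-in for JavaList, and how JCC itself dispatches such overloads is not covered here.

```python
class ListWrapper(object):
    """Hypothetical stand-in for JavaList; wraps a plain Python list."""

    def __init__(self, items):
        self._list = items

    def remove(self, index_or_obj):
        # One Python method standing in for both Java overloads:
        #   Object  remove(int index)  -> returns the removed element
        #   boolean remove(Object obj) -> returns True if an element was removed
        # bool is excluded explicitly because it is a subclass of int in Python.
        if isinstance(index_or_obj, int) and not isinstance(index_or_obj, bool):
            # Raises IndexError if the index is out of range.
            return self._list.pop(index_or_obj)
        try:
            self._list.remove(index_or_obj)
            return True
        except ValueError:
            # Element not present: mirror Java's "false" return value.
            return False


w = ListWrapper(['a', 'b', 'c'])
removed = w.remove(0)      # by index: removes and returns 'a'
found = w.remove('c')      # by value: removes 'c'
missing = w.remove('z')    # by value: nothing to remove
```

One consequence of this design: for a list of ints, remove(1) always means "remove the element at index 1", which matches what Java selects for a primitive int argument, but removing an int by value then needs a separate path.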