Hi Andi,
> ...
> And the very same could be done for java.util.ArrayList. It should be easy
> enough by following the JavaSet/PythonSet example.
>
> If you send in a patch that implements this, I'd be glad to integrate it !
I've now implemented PythonList as suggested. Sorry for the late reply: I was
busy with other things and didn't actually need it myself, but I think it's
useful for JCC/PyLucene anyway.
I also needed to add a PythonListIterator "wrapper" and to change the
pythonObject field in PythonIterator from private to protected:
    protected long pythonObject;
Hope that doesn't break anything... BTW, are there any tests for the
Java/PythonXXX classes yet? The build did run fine and I was able to
instantiate both JavaSet and JavaList in a Python shell (see example below).
There is one problem/BUG left though: when a java.util.ArrayList is created
from a JavaList that wraps a python list of ints, a TypeError occurs. It does
work for a list of str (see below).
The toArray() method (implemented as in the JavaSet class) seems to be the
cause of the problem:
>>> l=range(3)
[0, 1, 2]
>>> pl= collections.JavaList(l)
<JavaList: org.apache.pylucene.util.PythonList@12d96f2>
>>> pl.toArray()
[0, 1, 2]
>>> jl = lucene.ArrayList(pl)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
lucene.JavaError: org.apache.jcc.PythonException: ('while calling', 'toArray', [0, 1, 2])
TypeError: ('while calling', 'toArray', [0, 1, 2])
    Java stacktrace:
org.apache.jcc.PythonException: ('while calling', 'toArray', [0, 1, 2])
TypeError: ('while calling', 'toArray', [0, 1, 2])
    at org.apache.pylucene.util.PythonList.toArray(Native Method)
    at java.util.ArrayList.<init>(ArrayList.java:131)
I guess the ints need to be cast to Objects somehow. Interestingly, this is
already done in wrapped Java classes such as HashSet:
>>> s = set(l)
>>> ps = collections.JavaSet(s)
<JavaSet: org.apache.pylucene.util.PythonSet@af993e>
>>> ps.toArray()
[0, 1, 2]
>>> js = lucene.HashSet(ps)
<HashSet: [0, 1, 2]>
>>> js.toArray()
JArray<object>[<Object: 0>, <Object: 1>, <Object: 2>]
I'm not sure how to fix this and would welcome suggestions.
Is there some helper method for type-safe 'Python2Java casting' that should
be used?
Some further (minor) remarks: I was wondering about "compatibility" with
Java interfaces, i.e.
- do we need to implement this method?
public native int hashCode();
(currently not implemented by PythonList and PythonSet)
- do we need to replace Python exceptions with their Java counterparts?
e.g. IndexError -> IndexOutOfBoundsException
I've added some comments in the code where this could (should?) be done:
e.g.
# TODO: raise JavaError for IndexOutOfBoundsException!?
# TODO: raise JavaError for NoSuchElementException - if the
#       iteration has no next element!?
- how to handle/implement overloaded methods (same name, same number of
args, different argument types) in Python?
e.g.
public native Object remove(int index);
public native boolean remove(Object obj);
In this particular case I've implemented one Python method and did a type
check, not sure if that's optimal or needed at all (does JCC handle this
already?).
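A minimal sketch of such a type-check dispatch, written without lucene so it
can be run stand-alone. All names here (ListWrapper, IndexOutOfBounds) are
hypothetical and are not the code from the patch; the exception class is only
a stand-in for a Java IndexOutOfBoundsException, which in the real class would
presumably have to be raised through JCC.

```python
class IndexOutOfBounds(Exception):
    """Stand-in for java.lang.IndexOutOfBoundsException (hypothetical)."""


class ListWrapper(object):
    """Hypothetical Python-side list wrapper, not the class from the patch."""

    def __init__(self, items):
        self._items = list(items)

    def remove(self, arg):
        # Java overloads remove(int index) and remove(Object obj). Python has
        # only one method per name, so dispatch on the argument type.
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(arg, int) and not isinstance(arg, bool):
            # remove(int index): returns the removed element.
            if not 0 <= arg < len(self._items):
                # Python would accept negative indexes; Java does not.
                raise IndexOutOfBounds("index: %d, size: %d"
                                       % (arg, len(self._items)))
            return self._items.pop(arg)
        # remove(Object obj): returns True if the list contained the element.
        try:
            self._items.remove(arg)
            return True
        except ValueError:
            return False


wrapper = ListWrapper(['a', 'b', 'c'])
removed = wrapper.remove(1)        # index variant
found = wrapper.remove('a')        # object variant
missing = wrapper.remove('z')      # object variant, element not present
```

Note that for a list of ints the check always picks the index variant. Java
has the same ambiguity: on a List<Integer>, remove(1) resolves to
remove(int index) unless the argument is boxed explicitly.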
TODO:
- resolve TypeError for JavaList when created from python list of ints
- remove comments (or change Exceptions)
- write a test case (note: there was a bug/typo in JavaSet.retainAll() -
fixed)
- merge with trunk
Attached is a patch against
http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_5/
(revision 1292224)
I wouldn't recommend integrating the code until the mentioned bug is
resolved.
Of course I'm willing to finalize this properly but need some help at this
point.
kind regards
Thomas
--
OrbiTeam Software GmbH & Co. KG
53121 Bonn - Germany
http://www.orbiteam.de
--
P.S. And here is an example for those of you who ask "what are they talking
about?"
>>> import lucene
>>> lucene.initVM()
<jcc.JCCEnv object at 0x01390AF0>
>>> l = ['a','b','c']
>>> import collections
>>> pl = collections.JavaList(l)
>>> pl.size()
3
>>> jl = lucene.ArrayList(pl)
>>> jl
<ArrayList: [a, b, c]>
>>>
now we have created an instance of a java.util.ArrayList with a python
"native" list (l) wrapped by the collections.JavaList as constructor
argument
>>> s = set(l)
>>> ps = collections.JavaSet(s)
>>> ps
<JavaSet: org.apache.pylucene.util.PythonSet@160a26f>
>>> ps.size()
3
>>> js = lucene.HashSet(ps)
>>> js
<HashSet: [b, c, a]>
now we have created an instance of a java.util.HashSet with a python "native"
set (s) wrapped by the collections.JavaSet as constructor argument
(BTW, I found it difficult to understand why the class implemented in Java
is called PythonSet whereas the one implemented in Python, which wraps its
Java counterpart, is called JavaSet, but that's just a comment and depends on
the point of view)
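To illustrate the naming, here is a minimal sketch of the Python half of such
a pair. It assumes nothing about JCC: SetAdapter is a hypothetical stand-in
for JavaSet, and the PythonSet base class (the Java extension point that
forwards each java.util.Set call to the Python method of the same name) is
left out so the example runs without lucene. Only a few of the
java.util.Set methods are shown.

```python
class SetAdapter(object):
    """Hypothetical stand-in for JavaSet: makes a Python set answer to
    java.util.Set method names. The real class extends PythonSet."""

    def __init__(self, _set):
        self._set = _set

    def add(self, obj):
        # java.util.Set.add returns true only if the set changed.
        if obj not in self._set:
            self._set.add(obj)
            return True
        return False

    def contains(self, obj):
        return obj in self._set

    def isEmpty(self):
        return len(self._set) == 0

    def size(self):
        return len(self._set)

    def toArray(self):
        # Returned as a Python list; converting the elements to Java
        # objects is not handled in this sketch.
        return list(self._set)


s = set(['a', 'b'])
adapter = SetAdapter(s)
first_add = adapter.add('c')     # element is new
second_add = adapter.add('c')    # element is already present
```

Seen from Java, the object is a Set backed by Python (hence PythonSet); seen
from Python, it is the class that makes a set usable from Java (hence JavaSet).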
> -----Original Message-----
> From: Andi Vajda [mailto:[email protected]]
> Sent: Wednesday, 1 February 2012 19:08
> To: [email protected]
> Subject: Re: AW: Setting Stopword Set in PyLucene (or using Set in general)
>
>
> Hi Thomas,
>
> On Wed, 1 Feb 2012, Thomas Koch wrote:
>
> > OK, I found a solution (obviously not the best one...): lucene.Set is
> > representing a java.util *interface* Set<E> which of course cannot be
> > instantiated. HashSet is an implementing class, and can be
> > instantiated. You can add elements via the add() method to the set then.
> > Example:
> >
> > def get_lucene_set(python_list):
> >     """convert python list into lucene.Set (java.util.Set interface)
> >     using the HashSet class (java.util) wrapped in lucene.HashSet
> >     """
> >     hs = lucene.HashSet()
> >     for el in python_list:
> >         hs.add(el)
> >     return hs
> >
> > However I'm still looking for a more elegant constructor that would
> > allow to create a HashSet from a python set (or list). Is that
> > available/possible?
>
> In pylucene's python directory, there is a file called collections.py that
> has what you're looking for, I think.
>
> It's a Python class called JavaSet, that extends a PythonSet class which is
> an extension point for the java.util.Set interface. PythonSet implements all
> the java.util.Set methods by calling the corresponding python methods on the
> JavaSet python class. PythonSet itself is defined in
> java/org/apache/pylucene/util/PythonSet.java.
>
> With this pair of classes you have a Python-backed set object being
> integrated with Java via a java.util.Set implementation.
>
> > The same holds for lists like the ArrayList (from java.util too) which
> > implements the Collection interface:
>
> And the very same could be done for java.util.ArrayList. It should be easy
> enough by following the JavaSet/PythonSet example.
>
> If you send in a patch that implements this, I'd be glad to integrate it !
>
> Andi..
>