Hi Andi,

> ...
> And the very same could be done for java.util.ArrayList. It should be easy
> enough by following the JavaSet/PythonSet example.
> 
> If you send in a patch that implements this, I'd be glad to integrate it !

I've now implemented PythonList as suggested. Sorry for the late reply: I was
busy with other things and didn't actually need it myself, but I think it's
useful for JCC/PyLucene anyway.

I also needed to add a PythonListIterator "wrapper" and to change the
pythonObject reference in PythonIterator from private to protected:
    protected long pythonObject;

Hope that doesn't break anything... BTW, are there any tests for the
Java/PythonXXX classes yet? The build did run fine and I was able to
instantiate both JavaSet and JavaList in a Python shell (see example below).
There is one problem/BUG left though: when a lucene.ArrayList is constructed
from a JavaList that wraps a Python list of ints, a TypeError occurs. It does
work for a list of str (see below).

The toArray() method (implemented as in the JavaSet class) seems to be the
cause of the problem:

>>> l = range(3)
>>> l
[0, 1, 2]
>>> pl = collections.JavaList(l)
>>> pl
<JavaList: org.apache.pylucene.util.PythonList@12d96f2>
>>> pl.toArray()
[0, 1, 2]
>>> jl = lucene.ArrayList(pl)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
lucene.JavaError: org.apache.jcc.PythonException: ('while calling',
'toArray', [0, 1, 2])
TypeError: ('while calling', 'toArray', [0, 1, 2])  Java stacktrace:
org.apache.jcc.PythonException: ('while calling', 'toArray', [0, 1, 2])
TypeError: ('while calling', 'toArray', [0, 1, 2])
        at org.apache.pylucene.util.PythonList.toArray(Native Method)
        at java.util.ArrayList.<init>(ArrayList.java:131)

I guess the ints need to be cast to Objects somehow. Interestingly, this
already happens for wrapped Java classes such as HashSet:

>>> s = set(l)
>>> ps = collections.JavaSet(s)
>>> ps
<JavaSet: org.apache.pylucene.util.PythonSet@af993e>
>>> ps.toArray()
[0, 1, 2]
>>> js = lucene.HashSet(ps)
>>> js
<HashSet: [0, 1, 2]>
>>> js.toArray()
JArray<object>[<Object: 0>, <Object: 1>, <Object: 2>]

I'm not sure how to fix this and would welcome suggestions.
Is there some helper method for type-safe 'Python2Java casting' that should
be used?

Some further (minor) remarks: I was wondering about "compatibility" with
Java interfaces, i.e.
 - do we need to implement this method?
    public native int hashCode();
 (currently not implemented by PythonList and PythonSet)

 - do we need to replace Python exceptions with their Java counterparts?
  e.g. IndexError -> IndexOutOfBoundsException
  I've added some comments in the code where this could (should?) be done:
  e.g.
        # TODO: raise JavaError for IndexOutOfBoundsException!?
        # TODO: raise JavaError for NoSuchElementException - if the
        #       iteration has no next element!?
        
 - how to handle/implement overloaded Java methods (same name and same number
of args) in Python?
 e.g.
  public native Object remove(int index);
  public native boolean remove(Object obj);
 
In this particular case I've implemented a single Python method that does a
type check. I'm not sure whether that's optimal or needed at all (does JCC
handle this already?).
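
To make the type-check idea concrete, here is a rough sketch of the dispatch.
It is not the code from the patch: the class name ListRemoveSketch is made up
for illustration, and it is plain Python that does not extend the
JCC-generated PythonList, so it runs without a JVM.

```python
class ListRemoveSketch(object):
    """Hypothetical stand-alone model of a single remove() serving both
    java.util.List.remove(int index) and remove(Object obj)."""

    def __init__(self, lst):
        self._lst = lst

    def remove(self, arg):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(arg, int) and not isinstance(arg, bool):
            # remove(int index): return the removed element.
            return self._lst.pop(arg)
        # remove(Object obj): return True if the list changed.
        try:
            self._lst.remove(arg)
            return True
        except ValueError:
            return False
```

Note the ambiguity this leaves for a list of ints: remove(2) is always treated
as "remove the element at index 2", never as "remove the value 2". Java
resolves that at compile time from the declared argument type, which a Python
type check cannot reproduce.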

TODO: 
- resolve TypeError for JavaList when created from python list of ints
- remove comments (or change Exceptions)
- write a test case (note: there was a bug/typo in JavaSet.retainAll() -
fixed)
- merge with trunk
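
For the "change Exceptions" item, one possible shape is a small decorator
that translates Python exceptions at the method boundary. This is only a
sketch under assumptions: IndexOutOfBoundsSketch and NoSuchElementSketch are
made-up stand-ins for the Java exceptions (the real code would presumably
raise a JavaError instead), and java_style_errors and ListAccessSketch are
hypothetical names that are not part of the patch.

```python
import functools


class IndexOutOfBoundsSketch(Exception):
    """Hypothetical stand-in for java.lang.IndexOutOfBoundsException."""


class NoSuchElementSketch(Exception):
    """Hypothetical stand-in for java.util.NoSuchElementException."""


# (Python exception, stand-in for the Java exception) pairs, checked in order.
TRANSLATIONS = [
    (IndexError, IndexOutOfBoundsSketch),
    (StopIteration, NoSuchElementSketch),
]


def java_style_errors(method):
    """Re-raise known Python exceptions as their Java-style stand-ins."""
    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except Exception as exc:
            for python_exc, java_exc in TRANSLATIONS:
                if isinstance(exc, python_exc):
                    raise java_exc(*exc.args)
            raise  # anything else propagates unchanged
    return wrapper


class ListAccessSketch(object):
    """Hypothetical list wrapper with one translated accessor."""

    def __init__(self, lst):
        self._lst = lst

    @java_style_errors
    def get(self, index):
        return self._lst[index]
```

One difference remains even with such a translation: Python accepts negative
indexes (get(-1) returns the last element), whereas java.util.List.get throws
for them, so an explicit range check would still be needed.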

Attached is a patch against
http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_5/
(revision 1292224)

I wouldn't recommend integrating the code until the mentioned bug is
resolved.
Of course I'm willing to finalize this properly but need some help at this
point.

kind regards

Thomas 
--
OrbiTeam Software GmbH & Co. KG
53121 Bonn - Germany
http://www.orbiteam.de
--
P.S.  And here is an example for those of you who ask "what are they talking
about?"  

>>> import lucene
>>> lucene.initVM()
<jcc.JCCEnv object at 0x01390AF0>
>>> l = ['a','b','c']
>>> import collections
>>> pl = collections.JavaList(l)
>>> pl.size()
3
>>> jl = lucene.ArrayList(pl)
>>> jl
<ArrayList: [a, b, c]>
>>>

Now we have created an instance of java.util.ArrayList, passing a Python
"native" list (l) wrapped in a collections.JavaList as the constructor
argument.

>>> s = set(l)
>>> ps = collections.JavaSet(s)
>>> ps
<JavaSet: org.apache.pylucene.util.PythonSet@160a26f>
>>> ps.size()
3
>>> js = lucene.HashSet(ps)
>>> js
<HashSet: [b, c, a]>


Now we have created an instance of java.util.HashSet, passing a Python
"native" set (s) wrapped in a collections.JavaSet as the constructor argument.

(BTW, I found it difficult to understand why the class implemented in Java
is called PythonSet whereas the one implemented in Python, which wraps the
Java counterpart, is called JavaSet, but that's just a comment and depends on
the point of view.)
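
To illustrate the Python half of such a pair, here is a simplified model: a
class that holds a Python set and exposes it through java.util.Set-style
method names and return values. The name JavaSetSketch is made up, only a few
methods are shown, and unlike the real JavaSet it does not extend the
Java-side PythonSet, so it runs without a JVM.

```python
class JavaSetSketch(object):
    """Hypothetical stand-alone model: a Python set behind Set-style methods."""

    def __init__(self, _set):
        self._set = _set

    def add(self, obj):
        # java.util.Set.add returns True only if the set changed.
        if obj in self._set:
            return False
        self._set.add(obj)
        return True

    def contains(self, obj):
        return obj in self._set

    def isEmpty(self):
        return len(self._set) == 0

    def remove(self, obj):
        # java.util.Set.remove returns True if the element was present.
        try:
            self._set.remove(obj)
            return True
        except KeyError:
            return False

    def size(self):
        return len(self._set)

    def toArray(self):
        return list(self._set)
```

In the real pair, the Java class forwards each java.util.Set call to a Python
method of the same name, which is why the Python class carries Java-style
method names.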

> -----Original Message-----
> From: Andi Vajda [mailto:va...@apache.org]
> Sent: Wednesday, 1 February 2012 19:08
> To: pylucene-dev@lucene.apache.org
> Subject: Re: AW: Setting Stopword Set in PyLucene (or using Set in general)
> 
> 
>   Hi Thomas,
> 
> On Wed, 1 Feb 2012, Thomas Koch wrote:
> 
> > OK, I found a solution (obviously not the best one...): lucene.Set is
> > representing a java.util *interface* Set<E> which of course cannot be
> > instantiated. HashSet is an implementing class, and can be
> > instantiated. You can add elements via the add() method to the set then.
> > Example:
> >
> > def get_lucene_set(python_list):
> >    """convert python list into lucene.Set (java.util.Set interface)
> >          using the HashSet class (java.util) wrapped in lucene.HashSet
> >    """
> >    hs = lucene.HashSet()
> >    for el in python_list:
> >        hs.add(el)
> >    return hs
> >
> > However I'm still looking for a more elegant constructor that would
> > allow to create a HashSet from a python set (or list). Is that
> > available/possible?
> 
> In pylucene's python directory, there is a file called collections.py that
> has what you're looking for, I think.
> 
> It's a Python class called JavaSet, that extends a PythonSet class which is
> an extension point for the java.util.Set interface. PythonSet implements all
> the java.util.Set methods by calling the corresponding python methods on the
> JavaSet python class. PythonSet itself is defined in
> java/org/apache/pylucene/util/PythonSet.java.
> 
> With this pair of classes you have a Python-backed set object being
> integrated with Java via a java.util.Set implementation.
> 
> > The same holds for lists like the ArrayList (from java.util too) which
> > implements the Collection interface:
> 
> And the very same could be done for java.util.ArrayList. It should be easy
> enough by following the JavaSet/PythonSet example.
> 
> If you send in a patch that implements this, I'd be glad to integrate it !
> 
> Andi..
> 

