Hi Thomas,
On Fri, 2 Mar 2012, Thomas Koch wrote:
> thanks for the feedback! I revised the code and send you attached a new
> patch.
Sorry for the delay in getting back to you.
I integrated your patch and fixed a bunch of formatting and bugs in it.
The collections-demo.py is not fully functional yet so I attach it here too,
somewhat fixed up as well.
There is a bug somewhere with constructing an ArrayList from a Python
collection like JavaSet or JavaList. At some point, toArray() gets called and
the right array is returned (almost, see below), but the ArrayList looks as if
it were built from an array of empty objects.
I also attach a short demo script that shows the problems I mentioned
earlier when trying to initialize an ArrayList with a JavaSet (or JavaList)
containing integers.
For that, the toArray() methods in collections.py must create the correct
array type, using int, float, etc. instead of object, based on what's in the
Python object.
Alternatively, these methods need to box the int values by
wrapping them in a Java Integer object (for example, lucene.Integer(5)).
I leave that to you to continue with, I'm out of time for right now :-)
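As a rough sketch of the first option, here is one way the element type could be chosen from the contents of the Python collection. It is plain Python 3 so that it runs without a JVM; array_type_for is a hypothetical helper name, not something that exists in collections.py, and the JArray call in the trailing comment is only how I would expect it to be wired in.

```python
def array_type_for(values):
    """Pick int or float when every value fits, else fall back to object."""
    values = list(values)
    if not values:
        return object
    # bool is a subclass of int in Python, so exclude it explicitly
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return int
    if all(isinstance(v, float) for v in values):
        return float
    return object

# Hypothetical use inside a toArray() method:
#     return JArray(array_type_for(self._lst))(self._lst)
```

Mixed or non-numeric contents fall back to object, which is what the current patch always uses.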
Finally, I'd suggest renaming collections.py because there's already a module
of that name in the Python standard library:
http://docs.python.org/library/collections.html
Until this happens, you can use:
from lucene import collections
as the collections.py file gets installed in the lucene package.
Throwing Java exceptions from Python is done by raising JavaError with the
desired Java exception object (I added a few to the jcc call in PyLucene's
Makefile), for example:
raise JavaError, NoSuchElementException(str(index))
It's been like that for a very long time, I just forgot.
This is implemented by throwPythonError() in jcc's functions.cpp: if the
error is JavaError, then the Java exception instance used as argument to it
is raised to the JVM.
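To make the pattern concrete, here is a small sketch of it that runs without a JVM. The JavaError and NoSuchElementException classes below are stand-ins I define only for illustration (the real ones come from the lucene module), element_at is a hypothetical function, and reading the exception back via args[0] applies to the stand-in, not necessarily to the real JavaError.

```python
class JavaError(Exception):
    """Stand-in for lucene.JavaError."""


class NoSuchElementException(object):
    """Stand-in for the wrapped java.util.NoSuchElementException."""

    def __init__(self, message):
        self.message = message


def element_at(lst, index):
    if index < 0 or index >= len(lst):
        # Python 3 spelling of: raise JavaError, NoSuchElementException(...)
        raise JavaError(NoSuchElementException(str(index)))
    return lst[index]


try:
    element_at(['a', 'b'], 5)
except JavaError as e:
    # the Java exception object that was handed to JavaError
    java_exception = e.args[0]
```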
I attached the not-checked-in diffs as patches. The new Makefile is checked
into the pylucene-3.x branch.
More responses to your comments are inline below.
> Ok, I was unsure on how to properly throw a Java Exception in Python code -
> and couldn't find an example.
> Also I thought a Java Exception type should be exported in lucene - this is
> not the case however:
> lucene.NoSuchElementException
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> AttributeError: 'module' object has no attribute 'NoSuchElementException'
> I imagine I could
> - add the java.util.NoSuchElementException to the Makefile to get it
> generated by JCC and throw it via raise?
> - use lucene.JavaError and pass 'java.util.NoSuchElementException' name in
> the constructor?
Yes, you guessed it right, this is how it works as outlined above.
You had various bugs in next()/nextIndex(), previous()/previousIndex() that
I hopefully fixed. Also, listIterator() can't be overridden in Python, I
fixed it in PythonList and in collections.py.
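For reference, the cursor contract that next()/nextIndex() and previous()/previousIndex() have to satisfy can be sketched in plain Python, without any lucene classes. ListCursor and its method names are hypothetical and this is not the code in the patch; Python's IndexError and RuntimeError stand in for NoSuchElementException and IllegalStateException. The cursor adjustment in remove() follows the java.util.ListIterator description.

```python
class ListCursor(object):
    """Pure-Python sketch of the java.util.ListIterator cursor contract."""

    def __init__(self, lst, index=0):
        self.lst = lst
        self.index = index   # cursor sits between elements
        self.last = -1       # index of the element last returned, or -1

    def has_next(self):
        return self.index < len(self.lst)

    def has_previous(self):
        return self.index > 0

    def next(self):
        if not self.has_next():
            raise IndexError(self.index)      # NoSuchElementException in Java
        self.last = self.index
        self.index += 1
        return self.lst[self.last]

    def previous(self):
        if not self.has_previous():
            raise IndexError(self.index - 1)  # NoSuchElementException in Java
        self.index -= 1
        self.last = self.index
        return self.lst[self.last]

    def next_index(self):
        return self.index        # equals len(lst) at the end of the list

    def previous_index(self):
        return self.index - 1    # equals -1 at the start of the list

    def remove(self):
        if self.last < 0:
            raise RuntimeError("remove")      # IllegalStateException in Java
        del self.lst[self.last]
        # After next() the cursor is past the removed slot; pull it back.
        if self.last < self.index:
            self.index -= 1
        self.last = -1
```

Calling next() and then previous() returns the same element twice, and after a remove() that follows next() the cursor moves back by one so that the following next() does not skip an element.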
Andi..
Index: java/org/apache/pylucene/util/PythonListIterator.java
===================================================================
--- java/org/apache/pylucene/util/PythonListIterator.java (revision 0)
+++ java/org/apache/pylucene/util/PythonListIterator.java (revision 0)
@@ -0,0 +1,30 @@
+/* ====================================================================
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * ====================================================================
+ */
+
+package org.apache.pylucene.util;
+
+import java.util.ListIterator;
+
+public class PythonListIterator extends PythonIterator implements ListIterator {
+ public native boolean hasPrevious();
+ public native Object previous();
+
+ public native int nextIndex();
+ public native int previousIndex();
+
+ public native void set(Object obj);
+ public native void add(Object obj);
+ public native void remove();
+}
Property changes on: java/org/apache/pylucene/util/PythonListIterator.java
___________________________________________________________________
Added: svn:mime-type
+ text/plain
Added: svn:eol-style
+ native
Index: java/org/apache/pylucene/util/PythonSet.java
===================================================================
--- java/org/apache/pylucene/util/PythonSet.java (revision 1220345)
+++ java/org/apache/pylucene/util/PythonSet.java (working copy)
@@ -62,14 +62,6 @@
public Object[] toArray(Object[] a)
{
- Object[] array = toArray();
-
- if (a.length < array.length)
- a = (Object[]) Array.newInstance(a.getClass().getComponentType(),
- array.length);
-
- System.arraycopy(array, 0, a, 0, array.length);
-
- return a;
+ return toArray();
}
}
Index: java/org/apache/pylucene/util/PythonList.java
===================================================================
--- java/org/apache/pylucene/util/PythonList.java (revision 0)
+++ java/org/apache/pylucene/util/PythonList.java (revision 0)
@@ -0,0 +1,107 @@
+/* ====================================================================
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * ====================================================================
+ */
+
+package org.apache.pylucene.util;
+
+import java.util.List;
+import java.util.ListIterator;
+import java.util.Collection;
+import java.util.Iterator;
+import java.lang.reflect.Array;
+
+public class PythonList implements List {
+
+ private long pythonObject;
+
+ public PythonList()
+ {
+ }
+
+ public void pythonExtension(long pythonObject)
+ {
+ this.pythonObject = pythonObject;
+ }
+ public long pythonExtension()
+ {
+ return this.pythonObject;
+ }
+
+ public void finalize()
+ throws Throwable
+ {
+ pythonDecRef();
+ }
+
+ public native void pythonDecRef();
+
+ public native boolean add(Object obj);
+ public native void add(int index, Object obj);
+ public native boolean addAll(Collection c);
+ public native boolean addAll(int index, Collection c);
+ public native void clear();
+ public native boolean contains(Object obj);
+ public native boolean containsAll(Collection c);
+ public native boolean equals(Object obj);
+ public native Object get(int index);
+ // public native int hashCode();
+ public native int indexOf(Object obj);
+ public native boolean isEmpty();
+ public native Iterator iterator();
+ public native int lastIndexOf(Object obj);
+
+ public native ListIterator listIterator(int index);
+ public ListIterator listIterator()
+ {
+ return listIterator(0);
+ }
+
+ private native Object removeAt(int index);
+ public Object remove(int index)
+ throws IndexOutOfBoundsException
+ {
+ if (index < 0 || index >= this.size())
+ throw new IndexOutOfBoundsException();
+
+ return removeAt(index);
+ }
+
+ private native boolean removeObject(Object obj);
+ public boolean remove(Object obj)
+ {
+ return removeObject(obj);
+ }
+
+ public native boolean removeAll(Collection c);
+ public native boolean retainAll(Collection c);
+ public native Object set(int index, Object obj);
+ public native int size();
+
+ private native List subListChecked(int fromIndex, int toIndex);
+ public List subList(int fromIndex, int toIndex)
+ throws IndexOutOfBoundsException, IllegalArgumentException
+ {
+ if (fromIndex < 0 || toIndex > size() || fromIndex > toIndex)
+ throw new IndexOutOfBoundsException();
+
+ return subListChecked(fromIndex, toIndex);
+ }
+
+ public native Object[] toArray();
+
+ public Object[] toArray(Object[] a)
+ {
+ return toArray();
+ }
+}
Property changes on: java/org/apache/pylucene/util/PythonList.java
___________________________________________________________________
Added: svn:mime-type
+ text/plain
Added: svn:eol-style
+ native
Index: python/collections.py
===================================================================
--- python/collections.py (revision 1220345)
+++ python/collections.py (working copy)
@@ -10,7 +10,9 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-from lucene import PythonSet, PythonIterator, JavaError
+from lucene import JArray, \
+ PythonSet, PythonList, PythonIterator, PythonListIterator, JavaError, \
+ NoSuchElementException, IllegalStateException, IndexOutOfBoundsException
class JavaSet(PythonSet):
@@ -83,7 +85,7 @@
next = _self._iterator.next()
return next
return _iterator()
-
+
def remove(self, obj):
try:
self._set.remove(obj)
@@ -104,7 +106,7 @@
def retainAll(self, collection):
result = False
for obj in list(self._set):
- if obj not in c:
+ if obj not in collection:
self._set.remove(obj)
result = True
return result
@@ -113,5 +115,239 @@
return len(self._set)
def toArray(self):
- return list(self._set)
+ return JArray(object)(list(self._set))
+
+class JavaListIterator(PythonListIterator):
+ """
+ This class implements java.util.ListIterator for a Python list instance it
+ wraps. (simple bidirectional iterator)
+ """
+ def __init__(self, _lst, index=0):
+ super(JavaListIterator, self).__init__()
+ self._lst = _lst
+ self._lastIndex = -1 # keep state for remove/set
+ self.index = index
+
+ def next(self):
+ if self.index >= len(self._lst):
+ raise JavaError, NoSuchElementException(str(self.index))
+ result = self._lst[self.index]
+ self._lastIndex = self.index
+ self.index += 1
+ return result
+
+ def previous(self):
+ if self.index <= 0:
+ raise JavaError, NoSuchElementException(str(self.index - 1))
+ self.index -= 1
+ self._lastIndex = self.index
+ return self._lst[self.index]
+
+ def hasPrevious(self):
+ return self.index > 0
+
+ def hasNext(self):
+ return self.index < len(self._lst)
+
+ def nextIndex(self):
+ return min(self.index, len(self._lst))
+
+ def previousIndex(self):
+ return max(-1, self.index - 1)
+
+ def add(self, element):
+ """
+ Inserts the specified element into the list.
+ The element is inserted immediately before the next element
+ that would be returned by next, if any, and after the next
+ element that would be returned by previous, if any.
+ """
+ if self._lastIndex < 0:
+ raise JavaError, IllegalStateException("add")
+ self._lst.insert(self.index, element)
+ self.index += 1
+ self._lastIndex = -1 # invalidate state
+
+ def remove(self):
+ """
+ Removes from the list the last element that
+ was returned by next or previous.
+ """
+ if self._lastIndex < 0:
+ raise JavaError, IllegalStateException("remove")
+ del self._lst[self._lastIndex]
+ self._lastIndex = -1 # invalidate state
+
+ def set(self, element):
+ """
+ Replaces the last element returned by next or previous
+ with the specified element.
+ """
+ if self._lastIndex < 0:
+ raise JavaError, IllegalStateException("set")
+ self._lst[self._lastIndex] = element
+
+ def __iter__(self):
+ return self
+
+
+class JavaList(PythonList):
+ """
+ This class implements java.util.List around a Python list instance it wraps.
+ """
+
+ def __init__(self, _lst):
+ super(JavaList, self).__init__()
+ self._lst = _lst
+
+ def __contains__(self, obj):
+ return obj in self._lst
+
+ def __len__(self):
+ return len(self._lst)
+
+ def __iter__(self):
+ return iter(self._lst)
+
+ def add(self, index, obj):
+ self._lst.insert(index, obj)
+
+ def addAll(self, collection):
+ size = len(self._lst)
+ self._lst.extend(collection)
+ return len(self._lst) > size
+
+ def addAll(self, index, collection):
+ size = len(self._lst)
+ self._lst[index:index] = collection
+ return len(self._lst) > size
+
+ def clear(self):
+ del self._lst[:]
+
+ def contains(self, obj):
+ return obj in self._lst
+
+ def containsAll(self, collection):
+ for obj in collection:
+ if obj not in self._lst:
+ return False
+ return True
+
+ def equals(self, collection):
+ if type(self) is type(collection):
+ return self._lst == collection._lst
+ return False
+
+ def get(self, index):
+ if index < 0 or index >= self.size():
+ raise JavaError, IndexOutOfBoundsException(str(index))
+ return self._lst[index]
+
+ def indexOf(self, obj):
+ try:
+ return self._lst.index(obj)
+ except ValueError:
+ return -1
+
+ def isEmpty(self):
+ return len(self._lst) == 0
+
+ def iterator(self):
+ class _iterator(PythonIterator):
+ def __init__(_self):
+ super(_iterator, _self).__init__()
+ _self._iterator = iter(self._lst)
+ def hasNext(_self):
+ if hasattr(_self, '_next'):
+ return True
+ try:
+ _self._next = _self._iterator.next()
+ return True
+ except StopIteration:
+ return False
+ def next(_self):
+ if hasattr(_self, '_next'):
+ next = _self._next
+ del _self._next
+ else:
+ next = _self._iterator.next()
+ return next
+ return _iterator()
+
+ def lastIndexOf(self, obj):
+ i = len(self._lst)-1
+ while (i>=0):
+ if obj == self._lst[i]:
+ break
+ i -= 1
+ return i
+
+ def listIterator(self, index=0):
+ return JavaListIterator(self._lst, index)
+
+ def remove(self, obj_or_index):
+ if type(obj_or_index) is type(1):
+ return self.removeAt(int(obj_or_index))
+ return self.removeObject(obj_or_index)
+
+ def removeAt(self, pos):
+ """
+ Removes the element at the specified position in this list.
+ Note: private method called from Java via remove(int index)
+ index is already checked (or IndexOutOfBoundsException thrown)
+ """
+ try:
+ el = self._lst[pos]
+ del self._lst[pos]
+ return el
+ except IndexError:
+ # should not happen
+ return None
+
+ def removeObject(self, obj):
+ """
+ Removes the first occurrence of the specified object
+ from this list, if it is present
+ """
+ try:
+ self._lst.remove(obj)
+ return True
+ except ValueError:
+ return False
+
+ def removeAll(self, collection):
+ result = False
+ for obj in collection:
+ if self.removeObject(obj):
+ result = True
+ return result
+
+ def retainAll(self, collection):
+ result = False
+ for obj in list(self._lst):
+ if obj not in collection and self.removeObject(obj):
+ result = True
+ return result
+
+ def size(self):
+ return len(self._lst)
+
+ def toArray(self):
+ return JArray(object)(self._lst)
+
+ def subListChecked(self, fromIndex, toIndex):
+ """
+ Note: private method called from Java via subList()
+ from/to index are already checked (or IndexOutOfBoundsException thrown)
+ also IllegalArgumentException is thrown if the endpoint indices
+ are out of order (fromIndex > toIndex)
+ """
+ sublst = self._lst[fromIndex:toIndex]
+ return JavaList(sublst)
+
+ def set(self, index, obj):
+ if index < 0 or index >= self.size():
+ raise JavaError, IndexOutOfBoundsException(str(index))
+ self._lst[index] = obj
import sys, os
import lucene
from lucene.collections import JavaSet, JavaList

def testStringSet():
    s = set(['a', 'b', 'c'])
    print "\nSet of Strings: ", s
    # create python wrapper for Java Set
    # NOTE: this Python class extends/implements the Java class lucene.PythonSet
    ps = JavaSet(s)
    print "created:", ps, type(ps)
    size = ps.size()
    print "size: " , size
    assert(size == len(s)), "size"
    has = ps.contains('b')
    print "contains('b'):", has
    assert(has is True), "contains"
    # create HashSet in JVM
    js = lucene.HashSet(ps)
    print "created:", js, type(js)
    assert(size == js.size()), "size"
    assert(js.contains('b') is True), "contains"
    ar = js.toArray()
    print "toArray:", ar
    # create ArrayList in JVM
    jl = lucene.ArrayList(ps)
    print "created:", jl, type(jl)
    assert(size == jl.size()), "size"
    sl = jl.subList(1, 3)
    print "sublist:", sl

def testStringList():
    l = ['a', 'b', 'c']
    print "\nList of Strings:", l
    # create python wrapper for Java List
    # NOTE: this Python class extends/implements the Java class lucene.PythonList
    pl = JavaList(l)
    print "created:", pl, type(pl)
    size = pl.size()
    print "size:", size
    assert(size == len(l)), "size"
    pos = pl.indexOf('b')
    print "indexOf('b'):", pos
    assert(pos == 1), "indexOf"
    # create HashSet in JVM
    js = lucene.HashSet(pl)
    print "created:", js, type(js)
    assert(size == js.size()), "size"
    assert(js.contains('b') is True), "contains"
    ar = js.toArray()
    print "toArray:", ar
    # create ArrayList in JVM
    jl = lucene.ArrayList(pl)
    print "created:", jl, type(jl)
    assert(size == jl.size()), "size"
    assert(pos == jl.indexOf('b')), "indexOf"
    sl = jl.subList(1, 3)
    print "sublist:", sl

def testIntSet():
    s = set(range(10))
    print "\nSet of Integers:", s
    ps = JavaSet(s)
    print "created:", ps
    print "size:", ps.size()
    print "contains(2):", ps.contains(2)
    # create ArrayList in JVM
    # TODO: this results in lucene.JavaError
    jl = lucene.ArrayList(ps)
    print "created:", jl

def testIntList():
    x = range(10)
    print "\nList of Integers:", x
    pl = JavaList(x)
    print "created:", pl
    print "size: " , pl.size()
    print "indexOf(2):", pl.indexOf(2)
    # create ArrayList in JVM
    # TODO: this results in lucene.JavaError
    jl = lucene.ArrayList(pl)
    print "created:", jl

if __name__ == '__main__':
    lucene.initVM()
    testStringSet()
    testStringList()
    testIntSet()
    testIntList()