Re: Install NumPy in python 2.6

2009-04-22 Thread Ole Streicher
Hi David,

David Cournapeau  writes:
> On Fri, Mar 13, 2009 at 8:20 PM, gopal mishra  wrote:
>> error: Setup script exited with error: None
> Numpy 1.3.0 (to be released 1st April 2009) will contain everything to
> be buildable and usable with python 2.6 on windows. If you are in a
> hurry, you can install numpy from svn (I regularly test on windows
> lately, and I can confirm it does work).

I tried to install numpy and scipy on my Linux machine (64 bit,
openSUSE 11, Python 2.6).

Numpy installed with warnings like "Lapack (...) libraries not found",
but scipy then fails:

error:
Lapack (http://www.netlib.org/lapack/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [lapack]) or by setting
the LAPACK environment variable.
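
For reference, I guess such a section would look like this (the paths
are just an example):

[lapack]
library_dirs = /usr/lib64
lapack_libs = lapack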

Unfortunately, www.netlib.org is not reachable, so I cannot try to
install lapack.

What is the reason for that?

I tried numpy 1.3.0 and scipy 0.7.0, with easy_install.

Best regards

Ole

--
http://mail.python.org/mailman/listinfo/python-list


Re: Install NumPy in python 2.6

2009-04-22 Thread Ole Streicher
Hi Eduardo,

Eduardo Lenz  writes:
> On Wednesday 22 April 2009 04:47:54 David Cournapeau wrote:
>> On Wed, Apr 22, 2009 at 6:38 PM, Ole Streicher  
> wrote:
>> > but scipy then fails:
>> > error: Lapack (http://www.netlib.org/lapack/) libraries not found.
>> > What is the reason for that?

> try ATLAS instead.

I did:
$ easy_install --prefix=/work/python/ atlas-0.27.0.tar.gz
$ ls -l /work/python/lib/python2.6/site-packages/atlas-0.27.0-py2.6.egg
-rw-r--r-- 1 os gr 98386 2009-04-22 14:05 
/work/python/lib/python2.6/site-packages/atlas-0.27.0-py2.6.egg

For some reason, this is not a directory (like numpy or matplotlib
have), but I guess that is OK?

Anyway: it did not help. The error stays the same. What could be the
problem?


  "easy install" is something different than python's easy_install.


Best regards.

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Large data arrays?

2009-04-23 Thread Ole Streicher
Hi,

for my application, I need to use quite large data arrays
(100.000 x 4000 values) with floating point numbers where I need fast
row-wise and column-wise access (main case: return a column with the sum
over a number of selected rows, and vice versa).

I would use numpy arrays for that, but they seem to be
memory-resident. So, one of these arrays would use about 1.6 GB of
memory, which is far too much. So I was thinking about a memory mapped
file for that. As far as I understand, there is one in numpy.
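
Something like this, if I understand the docs right (a sketch,
assuming float32 data):

import numpy as np

# disk-backed array; only the accessed pages need to be in RAM
data = np.memmap('data.dat', dtype=np.float32, mode='w+',
                 shape=(100000, 4000))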

For this, I have two questions: 

1. Are the "numpy.memmap" array unlimited in size (resp. only limited
by the maximal file size)? And are they part of the system's memory
limit (~3GB for 32bit systems)?

2. Since I need row-wise as well as column-wise access, a simple usage
of a big array as a memory mapped file will probably lead to very poor
performance, since one of the two access patterns would need to read
values scattered around the whole file. Are there any "plug and play"
solutions for that? If not: what would be the best way to solve this
problem? Probably one needs to use something like the "Morton layout"
for the data. Would one then build a subclass of memmap (or ndarray?)
that implements this specific layout? How would one do that? (Sorry, I
am still a beginner with respect to python.)

Best regards

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Re: Large data arrays?

2009-04-23 Thread Ole Streicher
Hi Nick,

Nick Craig-Wood  writes:
> mmaps come out of your applications memory space, so out of that 3 GB
> limit.  You don't need that much RAM of course but it does use up
> address space.

Hmm. So I have no chance to use >= 2 of these arrays simultaneously?

> Sorry don't know very much about numpy, but it occurs to me that you
> could have two copies of your mmapped array, one the transpose of the
> other which would then speed up the two access patterns enormously.

That would be a solution, but it takes twice the amount of address
space (which already seems to be the limiting factor). In my case (1.6
GB per array), I could not even use one array.

Also, I would need to fill two large files at program start: one for
each orientation (row-wise or column-wise). Depending on the input
data (which are also either row-wise or column-wise), the filling of
the array with the opposite orientation would take a lot of time
because of these inefficiencies.

For that reason, using both directions would probably not be a good
solution. What I found is the "Morton layout", which uses a kind of
fractal interleaving and does not sound that complicated. But I have
no idea how to turn it into a "numpy" style: can I just extend
numpy.ndarray (or numpy.memmap), and which functions/methods would
then need to be overwritten? The best would be of course if someone
had already done this before, so that I could use it without falling
into all the pitfalls that occur when one implements a very generic
algorithm.

Best regards

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Re: Large data arrays?

2009-04-24 Thread Ole Streicher
Hi John,

John Machin  writes:
> The Morton layout wastes space if the matrix is not square. Your 100K
> x 4K is very non-square. Looks like you might want to use e.g. 25
> Morton arrays, each 4K x 4K.

What I found was that the Morton layout should be usable if the shape
is rectangular and both dimensions are powers of two. But all examples
were done with equal dimensions, so I am a bit confused here.

From my access pattern, it would probably be better to combine 25 rows
into one slice and have one matrix where every cell contains 25 rows.

Are there any objections to that?

Best regards 

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Re: Large data arrays?

2009-04-24 Thread Ole Streicher
Hi Nick,

Nick Craig-Wood  writes:
> I'd start by writing a function which took (x, y) in array
> co-ordinates and transformed that into (z) remapped in the Morton
> layout.
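
Something like this, I guess (a sketch, assuming power-of-two
dimensions):

def morton_index(x, y, bits=16):
    # interleave the bits: bit i of x -> bit 2*i, bit i of y -> bit 2*i+1
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z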

This removes the possibility to use numpy's sum() and similar
methods. Implementing them myself is probably much worse than using
numpy's own.

> Alternatively you could install a 64bit OS on your machine and use
> my scheme!

Well: I am just the developer. Of course I could simply raise the
requirements for using my software, but I think it is good style to
keep them as low as possible.

Best regards

Ole

--
http://mail.python.org/mailman/listinfo/python-list


Superclass initialization

2009-04-24 Thread Ole Streicher
Hi again,

I am trying to initialize a class inherited from numpy.ndarray:

from numpy import ndarray

class da(ndarray):
    def __init__(self, mydata):
        ndarray.__init__(self, 0)
        self.mydata = mydata

When I now call the constructor of da:
da(range(100))

I get the message:

ValueError: sequence too large; must be smaller than 32

which I do not understand. This message is generated by the
constructor of ndarray, but my call to ndarray.__init__() has only "0"
as argument, and calling "ndarray(0)" directly works perfectly.

In the manual I found that the constructor of a superclass is not
called implicitly, so there should be no call to ndarray.__init__()
other than the one in my __init__ method.

I am now confused about where the call to ndarray comes from. How do I
correct that?

Best regards

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Re: Superclass initialization

2009-04-24 Thread Ole Streicher
Steven D'Aprano  writes:
> Perhaps you should post the full trace back instead of just the final 
> line.

No problem, although I don't see much additional information there:

In [318]: class da(ndarray):
   ...:     def __init__(self, mydata):
   ...:         ndarray.__init__(self, 0)
   ...:         self.mydata = mydata
   ...:

In [319]: da(range(100))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/m3d/src/python/<ipython console> in <module>()

ValueError: sequence too large; must be smaller than 32

The same happens if I put the class definition into a file: the
traceback does *not* point to a code line in that source file but to
the input line. Again, full trace:

In [320]: import da

In [321]: da.da(range(100))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/m3d/src/python/<ipython console> in <module>()

ValueError: sequence too large; must be smaller than 32

(using python instead of ipython also does not give more details).

Best regards

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Re: Superclass initialization

2009-04-24 Thread Ole Streicher
Arnaud Delobelle  writes:
> numpy.ndarray has a __new__ method (and no __init__).  I guess this is
> the one you should override.  Try:
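
I guess you mean the pattern from the numpy subclassing docs,
something like (my reconstruction, untested):

import numpy as np

class da(np.ndarray):
    def __new__(cls, mydata):
        # build the array from the data and re-view it as our subclass
        obj = np.asarray(mydata).view(cls)
        obj.mydata = mydata
        return obj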

But what is the difference between __new__ and __init__ here?

best regards

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Re: Large data arrays?

2009-04-24 Thread Ole Streicher
Hi John,

John Machin  writes:
>> From my access pattern, it would probably be better to combine 25 rows
>> into one slice and have one matrix where every cell contains 25 rows.
>> Are there any objections about that?
> Can't object, because I'm not sure what you mean ... how many elements
> in a "cell"?

Well, a matrix consists of "cells"? A 10x10 matrix has 100 "cells".

Regards 

Ole
--
http://mail.python.org/mailman/listinfo/python-list


Re: Large data arrays?

2009-04-24 Thread Ole Streicher
Hi John,

John Machin  writes:
> On Apr 25, 1:14 am, Ole Streicher  wrote:
>> John Machin  writes:
>> >> From my access pattern, it would probably be better to combine 25 rows
>> >> into one slice and have one matrix where every cell contains 25 rows.
>> >> Are there any objections about that?
>> > Can't object, because I'm not sure what you mean ... how many elements
>> > in a "cell"?
>>
>> Well, a matrix consists of "cells"? A 10x10 matrix has 100 "cells".
>
> Yes yes but you said "every cell contains 25 rows" ... what's in a
> cell? 25 rows, with each row containing what?

I mean: original cells.
I have 100.000 x 4096 entries:

(0,0) (0,1) ... (0,4095)
(1,0) (1,1) ... (1,4095)
...
(99.999,0) (99.999,1) ... (99.999,4095)

This will be re-organized into a new matrix, containing 4096 columns
(as before) and 4000 rows. The leftmost cell (first row, first col) in
the new matrix then contains the array

(0,0)
(1,0)
...
(24,0)

The second column of the first row contains the array

(0,1)
(1,1)
...
(24,1)

and so on. The first column of the second row contains

(25,0)
...
(49,0)

That way, I get a new matrix where every cell contains an array of 25
"original" cells. The disadvantage (which I see now as I write it
down) is that this is bad for numpy, since it then deals with arrays
instead of numbers at the matrix positions.
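
In numpy terms, this blocking would just be a reshape, I think -- a
small sketch with toy dimensions (4 blocks of 25 rows, 8 columns):

import numpy as np

a = np.arange(100 * 8, dtype=np.float32).reshape(100, 8)
blocked = a.reshape(4, 25, 8)   # (block, row within block, column)
col_sums = blocked.sum(axis=1)  # per-block column sums, shape (4, 8)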

Best regards

Ole
--
http://mail.python.org/mailman/listinfo/python-list


epydoc xml output?

2009-09-28 Thread Ole Streicher
Hi,

I am using epydoc for my code documentation, and I am curious whether
there exists a possibility to produce the output in XML format.

The reason is that I want to convert it to WordML and get it into
our private documentation system.

Unfortunately, the documentation does not mention XML, but says that
epydoc is modular. In the epydoc.docwriter.html documentation, an
example

  $book.title$
  $book.count_pages()$
...

is mentioned that looks like XML -- but I could not get more
information about that.

So, what is the best way to get some structured XML from a
reStructuredText/epydoc formatted API documentation?

Best regards

Ole

-- 
http://mail.python.org/mailman/listinfo/python-list


weak reference to bound method

2009-10-02 Thread Ole Streicher
Hi group,

I am trying to use a weak reference to a bound method:

class MyClass(object):
    def myfunc(self):
        pass

o = MyClass()
print o.myfunc
   <bound method MyClass.myfunc of <__main__.MyClass object at 0x...>>

import weakref
r = weakref.ref(o.myfunc)
print r()
   None

This is what I do not understand. The object "o" is still alive, and
therefore the bound method "o.myfunc" should exist.

Why does the weak reference claim that it is removed? And how can I hold
the reference to the method until the object is removed?

Is this a bug or a feature? (Python 2.6)

Best regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weak reference to bound method

2009-10-02 Thread Ole Streicher
Hi Thomas,

Thomas Lehmann  writes:
>> r = weakref.ref(o.myfunc)
>> print r()
>>    None
> k = o.myfunc
> r = weakref.ref(k)
> print r()
>    <bound method MyClass.myfunc of <__main__.MyClass object at 0x...>>

> Don't ask me why! I have just been interested for what you are trying...

This is clear: in your case, o.myfunc is explicitly referenced by k,
which prevents the garbage collection.

My problem is that I have a class that delegates a function call, like:

8<--
import weakref

class WeakDelegator(object):
    def __init__(self, func):
        self._func = weakref.ref(func)

    def __call__(self):
        func = self._func()
        return func() if func else None
8<--

This does not work for bound methods because a weak reference to a
bound method always returns None, even if the object still exists.

Why is that the case and how can I implement such a class properly?

Best regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weak reference to bound method

2009-10-02 Thread Ole Streicher
Hello Peter,

Peter Otten <__pete...@web.de> writes:
> Is there an actual use case?

I discussed this in the German newsgroup. Here is the use in my class:
-8<---
import threading
import weakref

class DoAsync(threading.Thread):
    def __init__(self, func):
        threading.Thread.__init__(self)
        self.setDaemon(True)
        self._cond = threading.Condition()
        self.scheduled = False
        self._func = weakref.ref(func, self._cleanup)
        self.start()

    def run(self):
        while self._func():
            with self._cond:
                while not self.scheduled and self._func():
                    self._cond.wait()
                self.scheduled = False
            func = self._func()
            if func:
                func()

    def __call__(self):
        with self._cond:
            self.scheduled = True
            self._cond.notify()

    def _cleanup(self, ref):
        self()
-8<---

This callable class wraps a function; whenever the DoAsync object is
called, it triggers a call to the stored function.

Other classes use it like:

-8<---
class MyClass:
    def __init__(self):
        ...
        self.update = DoAsync(self._do_update)

    def _do_update(self):
        do_something_that_takes_long_and_shall_be_done_after_an_update()

Since DoAsync starts its own thread, I get a classic deadlock
situation: DoAsync needs a reference to the method to be called, and as
long as the thread is running, the MyClass object (which contains the
method) cannot be cleaned up. This would be a classic case for a weak
reference, if Python did not create the bound method anew at call time.

> No. o.myfunc is a different object, a bound method, and every time you 
> access o's myfunc attribute a new bound method is created:
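
Indeed, I just checked with the MyClass example from my first post:

print o.myfunc is o.myfunc
   False
# each attribute access creates a new bound method object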

What is the reason for that behaviour? It looks quite silly to me.

And how can I get a reference to a bound method that lives as long as
the method itself?

Regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weak reference to bound method

2009-10-02 Thread Ole Streicher
Hi Miles,

Miles Kaufmann  writes:
> You could also create a wrapper object that holds a weak reference to  the
> instance and creates a bound method on demand:
> class WeakMethod(object):
>     def __init__(self, bound_method):
>         self.im_func = bound_method.im_func
>         self.im_self = weakref.ref(bound_method.im_self)
>         self.im_class = bound_method.im_class
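
I suppose the missing call part would then look something like this
(my guess, untested):

    def __call__(self, *args, **kwargs):
        obj = self.im_self()
        if obj is None:
            return None  # the instance has been garbage collected
        return self.im_func(obj, *args, **kwargs)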

In this case, I can use it only for bound methods, so I would need to
handle the case of unbound methods separately.

Is there a way to find out whether a function is bound? Or do I have to
check hasattr(f, 'im_func'), hasattr(f, 'im_self') and
hasattr(f, 'im_class')?

Best regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weak reference to bound method

2009-10-02 Thread Ole Streicher
Hello Peter,

Peter Otten <__pete...@web.de> writes:
>> What I want is to have a universal class that "always" works: with
>> unbound functions, with bound function, with lambda expressions, with
>> locally defined functions,

> That's left as an exercise to the reader ;)

Do you have the feeling that there exists any reader who is able to
solve this exercise? :-)

I am a bit surprised that such a seemingly simple problem is virtually
unsolvable in Python. Do you think that my concept of having a DoAsync
class is wrong?

Best regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weak reference to bound method

2009-10-02 Thread Ole Streicher
Hi Peter,

Peter Otten <__pete...@web.de> writes:
> Ole Streicher wrote:
>> Peter Otten <__pete...@web.de> writes:
>>>> What I want is to have a universal class that "always" works: with
>>>> unbound functions, with bound function, with lambda expressions, with
>>>> locally defined functions,
>>> That's left as an exercise to the reader ;)
>> Do you have the feeling that there exists any reader that is able to
>> solve this exercise? :-)
> I was thinking of you.

I could imagine that. However, I am just a beginner in Python, and I
don't know which types of "callables" exist in Python and which
"smart" ideas (like re-creating them at every call) additionally show
up when implementing such a beast. For example, for locally defined
functions, I still have no idea at all how to keep them away from the
garbage collector.

>> I am a bit surprised that already such a simple problem is virtually
>> unsolvable in python. Do you think that my concept of having a DoAsync
>> class is wrong?
> I don't understand the example you give in the other post. 

Hmm. I am programming a GUI client application. The client will receive
some input data (via network, and via user input) and shall be updated
according to these data.

Unfortunately, the input data (and of course the user input) do not come
regularly; there may be times when the data come too fast to process all
of them.

Imagine, for example, that I want to provide a 2d-gaussian fit to some
region of an image and display the result in a separate window, updated
whenever the mouse is moved.

The fit takes (let's say) some seconds, so I cannot just call the fit
routine within the mouse move event (this would block other GUI
operations, e.g. the display of the mouse coordinates). So I need
just to trigger the fit routine on mouse movement, and to check
afterwards whether the mouse position is still current.

This is the reason for the DoAsync class: when it is called, it shall
trigger the function that was given in the constructor, i.e.:

class MyClass:
    def __init__(self):
        self.update_fit = DoAsync(self.update_the_fit)

    def mouse_move(self, event):
        self.set_coordinates(event.x, event.y)
        self.update_fit()  # trigger the update_the_fit() call
        ...

Thus, the mouse_move() method is fast, even if the update_the_fit()
method takes some time to run.

I want to implement it so that DoAsync will be automatically garbage
collected whenever the MyClass object is deleted. Since DoAsync starts
its own thread (which shall only finish when the MyClass object is
deleted), a reference to MyClass (or one of its methods) would keep
the MyClass object from garbage collection.

> If you are trying to use reference counting as a means of inter-thread 
> communication, then yes, I think that's a bad idea.

No; my problem is:

- a thread started in DoAsync will keep the DoAsync object from
  garbage collection
- a reference to a MyClass related object (the bound method) in DoAsync
  will thus also prevent the MyClass object from garbage collection
- even if I don't use the MyClass object anymore, and nobody else uses
  the DoAsync object, both stay in memory forever, and the thread also
  never finishes.

Did you get the problem?

Best regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weak reference to bound method

2009-10-02 Thread Ole Streicher
Hi Peter,

Peter Otten <__pete...@web.de> writes:
> class Method(object):
>     def __init__(self, obj, func=None):
>         if func is None:
>             func = obj.im_func
>             obj = obj.im_self

This requires that func is a bound method. What I want is to have a
universal class that "always" works: with unbound functions, with
bound functions, with lambda expressions, with locally defined
functions, ...

For a user of my class, there is no visible reason why some of them
should work while others don't.

Best regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weak reference to bound method

2009-10-02 Thread Ole Streicher
Hi Peter,

Peter Otten <__pete...@web.de> writes:
>> I am a bit surprised that such a seemingly simple problem is virtually
>> unsolvable in python.

> Btw, have you implemented such a design in another language?

No. 

> I think I'd go for a simpler approach, manage the lifetime of MyClass 
> instances manually and add a MyClass.close() method that sets a flag which 
> in turn is periodically read by DoAsync() and will eventually make it stop.

That has the disadvantage that I have to rely on the user. We already
have a garbage collector, so why not use it? It is made exactly for
what I want: deleting unused objects.

The solution I have now is still untested, but maybe the separation of
the thread will work:
--<8
import threading
import weakref

class AsyncThread(threading.Thread):
    def __init__(self, func):
        threading.Thread.__init__(self)
        self.setDaemon(True)
        self._cond = threading.Condition()
        self.scheduled = False
        # weak reference; the callback shall wake the thread when func dies
        self._func = weakref.ref(func, self.run)
        self.start()

    def run(self, ref=None):
        # ref is only set when called as weakref callback
        while self._func():
            with self._cond:
                while not self.scheduled and self._func():
                    self._cond.wait()
                self.scheduled = False
            func = self._func()
            if func:
                func()

    def __call__(self):
        with self._cond:
            self.scheduled = True
            self._cond.notify()

class DoAsync(object):
    def __init__(self, func):
        self._func = func
        self._thread = AsyncThread(self._func)

    def __call__(self):
        self._thread()
--<8

The AsyncThread now has no reference that could prevent the referenced
object from garbage collection. And the function is always stored in
the DoAsync object and will only be collected if this is removed
(which is the case when the parent object is deleted).

Regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threaded GUI slowing method execution?

2009-10-02 Thread Ole Streicher
sturlamolden  writes:
> On 2 Okt, 13:29, Dave Angel  wrote:
> If you are worried about speed, chances are you are not using Python
> anyway.

I *do* worry about speed. And I use Python. Why not? There are powerful
libraries available.

> If you still have "need for speed" on a multicore, you can use Cython
> and release the GIL when appropriate. Then launch multiple Python
> threads and be happy.

Usually this is not an option: numpy is AFAIK not available for Cython,
and neither is scipy (of course).

Especially for numeric calculations, speed *matters*.

> Using more than one process is always an option, i.e. os.fork if you
> have it or multiprocessing if you don't. Processes don't share GIL.

Not if the threads/processes need to share lots of data. Interprocess
communication can be very expensive -- even more so if one needs to
share Python objects. Also, sharing Python objects between processes
seems to me not well supported, at least by the standard Python libs.

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list


() vs. [] operator

2009-10-15 Thread Ole Streicher
Hi,

I am curious when one should implement a "__call__()" and when a
"__getitem__()" method. 

For example, I want to display functions and data in the same plot. For
a function, the natural interface would be to call it as "f(x)", while
the natural interface for data would be "f[x]". On the other hand,
whether a certain object is a function or a data table is just an
inner detail of the object (imagine e.g. a complex function that
contains a data table as cache), and there is no reason to distinguish
them by interface.
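
For illustration, such an object might look like this (just a sketch):

class CachedFunction(object):
    def __init__(self, func):
        self._func = func
        self._cache = {}

    def __call__(self, x):
        # function-style access: f(x)
        if x not in self._cache:
            self._cache[x] = self._func(x)
        return self._cache[x]

    def __getitem__(self, x):
        # table-style access: f[x]
        return self(x)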

So what is the reason that Python has separate __call__()/() and
__getitem__()/[] interfaces and what is the rule to choose between them?

Regards

Ole
-- 
http://mail.python.org/mailman/listinfo/python-list