On 28 Mar 2017, at 6:11, INADA Naoki wrote:
I managed to install pyopencl and run the script. It takes more than
2 hours and uses only 7 GB of RAM.
Maybe a faster backend for OpenCL is required?
I used Microsoft Azure Compute, Standard_A4m_v2 (4 cores, 32 GB
memory) instance.
I suppose the computing power of the Azure instance might not be
sufficient, so it takes much longer to reach the phase where the memory
requirements increase? Do you have access to the output that was produced?
By the way, this has nothing to do with OpenCL. OpenCL isn't used by the
log_reduction.py script at all. It is listed in the dependencies because
some other things use it.
An easier way to reproduce this is needed...
Yes, I agree, but it's not super easy (none of the smaller existing
examples exhibit the problem so far). I'll see what I can do.
My best idea about what's going on at the moment is that memory
fragmentation is worse in Python 3.6 for some reason. The virtual
memory size indicates that a large address space is acquired, but the
resident memory size is smaller, indicating that not all of that
address space is actually used. In fact, the code might be especially
prone to fragmentation because it takes a lot of small NumPy arrays and
concatenates them into larger arrays. But I'm still surprised that this
is only a problem with Python 3.6 (if this hypothesis is correct).
Jan
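
A minimal sketch of the allocation pattern described above; the sizes
here are invented for illustration and much smaller than the real
workload in log_reduction.py:

import numpy as np

# Build many small arrays, then concatenate them into one large array.
# Each small buffer is a separate heap allocation; the concatenated
# result then needs one fresh contiguous block, so once the small
# arrays are freed the address space can be left full of holes.
chunks = [np.arange(1000, dtype=np.float64) for _ in range(10000)]
combined = np.concatenate(chunks)
del chunks  # frees the small buffers, but not necessarily back to the OS
print("%.1f MiB in the combined array" % (combined.nbytes / 2**20))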
Generally speaking, a gap between VMM and RSS doesn't mean
fragmentation. If the ratio of RSS to the total allocated memory is
bigger than 1.5, it may be fragmentation.
And a large VMM won't cause swapping; only RSS is meaningful.
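
A rough sketch of that ratio check, assuming psutil is available and
using tracemalloc as an approximation of the total allocated memory
(tracemalloc only tracks allocations made through Python's allocator,
so C-level buffers are undercounted):

import tracemalloc
import psutil

tracemalloc.start()

# ... run the suspect workload here ...

allocated, _peak = tracemalloc.get_traced_memory()  # bytes Python knows about
rss = psutil.Process().memory_info().rss            # resident set size, bytes

# Heuristic from above: RSS much larger than the live allocations
# suggests fragmentation, because freed-but-unreturned holes still
# count towards RSS.
if allocated and rss / allocated > 1.5:
    print("possible fragmentation, RSS/allocated = %.2f" % (rss / allocated))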
I suppose you are right that one cannot deduce fragmentation from the
VMM and RSS numbers. But I think RSS might not be meaningful in this
case either. My understanding from [the Wikipedia description] is that
it doesn't account for the parts of memory that have been written out
to swap; in other words, RSS will never exceed the size of the physical
RAM. VSS is also only partially useful, because it just gives the size
of the address space, not all of which might actually be used.
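
On Linux, the swapped-out portion can be read directly from
/proc/self/status, which reports VmSwap alongside VmSize and VmRSS; a
minimal sketch:

# Linux-only: /proc/<pid>/status reports sizes in kB, including the
# swapped-out portion (VmSwap) that RSS leaves out.
def memory_status(fields=("VmSize", "VmRSS", "VmSwap")):
    stats = {}
    with open("/proc/self/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in fields:
                stats[key] = value.strip()  # e.g. "123456 kB"
    return stats

print(memory_status())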
Anyway, I'm getting a swap usage of about 30 GB with Python 3.6, and
zsh's time reports 2339977 page faults from disk vs. 107 for Python 3.5.
I have some code to measure the unique set size (USS) and will see what
numbers I get with that.
Jan
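
For reference, psutil (4.0 or later) can report USS and per-process
swap usage via memory_full_info(); a minimal sketch, assuming Linux for
the swap field:

import psutil

info = psutil.Process().memory_full_info()  # parses smaps on Linux, so slow

# USS is the memory unique to this process, i.e. what would be returned
# to the system if the process exited right now; the same call also
# reports how much of the process has been written out to swap.
print("USS:  %.1f MiB" % (info.uss / 2**20))
print("swap: %.1f MiB" % (info.swap / 2**20))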
--
https://mail.python.org/mailman/listinfo/python-list