Re: array.array()'s memory shared with multiprocessing.Process()

2017-09-10 Thread gerlando.falauto
> 
> I suspect it's down to timing.
> 
> What you're putting into the queue is a reference to the array, and it's 
> only some time later that the array itself is pickled and then sent (the 
> work being done in the 'background').
> 
> Modifying the array before (or while) it's actually being sent would 
> explain the problem you're seeing.

That would also have been my guess. However, according to the documentation:

> When an object is put on a queue, the object is pickled and a background 
> thread later flushes the pickled data to an underlying pipe.

As I understand it, this means the object is pickled *before* the
background thread takes care of flushing the data to the pipe. Is that a
mistake in the documentation, then?

Any suggestion for a way to work around this limitation?
Or perhaps a different approach altogether I could use to reduce CPU load?
What the main thread actually does is dequeue data from a high-speed
USB-to-serial adapter (2,000,000 bps); that's why I came up with the
array.array() solution for storing the collected data, hoping for the
smallest possible overhead.
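
For what it's worth, here's a minimal sketch of the workaround I have in
mind: take a snapshot of the buffer at put() time, so that later in-place
writes can't race with the queue's feeder thread (the producer/consumer
structure below is made up for illustration):

import array
import multiprocessing as mp

def producer(q):
    buf = array.array('B', range(10))
    # bytes(buf) copies the data immediately, so mutating buf afterwards
    # can't race with the feeder thread that pickles and flushes the
    # queued object in the background.
    q.put(bytes(buf))
    for i in range(len(buf)):
        buf[i] = 0  # safe: the consumer only ever sees the snapshot

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=producer, args=(q,))
    p.start()
    print(q.get())  # the original contents, regardless of timing
    p.join()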
Thanks!


Re: array.array()'s memory shared with multiprocessing.Process()

2017-09-12 Thread gerlando.falauto
On Monday, September 11, 2017 at 12:19:27 PM UTC+2, Thomas Jollans wrote:
> On 2017-09-10 23:05, iurly wrote:
> > As far as I'm concerned, I'm probably better off using double buffers
> > to avoid this kind of issue.
> > Thanks a lot for your help!
> > 
> 
> 
> That should work. Some other things to consider, if both processes are
> on the same machine, are a series of memory-mapped files to pass the
> data without pickling, or, if your code is only going to run on Unix,
> something involving shared memory through multiprocessing.Array.

Oh, I see, multiprocessing.Array() sounds like a pretty good idea, thanks!
It's simple enough and actually already compatible with what I'm doing.
That would involve double buffers, which I'd have to use as the first
optimization step anyway.

Note, however, that I'd have to create those Arrays dynamically in the
producer thread. Would I then be able to pass them to the consumer by
putting a reference in a queue? I wouldn't want them to be pickled at all
in that case, of course.
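
Something like this sketch is what I'm picturing, assuming a fixed pool of
buffers allocated before the child starts (my understanding is that shared
ctypes objects are meant to be inherited rather than pickled through a
queue), with all names invented for illustration:

import multiprocessing as mp

NBUF, BUFSIZE = 2, 4096  # hypothetical double-buffer pool

def consumer(buffers, q):
    while True:
        idx = q.get()  # only a small integer index travels the queue
        if idx is None:
            break
        with buffers[idx].get_lock():
            data = bytes(buffers[idx].get_obj())  # copy out of shared memory
        print("buffer", idx, "first byte:", data[0])

if __name__ == '__main__':
    # Allocate the shared buffers *before* starting the child so that it
    # inherits them; the buffers themselves never go through the queue.
    buffers = [mp.Array('B', BUFSIZE) for _ in range(NBUF)]
    q = mp.Queue()
    p = mp.Process(target=consumer, args=(buffers, q))
    p.start()
    for i in range(4):
        idx = i % NBUF
        with buffers[idx].get_lock():
            buffers[idx][0] = i  # producer fills the shared buffer in place
        q.put(idx)
    q.put(None)
    p.join()

A real version would also need the consumer to hand each index back (say,
on a second queue) so the producer knows when a buffer is free for reuse.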

Also, why do you say it would only work on Unix? I couldn't find any
reference to such a limitation in the documentation.

Thank you so much again!


logging from time critical tasks -- QueueListener().stop() takes the whole CPU

2018-07-14 Thread Gerlando Falauto
Hi,

I'm adding logging to a time-critical task running on resource-constrained
hardware (a Raspberry Pi).

I read about the QueueListener/QueueHandler in:
https://docs.python.org/3/howto/logging-cookbook.html#dealing-with-handlers-that-block

and I'm trying to understand how it really works, what's happening under
the hood, and what the impact would be.
So I wrote a test program (included at the bottom) that logs 50 times and
then waits for the listener to terminate. I ran it on a PC:

$ python3 logtest.py 0
PID= 5974
[  0.000] Logging like crazy 50 times with que = queue.Queue
[ 18.038] Done logging, waiting for completion
[ 37.824] Test finished
---

Here's the output while it's logging (within the first 18 seconds):

$ ps um -L 5974
USER    PID  LWP %CPU NLWP %MEM    VSZ    RSS TTY   STAT START TIME COMMAND
iurly  5974    - 96.3    2  2.1 264232 170892 pts/2 -    15:30 0:07 python3 logtest.py 0
iurly     - 5974 72.8    -    -      -      - -     Rl+  15:30 0:05 -
iurly     - 5975 23.3    -    -      -      - -     Sl+  15:30 0:01 -

So the main thread is taking most of the CPU while the background thread
is writing to disk; that seems reasonable.
However, as soon as I start waiting for the logger to terminate, I get
something like this:

$ ps um -L 5974
USER    PID  LWP %CPU NLWP %MEM    VSZ    RSS TTY   STAT START TIME COMMAND
iurly  5974    -  100    2  3.9 406724 313588 pts/2 -    15:30 0:31 python3 logtest.py 0
iurly     - 5974 45.2    -    -      -      - -     Sl+  15:30 0:14 -
iurly     - 5975 54.8    -    -      -      - -     Rl+  15:30 0:17 -

Why is the main thread taking up so much CPU?
I believe at this point listener.stop() should only be waiting for the
helper thread to terminate, which I reckon would be implemented by waiting
on a semaphore or something (i.e. iowait, i.e. 0% CPU).

Thank you,
Gerlando

import gc
import logging
import logging.handlers
import multiprocessing as mp
import os
import queue
import sys
import time

RANGE = 50

if len(sys.argv) > 1:
    testlist = [int(q) for q in sys.argv[1:]]
else:
    print("no test list given, defaulting to 0 1 0 1")
    testlist = [0, 1, 0, 1]

print("PID=", os.getpid())
for i, qtype in enumerate(testlist):
    # Pick the queue implementation under test.
    if qtype == 0:
        que = queue.Queue(-1)
        qstring = "queue.Queue"
    else:
        que = mp.Queue(-1)
        qstring = "mp.Queue"
    handlers = [logging.handlers.RotatingFileHandler(
        "test%d.log" % i, maxBytes=10, backupCount=5)]
    #handlers.append(logging.lastResort)

    listener = logging.handlers.QueueListener(que, *handlers)
    formatter = logging.Formatter('%(asctime)s:%(threadName)s: %(message)s')
    for handler in handlers:
        handler.setFormatter(formatter)
    listener.start()

    queue_handler = logging.handlers.QueueHandler(que)
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
    logger.addHandler(queue_handler)
    start = time.time()

    print("[%7.03f] Logging like crazy %d times with que = %s" %
          (time.time() - start, RANGE, qstring))
    for n in range(RANGE):  # 'n', so the outer loop's 'i' isn't clobbered
        logger.info("AA")
        if n % 2000 == 0:
            print(n, "/", RANGE, end='\r')

    print("[%7.03f] Done logging, waiting for completion" % (time.time() - start))
    listener.stop()
    # Detach the handler, otherwise the next iteration would also feed
    # records into this (now stopped) listener's queue.
    logger.removeHandler(queue_handler)
    print("[%7.03f] Test finished" % (time.time() - start))
    gc.collect()
    print("---")


Re: logging from time critical tasks -- QueueListener().stop() takes the whole CPU

2018-07-15 Thread Gerlando Falauto
On Monday, July 16, 2018 at 6:56:19 AM UTC+2, dieter wrote:
> > ...
> > Why is the main thread taking up so much CPU?
> > I believe at this point listener.stop() should only be waiting for the 
> > helper thread to terminate, which I reckon would be implemented by waiting 
> > on a semaphore or something (i.e. iowait i.e. 0% CPU).
> 
> Maybe, you look at the source code of "listener.stop"?

I did; I forgot to mention that. The culprit is self._thread.join(),
which is where it waits for the internal thread to terminate. I would've
expected that to wait on a lock or semaphore (pthread_join()?).
So there's something I'm totally missing here, which has more to do
with queues and threads in general than with logging.
Any ideas?
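
For reference, if I read the 3.x source right, stop() boils down to
roughly this (paraphrased from memory, so details may differ between
versions):

# logging.handlers.QueueListener, paraphrased:
def stop(self):
    self.enqueue_sentinel()  # put a None sentinel on the queue
    self._thread.join()      # wait for _monitor() to drain the queue and exit
    self._thread = None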


Re: logging from time critical tasks -- QueueListener().stop() takes the whole CPU

2018-07-15 Thread Gerlando Falauto
On Monday, July 16, 2018 at 8:13:46 AM UTC+2, Thomas Jollans wrote:
> On 16/07/18 07:39, Gerlando Falauto wrote:
> > On Monday, July 16, 2018 at 6:56:19 AM UTC+2, dieter wrote:
> >>> ...
> >>> Why is the main thread taking up so much CPU?
> >>> I believe at this point listener.stop() should only be waiting for the 
> >>> helper thread to terminate, which I reckon would be implemented by 
> >>> waiting on a semaphore or something (i.e. iowait i.e. 0% CPU).
> >>
> >> Maybe, you look at the source code of "listener.stop"?
> > 
> > I did; I forgot to mention that. The culprit is self._thread.join(),
> > which is where it waits for the internal thread to terminate. I
> > would've expected that to wait on a lock or semaphore (pthread_join()?).
> > So there's something I'm totally missing here, which has more to do
> > with queues and threads in general than with logging.
> > Any ideas?
> > 
> 
> I have no idea what's really going on there, but a quick look through
> the source reveals that the actual waiting in Thread.join() is done by
> sem_wait().
> 
> via
> https://github.com/python/cpython/blob/13ff24582c99dfb439b1af7295b401415e7eb05b/Python/thread_pthread.h#L304
> via
> https://github.com/python/cpython/blob/master/Modules/_threadmodule.c#L45

Hmm... do you think it's possible that it's really getting interrupted the
whole time, effectively turning the lock into a sort of spinlock?
Any idea how to confirm that without recompiling CPython?
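
One way I'd try to confirm that without rebuilding anything, assuming a
Linux box with strace and perf available (sem_wait() is implemented on top
of futexes there):

$ strace -f -p 5974 -e trace=futex   # a storm of futex calls would hint at contention
$ perf top -p 5974                   # shows where the CPU time actually goes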