Re: array.array()'s memory shared with multiprocessing.Process()
> I suspect it's down to timing.
>
> What you're putting into the queue is a reference to the array, and it's
> only some time later that the array itself is pickled and then sent (the
> work being done in the 'background').
>
> Modifying the array before (or while) it's actually being sent would
> explain the problem you're seeing.

That would also have been my guess. However, according to the documentation:

> When an object is put on a queue, the object is pickled and a background
> thread later flushes the pickled data to an underlying pipe.

In my understanding this means the object is pickled *before* the background
thread takes care of flushing the data to the pipe. Is that a mistake in the
documentation then?

Any suggestion for a way to work around this limitation? Or perhaps a
different approach altogether I could use to reduce CPU load?

What the main thread actually does is dequeue data from a high-speed
USB-to-serial link (2,000,000 bps), which is why I came up with the
array.array() solution for storing the collected data, hoping for the
smallest possible overhead.

Thanks!
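P.S. One workaround that occurred to me while writing this: take an immutable
snapshot of the array before putting it on the queue, so later modifications
can't race with the background pickling. A minimal sketch (the buffer size,
loop counts, and function names below are made up for illustration):

import array
import multiprocessing as mp

def producer(q):
    buf = array.array('B', range(256))   # hypothetical acquisition buffer
    for _ in range(10):
        # bytes(buf) copies the data *now*; overwriting buf afterwards
        # cannot affect what the queue's background thread pickles.
        q.put(bytes(buf))
        # ... refill buf with fresh data here ...
    q.put(None)                          # sentinel: no more data

def consumer(q):
    while True:
        block = q.get()
        if block is None:
            break
        data = array.array('B')
        data.frombytes(block)            # rebuild an array on this side
        # ... process data ...

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=consumer, args=(q,))
    p.start()
    producer(q)
    p.join()

The copy costs one pass over the block, but it makes the put() safe no
matter when the pickling actually happens.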
Re: array.array()'s memory shared with multiprocessing.Process()
On Monday, 11 September 2017 at 12:19:27 UTC+2, Thomas Jollans wrote:
> On 2017-09-10 23:05, iurly wrote:
> > As far as I'm concerned, I'm probably better off using double buffers
> > to avoid this kind of issues.
> > Thanks a lot for your help!
>
> That should work. Some other things to consider, if both processes are
> on the same machine, are a series of memory-mapped files to pass the
> data without pickling, or, if your code is only going to run on Unix,
> something involving shared memory through multiprocessing.Array.

Oh, I see, multiprocessing.Array() sounds like a pretty good idea, thanks!
It's simple enough and actually already compatible with what I'm doing.
That would involve double buffers, which I would have to use as the first
optimization step anyway.

Notice, however, that I'd have to create those Arrays dynamically in the
producer thread. Would I then be able to pass them to the consumer by
putting a reference in a queue? I wouldn't want them to be pickled at all
in that case, of course. (See the sketch below for the variant I have in
mind.)

Also, why do you say it only works on Unix? I couldn't find any reference
to such a limitation in the documentation.

Thank you so much again!
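P.S. Here's the double-buffer idea as a minimal sketch, assuming the two
shared buffers can be created up front and handed to the Process constructor
(shared ctypes objects travel by inheritance, not by pickling through a
queue). All sizes, counts, and names below are made up:

import multiprocessing as mp

BUFSIZE = 4096    # hypothetical block size
N_BLOCKS = 10     # hypothetical number of blocks to transfer

def consumer(buffers, ready, free):
    idx = 0
    for _ in range(N_BLOCKS):
        ready.acquire()                  # wait until buffers[idx] is full
        with buffers[idx].get_lock():
            block = buffers[idx][:]      # copy out of shared memory
        free.release()                   # hand the buffer back
        idx ^= 1                         # alternate between the two buffers
        print(sum(block))                # stand-in for real processing

if __name__ == '__main__':
    buffers = [mp.Array('B', BUFSIZE) for _ in range(2)]
    ready = mp.Semaphore(0)              # counts buffers ready to consume
    free = mp.Semaphore(2)               # counts buffers free to refill
    p = mp.Process(target=consumer, args=(buffers, ready, free))
    p.start()
    idx = 0
    for n in range(N_BLOCKS):
        free.acquire()                   # wait for a free buffer
        with buffers[idx].get_lock():
            for j in range(BUFSIZE):     # stand-in for real acquisition
                buffers[idx][j] = (n + j) & 0xFF
        ready.release()                  # signal: this buffer is full
        idx ^= 1
    p.join()

The two semaphores give back-pressure, so the producer can never overwrite
a buffer the consumer hasn't drained yet.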
logging from time critical tasks -- QueueListener().stop() takes the whole CPU
Hi,

I'm adding logging to a time-critical task running on resource-constrained
hardware (a Raspberry Pi). I read about QueueListener/QueueHandler in:

https://docs.python.org/3/howto/logging-cookbook.html#dealing-with-handlers-that-block

and I'm trying to understand how it really works, what's happening under
the hood, and what the impact would be. So I wrote a test program (included
at the bottom) that logs 50 times and then waits for the listener to
terminate. I ran it on a PC:

$ python3 logtest.py 0
PID= 5974
[  0.000] Logging like crazy 50 times with que = queue.Queue
[ 18.038] Done logging, waiting for completion
[ 37.824] Test finished
---

Here's the output while it's logging (within the first 18 seconds):

$ ps um -L 5974
USER    PID  LWP %CPU NLWP %MEM    VSZ    RSS TTY   STAT START  TIME COMMAND
iurly  5974    - 96.3    2  2.1 264232 170892 pts/2 -    15:30  0:07 python3 logtest.py 0
iurly     - 5974 72.8    -    -      -      - Rl+   15:30  0:05 -
iurly     - 5975 23.3    -    -      -      - Sl+   15:30  0:01 -

So the main thread is taking most of the CPU while the background thread is
writing to disk, and that's reasonable.

However, as soon as I start waiting for the logger to terminate, I get
something like this:

$ ps um -L 5974
USER    PID  LWP %CPU NLWP %MEM    VSZ    RSS TTY   STAT START  TIME COMMAND
iurly  5974    -  100    2  3.9 406724 313588 pts/2 -    15:30  0:31 python3 logtest.py 0
iurly     - 5974 45.2    -    -      -      - Sl+   15:30  0:14 -
iurly     - 5975 54.8    -    -      -      - Rl+   15:30  0:17 -

Why is the main thread taking up so much CPU? I believe at this point
listener.stop() should only be waiting for the helper thread to terminate,
which I reckon would be implemented by waiting on a semaphore or something
(i.e. iowait, i.e. 0% CPU).

Thank you,
Gerlando

RANGE = 50

import logging
import logging.handlers
import multiprocessing as mp
import queue
import time
import gc
import sys
import os

if len(sys.argv) > 1:
    testlist = [int(q) for q in sys.argv[1:]]
else:
    print("no test list given, defaulting to 0 1 0 1")
    testlist = [0, 1, 0, 1]

print("PID=", os.getpid())

for i, qtype in enumerate(testlist):
    handlers = []
    if qtype == 0:
        que = queue.Queue(-1)
        qstring = "queue.Queue"
    else:
        que = mp.Queue(-1)
        qstring = "mp.Queue"

    handlers.append(logging.handlers.RotatingFileHandler(
        "test%d.log" % (i), maxBytes=10, backupCount=5))
    #handlers.append(logging.lastResort)

    listener = logging.handlers.QueueListener(que, *handlers)
    formatter = logging.Formatter('%(asctime)s:%(threadName)s: %(message)s')
    for handler in handlers:
        handler.setFormatter(formatter)
    listener.start()

    queue_handler = logging.handlers.QueueHandler(que)
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
    logger.addHandler(queue_handler)

    start = time.time()
    print("[%7.03f] Logging like crazy %d times with que = %s"
          % (time.time() - start, RANGE, qstring))
    for i in range(0, RANGE):
        logger.info("AA")
        if i % 2000 == 0:
            print(i, "/", RANGE, end='\r')
    print("[%7.03f] Done logging, waiting for completion" % (time.time() - start))
    listener.stop()
    print("[%7.03f] Test finished" % (time.time() - start))
    gc.collect()
    print("---")
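For reference, this is roughly what the listener does internally -- a
paraphrase of logging.handlers.QueueListener from CPython's
logging/handlers.py, heavily simplified rather than quoted verbatim.
stop() enqueues a sentinel behind any pending records and then joins the
monitor thread, so by itself it should indeed just block:

import threading

class QueueListenerSketch:
    """Simplified paraphrase of logging.handlers.QueueListener."""
    _sentinel = None

    def __init__(self, queue, *handlers):
        self.queue = queue
        self.handlers = handlers
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._monitor, daemon=True)
        self._thread.start()

    def _monitor(self):
        # Background thread: block on get() until the sentinel shows up.
        while True:
            record = self.queue.get(True)
            if record is self._sentinel:
                break
            for handler in self.handlers:
                handler.handle(record)

    def stop(self):
        # The sentinel lands *behind* all pending records, so join() waits
        # for the monitor thread to drain the queue before it can exit.
        self.queue.put_nowait(self._sentinel)
        self._thread.join()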
Re: logging from time critical tasks -- QueueListener().stop() takes the whole CPU
On Monday, July 16, 2018 at 6:56:19 AM UTC+2, dieter wrote:
> > ...
> > Why is the main thread taking up so much CPU?
> > I believe at this point listener.stop() should only be waiting for the
> > helper thread to terminate, which I reckon would be implemented by
> > waiting on a semaphore or something (i.e. iowait, i.e. 0% CPU).
>
> Maybe, you look at the source code of "listener.stop"?

I did, forgot to mention that. The culprit is self._thread.join(), which is
where it waits for the internal thread to terminate -- something I would've
expected to wait on a lock or semaphore (pthread_join()?).

So there's something I'm totally missing here, which has more to do with
queues and threads in general than with logging. Any ideas?
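For context, here's roughly what Thread.join() does in CPython, paraphrased
from Lib/threading.py (simplified, not verbatim; details vary by version).
It never calls pthread_join(); instead it acquires _tstate_lock, a lock the
C runtime holds while the thread is alive and releases when the thread
exits:

import threading
import time

def join_sketch(thread, timeout=-1):
    # _tstate_lock is a private attribute; this is for illustration only.
    # Acquiring it blocks until the C runtime releases it at thread exit,
    # which on POSIX builds should mean sleeping in the kernel, not spinning.
    lock = thread._tstate_lock
    if lock is None:                 # thread already joined
        return
    if lock.acquire(True, timeout):
        lock.release()

t = threading.Thread(target=time.sleep, args=(0.5,))
t.start()
join_sketch(t)    # blocks ~0.5 s, just like t.join() would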
Re: logging from time critical tasks -- QueueListener().stop() takes the whole CPU
On Monday, July 16, 2018 at 8:13:46 AM UTC+2, Thomas Jollans wrote:
> On 16/07/18 07:39, Gerlando Falauto wrote:
> > On Monday, July 16, 2018 at 6:56:19 AM UTC+2, dieter wrote:
> >>> ...
> >>> Why is the main thread taking up so much CPU?
> >>> I believe at this point listener.stop() should only be waiting for the
> >>> helper thread to terminate, which I reckon would be implemented by
> >>> waiting on a semaphore or something (i.e. iowait i.e. 0% CPU).
> >>
> >> Maybe, you look at the source code of "listener.stop"?
> >
> > I did, forgot to mention that. Culprit is self._thread.join().
> > Which is where it waits for the internal thread to terminate,
> > which I would've expected to wait on a lock or semaphore (pthread_join()?)
> > So there's something I'm totally missing here, which has more to do
> > with queues and threads in general than it has with logging.
> > Any ideas?
>
> I have no idea what's really going on there, but a quick look through
> the source reveals that the actual waiting in Thread.join() is done by
> sem_wait(),
>
> via
> https://github.com/python/cpython/blob/13ff24582c99dfb439b1af7295b401415e7eb05b/Python/thread_pthread.h#L304
> via
> https://github.com/python/cpython/blob/master/Modules/_threadmodule.c#L45

Hmm... do you think it's possible it's really getting interrupted the whole
time, effectively turning the lock into a sort of spinlock? Any idea how to
confirm that without recompiling the whole code?