[issue33725] Python crashes on macOS after fork with no exec

2020-03-29 Thread Mouse


Mouse  added the comment:

The fix applied for this problem actually broke multiprocessing on macOS. 
Changing the default start method from 'fork' to 'spawn' causes the program to 
crash in spawn.py with `FileNotFoundError: [Errno 2] No such file or directory`.

I've tested this on MacOS Catalina 10.15.3 and 10.15.4, with Python-3.8.2 and 
Python-3.7.7.

With Python-3.7.7 everything works as expected.

Here's the output:
{{{
$ python3.8 multi1.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
}}}

Here's the program:
{{{
#!/usr/bin/env python3
#
# Test "multiprocessing" package included with Python-3.6+
#
# Usage:
#    ./multi1.py [nElements [nProcesses [tSleep]]]
#
#    nElements  - total number of integers to put in the queue
#                 default: 100
#    nProcesses - total number of parallel processes/threads
#                 default: number of CPUs reported by mp.cpu_count()
#    tSleep     - number of milliseconds for a thread to sleep
#                 after it retrieved an element from the queue
#                 default: 17
#
# Algorithm:
#   1. Creates a queue and adds nElements integers to it,
#   2. Creates nProcesses threads
#   3. Each thread extracts an element from the queue and sleeps for tSleep
#      milliseconds
#

import sys, queue, time
import multiprocessing as mp


def getElements(q, tSleep, idx):
    l = []  # list of pulled numbers
    while True:
        try:
            l.append(q.get(True, .001))
            time.sleep(tSleep)
        except queue.Empty:
            if q.empty():
                print(f'worker {idx} done, got {len(l)} numbers')
                return


if __name__ == '__main__':
    nElements = int(sys.argv[1]) if len(sys.argv) > 1 else 100
    nProcesses = int(sys.argv[2]) if len(sys.argv) > 2 else mp.cpu_count()
    tSleep = float(sys.argv[3]) if len(sys.argv) > 3 else 17

    # Uncomment the following line to make it work with Python-3.8+
    #mp.set_start_method('fork')

    # Fill the queue with numbers from 0 to nElements - 1
    q = mp.Queue()
    for k in range(nElements):
        q.put(k)

    # Start worker processes
    for m in range(nProcesses):
        p = mp.Process(target=getElements, args=(q, tSleep / 1000, m))
        p.start()
}}}

--
nosy: +mouse07410
type:  -> crash

___
Python tracker 
<https://bugs.python.org/issue33725>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33725] Python crashes on macOS after fork with no exec

2020-03-29 Thread Mouse


Mouse  added the comment:

Tried 'spawn', 'fork', 'forkserver'. 

- 'spawn' causes consistent `FileNotFoundError: [Errno 2] No such file or 
directory`;
- 'fork' consistently works (tested on machines with 4 and 20 cores);
- 'forkserver' causes roughly half of the processes to crash with 
`FileNotFoundError`, while the other half succeed (weird!).
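
For anyone who wants to repeat this comparison, here is a minimal sketch (not 
the code from this report) that runs the same trivial job under a chosen start 
method. The names `pick_start_method`, `worker` and `run` are hypothetical, and 
the 10-second timeout is an arbitrary choice so that a crashed child does not 
hang the parent. It uses `mp.get_context()` so that the start method is scoped 
to one context object rather than changed process-wide.

```python
import multiprocessing as mp


def pick_start_method(requested):
    """Return `requested` if this platform supports it, else raise ValueError."""
    available = mp.get_all_start_methods()
    if requested not in available:
        raise ValueError(f"{requested!r} not available; choose from {available}")
    return requested


def worker(q, idx):
    # Hypothetical stand-in for getElements(): just report which worker ran.
    q.put(idx)


def run(method, n_procs=4):
    # The context object provides Queue and Process bound to one start method.
    ctx = mp.get_context(pick_start_method(method))
    q = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(q, i)) for i in range(n_procs)]
    for p in procs:
        p.start()
    # Collect one result per worker; raises queue.Empty if a child never reports.
    results = sorted(q.get(timeout=10) for _ in procs)
    for p in procs:
        p.join()
    return results, [p.exitcode for p in procs]
```

Calling `run('spawn')`, `run('fork')` or `run('forkserver')` from under an 
`if __name__ == '__main__':` guard returns the collected worker indices and the 
exit codes of the children, which makes a failing start method easy to spot.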

--




[issue33725] Python crashes on macOS after fork with no exec

2020-03-29 Thread Mouse


Mouse  added the comment:

@mark.dickinson, the issue you referred to did not show a working sample. Could 
you demonstrate on my example how it should be applied? Thanks!

--




[issue33725] Python crashes on macOS after fork with no exec

2020-03-29 Thread Mouse


Mouse  added the comment:

Also, adding `p.join()` immediately after `p.start()` in my sample code showed 
this timing:
```
$ time python3.8 multi1.py 
worker 0 done, got 100 numbers
worker 1 done, got 0 numbers
worker 2 done, got 0 numbers
worker 3 done, got 0 numbers

real    0m2.342s
user    0m0.227s
sys     0m0.111s
$ 
```

Setting the start method to `fork` instead showed this timing:
```
$ time python3.8 multi1.py 
worker 2 done, got 25 numbers
worker 0 done, got 25 numbers
worker 1 done, got 25 numbers
worker 3 done, got 25 numbers

real    0m0.537s
user    0m0.064s
sys     0m0.040s
$ 
```

The proposed fix is roughly four times slower than reverting the start method 
to `fork`.

--




[issue40106] multiprocessor spawn

2020-03-29 Thread Mouse


New submission from Mouse :

macOS Catalina 10.15.3 and 10.15.4, Python-3.8.2 (also tested with 3.7.7, which 
confirmed that the problem is in the fix described in 
https://bugs.python.org/issue33725).

Trying to use "multiprocessing" with Python-3.8 and with the new default of 
`set_start_method('spawn')` is nothing but a disaster.

Not doing join() leads to consistent crashes, as described in 
https://bugs.python.org/issue33725#msg365249.

Adding p.join() immediately after p.start() seems to work, but increases the 
total run-time by a factor of between two and four, user time by a factor of 
five, and system time by a factor of ten. 

Occasionally, even with p.join(), I'm getting some processes crashing as shown 
in https://bugs.python.org/issue33725#msg365249. 

I found two workarounds:
1. Switch back to 'fork' by explicitly adding `set_start_method('fork')` to the 
__main__.
2. Drop the messy "multiprocessing" package and use "multiprocess" instead, 
which turns out to be a good and reliable fork of "multiprocessing".

If anybody cares to dig deeper into this problem, I'd be happy to provide 
whatever information that could be helpful.

Here's the sample code (again):
```
#!/usr/bin/env python3
#
# Test "multiprocessing" package included with Python-3.6+
#
# Usage:
#    ./multi1.py [nElements [nProcesses [tSleep]]]
#
#    nElements  - total number of integers to put in the queue
#                 default: 100
#    nProcesses - total number of parallel processes/threads
#                 default: number of CPUs reported by mp.cpu_count()
#    tSleep     - number of milliseconds for a thread to sleep
#                 after it retrieved an element from the queue
#                 default: 17
#
# Algorithm:
#   1. Creates a queue and adds nElements integers to it,
#   2. Creates nProcesses threads
#   3. Each thread extracts an element from the queue and sleeps for tSleep
#      milliseconds
#

import sys, queue, time
import multiprocessing as mp


def getElements(q, tSleep, idx):
    l = []  # list of pulled numbers
    while True:
        try:
            l.append(q.get(True, .001))
            time.sleep(tSleep)
        except queue.Empty:
            if q.empty():
                print(f'worker {idx} done, got {len(l)} numbers')
                return


if __name__ == '__main__':
    nElements = int(sys.argv[1]) if len(sys.argv) > 1 else 100
    nProcesses = int(sys.argv[2]) if len(sys.argv) > 2 else mp.cpu_count()
    tSleep = float(sys.argv[3]) if len(sys.argv) > 3 else 17

    # To make this sample code work reliably and fast, uncomment the following line
    #mp.set_start_method('fork')

    # Fill the queue with numbers from 0 to nElements - 1
    q = mp.Queue()
    for k in range(nElements):
        q.put(k)

    # Keep track of worker processes
    workers = []

    # Start worker processes
    for m in range(nProcesses):
        p = mp.Process(target=getElements, args=(q, tSleep / 1000, m))
        workers.append(p)
        p.start()

    # Now do the joining
    for p in workers:
        p.join()
```

Here's the timing:
```
$ time python3 multi1.py
worker 9 done, got 5 numbers
worker 16 done, got 5 numbers
worker 6 done, got 5 numbers
worker 8 done, got 5 numbers
worker 17 done, got 5 numbers
worker 3 done, got 5 numbers
worker 14 done, got 5 numbers
worker 0 done, got 5 numbers
worker 15 done, got 4 numbers
worker 7 done, got 5 numbers
worker 5 done, got 5 numbers
worker 12 done, got 5 numbers
worker 4 done, got 5 numbers
worker 19 done, got 5 numbers
worker 18 done, got 5 numbers
worker 1 done, got 5 numbers
worker 10 done, got 5 numbers
worker 2 done, got 5 numbers
worker 11 done, got 6 numbers
worker 13 done, got 5 numbers

real    0m0.325s
user    0m1.375s
sys     0m0.692s
```

If I comment out the join() and uncomment set_start_method('fork'), the timing 
is
```
$ time python3 multi1.py
worker 0 done, got 5 numbers
worker 3 done, got 5 numbers
worker 2 done, got 5 numbers
worker 1 done, got 5 numbers
worker 5 done, got 5 numbers
worker 10 done, got 5 numbers
worker 6 done, got 5 numbers
worker 4 done, got 5 numbers
worker 7 done, got 5 numbers
worker 9 done, got 5 numbers
worker 8 done, got 5 numbers
worker 14 done, got 5 numbers
worker 11 done, got 5 numbers
worker 12 done, got 5 numbers
worker 13 done, got 5 numbers
worker 16 done, got 5 numbers
worker 15 done, got 5 numbers
worker 17 done, got 5 numbers
worker 18 done, got 5 numbers
worker 19 done, got 5 numbers

real    0m0.175s
user    0m0.073s
sys     0m0.070s
```

You can observe the difference.

Here's the timing if I don't bother with either join() or set_start_method(), 
but import "multiprocess" instead:
```
$ time python3 multi2.py 
worker 0 done, got 5 numbers
worker 1 done, got 5 numbers
worker 2 done, got 5 numbers
worker 4 done, got 5 numbers
worker 3 do
```

[issue28965] Multiprocessing spawn/forkserver fails to pass Queues

2020-03-29 Thread Mouse


Mouse  added the comment:

On MacOS Catalina 10.15.4, I still see this problem occasionally even with 
p.join() added. See https://bugs.python.org/msg365251 and subsequent messages.

Also, see https://bugs.python.org/issue40106.

--
nosy: +mouse07410

___
Python tracker 
<https://bugs.python.org/issue28965>
___



[issue33725] Python crashes on macOS after fork with no exec

2020-03-29 Thread Mouse


Mouse  added the comment:

@mark.dickinson, thank you. Following your suggestion, I've added a comment in 
#28965, and created a new issue https://bugs.python.org/issue40106.

--
nosy: +vstinner




[issue25495] binascii documentation incorrect

2015-10-27 Thread Mouse

New submission from Mouse:

binascii b2a_base64() documentation says:

The length of data should be at most 57 to adhere to the base64 standard.

This is incorrect, because there is no base64 standard that restricts the 
length of input data, especially to such a small value.

What RFC4648 (which superseded RFC3548, which your documentation still keeps 
referring to) actually says is that MIME enforces a limit on the OUTPUT LINE 
length of 76 characters, but NOT on the entire output, and certainly not on the 
entire input.
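
To illustrate the point, a minimal sketch (the payload and variable names are 
made up for illustration) showing that b2a_base64() accepts input far longer 
than 57 bytes and returns it as one single line:

```python
import binascii

data = bytes(range(256)) * 4          # 1024 bytes, far more than 57
encoded = binascii.b2a_base64(data)   # one line, terminated by b"\n"

# Every 3 input bytes become 4 output characters (padded), plus the newline.
expected_len = 4 * ((len(data) + 2) // 3) + 1

print(len(encoded), expected_len)
print(encoded.count(b"\n"))                   # number of newlines in the output
print(binascii.a2b_base64(encoded) == data)   # round trip
```

No exception is raised and the data survives the round trip, so the 57-byte 
figure is not a limit of the function itself.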

Please correct the documentation, making it conformant with what the ACTUAL 
base64 standard says.

See https://en.wikipedia.org/wiki/Base64 and
https://tools.ietf.org/html/rfc4648

Thanks!

--
assignee: docs@python
components: Documentation
messages: 253572
nosy: docs@python, mouse07410
priority: normal
severity: normal
status: open
title: binascii documentation incorrect
versions: Python 3.5

___
Python tracker 
<http://bugs.python.org/issue25495>
___



[issue25495] binascii documentation incorrect

2015-10-28 Thread Mouse

Mouse added the comment:

Yes I know where this came from. :-)

Here is my proposed change.

Replace the statement 

The length of data should be at most 57 to adhere to the base64 standard.

with:

To be MIME-compliant, the Base64 output (as defined in RFC4648) should be 
broken into lines at most 76 characters long. This post-processing of the 
output is the responsibility of the caller. Note that the original PEM 
content-transfer encoding limited line length to 64 characters.
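
The caller-side post-processing described above could look roughly like this. 
It is only a sketch: `wrap_base64` is a hypothetical helper, not part of 
binascii, and slicing is just one of several ways to split the output.

```python
import binascii


def wrap_base64(data, width=76):
    """Hypothetical helper: encode `data`, then split into lines <= `width` chars."""
    encoded = binascii.b2a_base64(data).rstrip(b"\n")
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]


payload = b"x" * 200
lines = wrap_base64(payload)
print([len(line) for line in lines])
print(binascii.a2b_base64(b"".join(lines)) == payload)
```

For MIME use specifically, the standard library's `base64.encodebytes()` 
already produces output wrapped at 76 characters per line, so a caller may not 
need a helper like this at all.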


Would this change be agreeable to you?

--




[issue25495] binascii documentation incorrect

2015-10-28 Thread Mouse

Mouse added the comment:

Thank you for the quick turn-around, and for taking care of this issue!

--




[issue25495] binascii documentation incorrect

2015-10-30 Thread Mouse

Mouse added the comment:

As far as I remember, the data was not "originally processed in 57-byte 
chunks". I've been around the first PEM and MIME standards and discussions (and 
code, though not in Python, which wasn't around then) long enough to be in a 
position to know. :)

Whether the user prefers to process data in chunks or not, is up to the user. 
Not to mention that PEM is long gone, and MIME also changed somewhat. 

The link between this function and RFC4648 can and should be more explicit, but 
I think just referring to it is enough. 

Do you have a recommendation for additional info to explain newline issues?

Yes, changing "Base64 output" to "function output" makes perfect sense.

Thanks!

--




[issue25495] binascii documentation incorrect

2015-11-02 Thread Mouse

Mouse added the comment:

1. I concede knowing nothing about the early Python library implementation, 
functionality, or even purpose.

2. I don't think it makes sense now to refer to PEM either. We'd be two decades 
too late for that (well, 27 years, to be precise :). See
 https://en.wikipedia.org/wiki/Privacy-enhanced_Electronic_Mail

3. I don't think we are in a position to tell programmers how to split a string 
of characters into 76-character chunks. Not to mention that the example you 
gave is likely to suffer in performance (just count those function calls) 
compared to other methods, and won't reflect well on the authors.

Here's one possible doc version:

'''
Convert binary data to the base 64 encoding defined in :rfc:`4648`. The return 
value includes a trailing newline ``b"\n"`` if *newline* is true.

If the output is used as Base64 transfer encoding for MIME (:rfc:`2045`), the 
base 64 output should be broken into lines at most 76 characters long to be 
compliant. The Base64 encoding standard itself does not limit the maximum 
encoded line length.
'''
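
For reference, a small sketch of the trailing-newline behaviour that the first 
paragraph of this text describes. It assumes a Python version where 
b2a_base64() accepts the keyword-only *newline* argument (3.6 and later); the 
sample input is arbitrary.

```python
import binascii

data = b"hello"

with_nl = binascii.b2a_base64(data)                     # newline defaults to True
without_nl = binascii.b2a_base64(data, newline=False)   # no trailing b"\n"

print(with_nl)
print(without_nl)
```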

--




[issue25495] binascii documentation incorrect

2015-11-02 Thread Mouse

Mouse added the comment:

Let's not insinuate anything about the input. This is about what constraints on 
the OUTPUT MAY be there, not a tutorial from the '80s on how one might 
accomplish it.

--




[issue25495] binascii documentation incorrect

2015-11-02 Thread Mouse

Mouse added the comment:

And even those constraints depend on the use. E.g. X.509 does not have those.

--




[issue25495] binascii documentation incorrect

2015-11-02 Thread Mouse

Mouse added the comment:

1. I am OK with the following text, modeled on the Perldoc you referred to:

b2a_base64( $bytes, $eol );

Encode data by calling the encode_base64() function. The first argument is the 
byte string to encode. 

The second argument is optional, and provides the line-ending sequence to use. 
When it is given, the returned encoded string is broken into lines of no more 
than 76 characters each and it will end with $eol unless it is empty. Pass an 
empty string, or no second argument at all if you do not want the encoded 
string to be broken into lines.

2. I have already had people telling me that "Python-3 doc prohibits input 
longer than 57 bytes, even though it doesn't currently enforce it". Please help 
put an end to the spread of this confusion.

--




[issue25495] binascii documentation incorrect

2015-11-03 Thread Mouse

Mouse added the comment:

The harm in mentioning the 57-byte chunking is that it has already confused 
people. 

The b2a_base64() function is not coupled to MIME. It has no constraints on 
either its input or its output. *IF* it is used by (in) a MIME application, 
then the caller may want to make its output RFC 2045-compliant, in whatever way 
he chooses. Giving (unwelcome) advice to a writer of one specific application 
is in my opinion completely out of scope here. The justification that it used 
to matter 25 years ago and therefore should be kept here doesn't make sense to 
me.

I strongly insist that this "chunking" thing does not belong, and must be 
removed.

--




[issue25495] binascii documentation incorrect

2015-11-05 Thread Mouse

Mouse added the comment:

Unfortunately, NO. The problem (and this bug report) is for Python-3 
documentation, so trying to address it in Python-2 rather than in Python-3 does 
not make sense.

We seem to both understand and agree that there is no length limitation on 
b2a_base64() input, either recommended or enforced - contrary to what the 
current Python-3 documentation implies.

We understand that *if* the *output* of this function is intended for use in 
MIME (rather than X.509 or whatever else Base64 is good for), then the caller 
should do other things besides calling b2a_base64(), and in all likelihood the 
caller is already aware of that. After all, if he figured out that he needs 
Base64 in his stuff, he probably knows something about what the MIME standards 
say and require. 

I repeat my original complaint: Python-3 documentation is buggy because it 
implies a restriction on the input that is not there. This reference should be 
removed from there because it confuses people. 
I've talked to those confused personally, so this is first-hand.

I refer you to the original msg253572 of this bug report.

If you want to write a MIME-in-Python tutorial, it is up to you - but 
b2a_base64() does not seem to be the right place for it.  
(And I'd rather see an X.509 tutorial if you're dead set on writing something 
besides strict plain b2a_base64() doc. :-)

--




[issue25495] binascii documentation incorrect

2015-11-05 Thread Mouse

Mouse added the comment:

To add: I do not understand your attachment to that 57 "...(exactly 57 bytes of 
input data per line)", and request that this parenthesized sentence be removed 
from your Python-2.7 doc patch. 

Please give the reader the benefit of the doubt, and allow that *if* he wants 
to repeatedly call b2a_base64() instead of splitting its output - the ability 
to compute (76 * 3 / 4) is within his skill level.
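
For completeness, the arithmetic spelled out as a sketch (the constant and 
variable names are illustrative, and `newline=False` assumes Python 3.6+): 
Base64 turns every 3 input bytes into 4 output characters, so a 76-character 
line corresponds to 76 * 3 / 4 = 57 input bytes.

```python
import binascii

LINE_CHARS = 76                       # MIME line-length limit from RFC 2045
bytes_per_line = LINE_CHARS * 3 // 4  # 3 input bytes -> 4 output characters

chunk = bytes(bytes_per_line)         # that many zero bytes, as sample input
line = binascii.b2a_base64(chunk, newline=False)

print(bytes_per_line, len(line))
```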

--




[issue25495] binascii documentation incorrect

2015-11-06 Thread Mouse

Mouse added the comment:

> my patch should be valid for 3.5 also.
> The relevant wording is identical to 2.7.

OK.

> I have resisted removing the magic number 57 for a couple
> of reasons. Reading existing code that uses this number may
> be harder.

You expect to see "existing code that uses this number" in Python-3.5+? 
Interesting... (Care to point me at a couple of samples of such "existing" 
Python-3 code?) And you expect that the main info source for understanding the 
reason behind that "57" (assuming this function is invoked that way, as opposed 
to splitting the output :) would be the doc for this function, rather than the 
main program, or RFC 2045, or...? Fine.

> It helps explain how the function was originally to be used,
> and why the newline is appended.

Pardon me, but why do you think anybody would care...? There are tons of 
functions, old and new, with more new ones popping up fast enough. I'd really 
envy a person who has time to enjoy the history of one minuscule function of an 
old (albeit still useful :) library.

OK. You think a history of this function should be documented - fine. I don't 
need it (and don't think anybody else wants to read it either), but it's not my 
doc or my decision.

Just get the darn bug fixed.

--




[issue25495] binascii documentation incorrect

2015-11-17 Thread Mouse

Mouse added the comment:

Status...?

--
