Re: Sorting NaNs

2018-06-07 Thread Gregory Ewing

Steven D'Aprano wrote:
> But if it were (let's say) 1 ULP greater or less
> than one half, would we even know?


In practice it's probably somewhat bigger than 1 ULP.
A typical PRNG will first generate a 32-bit integer and
then map it to a float, giving a resolution coarser than
the 52 bits of an IEEE double.

But even then, the probability of getting exactly 0.5
is only 1/2^32, which you're not likely to notice.
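
To make the arithmetic concrete, here is a minimal sketch of the scheme described above: a 32-bit integer scaled into [0, 1). The helper name unit_float_from_u32 is made up for illustration, and this is not how CPython's random.random() is implemented (the random module documentation describes it as producing 53-bit precision floats).

```python
from fractions import Fraction

def unit_float_from_u32(k):
    """Map a 32-bit unsigned integer onto a float in [0.0, 1.0)."""
    if not 0 <= k < 2**32:
        raise ValueError("k must fit in 32 bits")
    return k / 2**32

# Exactly one of the 2**32 possible integers lands on 0.5.
half = unit_float_from_u32(2**31)
p_exact_half = Fraction(1, 2**32)

# Neighbouring outputs are 2**-32 apart, far coarser than the
# spacing of doubles near 0.5.
step = unit_float_from_u32(2**31 + 1) - half
```

Under this assumed scheme, "x < 0.5" is true for exactly 2**31 of the 2**32 inputs, so the 50% figure is exact for the integers even though most doubles in [0, 1) can never be produced.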

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


EuroPython 2018: Training pass sale starts on Friday at 12:00 CEST

2018-06-07 Thread M.-A. Lemburg
As we have already announced, access to trainings is not included in
our regular conference tickets this year. We have done this to keep
the conference ticket prices reasonable, to acknowledge the value of
the trainings, and to add more flexibility.


Two full days of trainings included
-----------------------------------

Training passes allow you to access all trainings on the two training
days, Monday and Tuesday. Each training will run for 3 hours and they
will be scheduled in 3 parallel tracks.

Please note: Training access is on a first-come-first-served basis. We
don’t provide registration for specific training sessions.

A light lunch is served on the training days, which is included in the
training pass price.

* Business training pass: EUR 295.00 excl. VAT, EUR 354.00 incl. 20%
  UK VAT (for people using Python to make a living)

* Personal training pass: EUR 175.00 incl. 20% UK VAT (for people
  enjoying Python from home)

* Student training pass: EUR 125.00 incl. 20% UK VAT (only available
  for pupils, students and postdoctoral researchers; please bring your
  student card or declaration from University, stating your
  affiliation, starting and end dates of your contract)

The training pass does not grant you permission to attend the main
EuroPython conference days or the sprints. Please get a separate
conference ticket for this.


Available training sessions
---------------------------

We have already selected an initial set of trainings, which you can
view on our session list:

  https://ep2018.europython.eu/en/events/sessions/#Training-sessions

We will also have a few sponsored trainings, which are free (you don’t
need a training pass to attend these). These will be announced in a
separate blog post.


Only a limited number of training passes available
--------------------------------------------------

Since we don’t want to overbook the training sessions, we have
limited the number of training passes to 200.

Training pass sales will start on Friday, June 8, at around 12:00
CEST (that’s 10:00 UTC, 11:00 BST, 13:00 EEST, etc.).

Given the experience with the early bird tickets, which sold out in
less than 45 minutes, we recommend getting your ticket as soon as you
can.

Regular conference tickets are also selling out much faster than last
year, so the same recommendation applies to those:

  https://ep2018.europython.eu/en/registration/buy-tickets/



Help spread the word
--------------------

Please help us spread this message by sharing it on your social
networks as widely as possible. Thank you!

Link to the blog post:

https://blog.europython.eu/post/174655358657/europython-2018-training-pass-sale-starts-on

Tweet:

https://twitter.com/europython/status/1004625811335458818

Enjoy,
--
EuroPython 2018 Team
https://ep2018.europython.eu/
https://www.europython-society.org/



Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 1:55 PM, Steven D'Aprano
 wrote:
> On Tue, 05 Jun 2018 23:27:16 +1000, Chris Angelico wrote:
>
>> And an ASCIIZ string cannot contain a byte value of zero. The parallel
>> is exact.
>
> Why should we, as Python programmers, care one whit about ASCIIZ strings?
> They're not relevant. You might as well say that file names cannot
> contain the character "π" because ASCIIZ strings don't support it.
>
> No they don't, and yet nevertheless file names can and do contain
> characters outside of the ASCIIZ range.

Under Linux, a file name contains bytes, most commonly representing
UTF-8 sequences. So... an ASCIIZ string *can* contain that character,
or at least a representation of it. Yet it cannot contain "\0".

ChrisA


Re: Sorting NaNs

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 2:14 PM, Steven D'Aprano
 wrote:
> On Sat, 02 Jun 2018 21:02:14 +1000, Chris Angelico wrote:
>
>> Point of curiosity: Why "> 0.5"?
>
> No particular reason, I just happened to hit that key and then copied and
> pasted the line into the next one.

Hah! The simplicity of it.

>> Normally when I want a fractional
>> chance, I write the comparison the other way: "random.random() < 0.5"
>> has exactly a 50% chance of occurring (presuming that random.random()
>> follows its correct documented distribution). I've no idea what the
>> probability of random.random() returning exactly 0.5 is
>
> Neither do I. But I expect that "exactly 50% chance" is only
> approximately true :-)

Oh, I have no doubt about that. (Though as Gregory says, the
uniformity is based on the PRNG more than on any enumeration of
floats. There are probably floating-point values between 0.0 and 1.0
that can never actually be returned.)

> So given that our mathematically pure(ish) probability of 0.5 for the
> reals has to be mapped in some way to a finite number of floats, I
> wouldn't want to categorically say that the probability remains
> *precisely* one half. But if it were (let's say) 1 ULP greater or less
> than one half, would we even know?
>
> 0.5 - 1 ULP = 0.49999999999999994
>
> 0.5 + 1 ULP = 0.5000000000000001
>
>
> I would love to see the statistical experiment that could distinguish
> those two probabilities from exactly 1/2 to even a 90% confidence
> level :-)

LOL, no kidding. How many RNG rolls would you need before you could
even distinguish between 50-50 and 49-51% chance? How many people have
done any sort of statistical analysis even that accurate?
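
For a rough sense of scale, the following sketch estimates how many trials such an experiment would need. It assumes a plain normal approximation and ignores statistical power, so treat the results as orders of magnitude only; flips_needed is a hypothetical helper, and z = 1.645 corresponds roughly to a 90% two-sided confidence level.

```python
import math

def flips_needed(delta, z=1.645):
    """Rough number of trials before a bias of `delta` away from 0.5
    stands out by z standard errors (normal approximation)."""
    # The standard error of the observed fraction is about 0.5 / sqrt(n);
    # solve z * 0.5 / sqrt(n) <= delta for n.
    return math.ceil((z * 0.5 / delta) ** 2)

# The two floats adjacent to 0.5: the spacing is 2**-54 just below
# 0.5 and 2**-53 just above it.
below = 0.5 - 2**-54
above = 0.5 + 2**-53

n_one_percent = flips_needed(0.01)   # 49-51 versus 50-50
n_one_ulp = flips_needed(2**-53)     # a bias of one ULP above 0.5
```

Under these assumptions, telling 49-51 from 50-50 takes several thousand trials, while a one-ULP bias needs on the order of 10**31.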

ChrisA


Re: Why exception from os.path.exists()?

2018-06-07 Thread Antoon Pardon
On 07-06-18 05:55, Steven D'Aprano wrote:
> Python strings are rich objects which support the Unicode code point \0 
> in them. The limitation of the Linux kernel that it relies on NULL-
> terminated byte strings is irrelevant to the question of what 
> os.path.exists ought to do when given a path containing NUL. Other 
> invalid path names return False.

It is not irrelevant. It makes the distinction clear between possible
values and impossible values. Now you personally may find that distinction
of minor importance, but it is a relevant distinction in discussing how
to treat it.

> As a Python programmer, how does treating NUL specially make our life 
> better?

By treating possible path values differently from impossible path values.

-- 
Antoon.



Re: Why exception from os.path.exists()?

2018-06-07 Thread Marko Rauhamaa
Antoon Pardon :

> On 07-06-18 05:55, Steven D'Aprano wrote:
>> As a Python programmer, how does treating NUL specially make our life
>> better?
>
> By treating possible path values differently from impossible path
> values.

There are all kinds of impossibility. os.stat() reports those
impossibilities via an OSError exception. It's just that
os.path.exists() converts the OSError exception into a False return
value. A ValueError is raised by the Python os.stat() wrapper to
indicate that it can't even deliver the request to the kernel.

The application programmer doesn't give an iota who determined the
impossibility of a pathname. Unfortunately, os.path.exists() forces the
distinction on the application. If I have to be prepared to catch a
ValueError from os.path.exists(), what added value does os.path.exists()
give on top of os.stat()? The whole point of os.path.exists() is

  1. To provide an operating-system-independent abstraction.

  2. To provide a boolean interface instead of an exception interface.
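
A minimal sketch of the boolean-only interface argued for here: a wrapper that folds the ValueError into False. The name path_exists is invented for illustration. From Python 3.8 on, os.path.exists() itself returns False for such paths, so a wrapper like this only matters on older versions.

```python
import os

def path_exists(path):
    """Boolean-only existence check: any failure to stat means False."""
    try:
        os.stat(path)
    except (OSError, ValueError):
        # OSError: the OS rejected the path or could not find it.
        # ValueError: Python could not even pass the path to the OS
        # (for example, because of an embedded NUL).
        return False
    return True
```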



This is a security risk. Here is a brief demonstration. Copy the example
HTTP server from:

   https://docs.python.org/3/library/http.server.html?highlight=http#http.server.SimpleHTTPRequestHandler

Run the server. Try these URLs in your browser:

  1. http://localhost:8000/

 => The directory listing is provided

  2. http://localhost:8000/test.html

 => A file is served or an HTTP error response (404) is generated

  3. http://localhost:8000/te%00st.html

 => The server crashes with a ValueError and the TCP connection is
reset


Marko


Re: Why exception from os.path.exists()?

2018-06-07 Thread Marko Rauhamaa
Marko Rauhamaa :

> This is a security risk. Here is a brief demonstration. Copy the example
> HTTP server from:
>
>    https://docs.python.org/3/library/http.server.html?highlight=http#http.server.SimpleHTTPRequestHandler
>
> [...]
>
>   3. http://localhost:8000/te%00st.html
>
>  => The server crashes with a ValueError and the TCP connection is
> reset

An exercise for the reader: provide a fix for the example server so the
request returns a 404 response just like any other nonexistent resource.
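
One possible answer to the exercise, offered as a sketch rather than a vetted fix: reject any request path that contains a NUL after percent-decoding, before the handler touches the file system. NulSafeHandler and contains_nul are hypothetical names, and the behaviour has not been checked against every Python version.

```python
import http.server
import urllib.parse

def contains_nul(request_path):
    """True if the percent-decoded request path contains a NUL character."""
    return "\0" in urllib.parse.unquote(request_path)

class NulSafeHandler(http.server.SimpleHTTPRequestHandler):
    """SimpleHTTPRequestHandler that answers 404 for paths containing NUL."""

    def send_head(self):
        # send_head() serves both GET and HEAD, so one check covers both.
        if contains_nul(self.path):
            self.send_error(404, "File not found")
            return None
        return super().send_head()
```

To try it, pass NulSafeHandler as the handler class to http.server.HTTPServer in place of SimpleHTTPRequestHandler and request /te%00st.html with curl.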


Marko


Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 7:29 PM, Marko Rauhamaa  wrote:
> This is a security risk. Here is a brief demonstration. Copy the example
> HTTP server from:
>
>    https://docs.python.org/3/library/http.server.html?highlight=http#http.server.SimpleHTTPRequestHandler
>
> Run the server. Try these URLs in your browser:
>
>   1. http://localhost:8000/
>
>  => The directory listing is provided
>
>   2. http://localhost:8000/test.html
>
>  => A file is served or an HTTP error response (404) is generated
>
>   3. http://localhost:8000/te%00st.html
>
>  => The server crashes with a ValueError and the TCP connection is
> reset
>

Actually, I couldn't even get Chrome to make that request, so it
obviously was considered by the browser to be invalid. Doing the
request with curl produced a traceback on the server and an empty
response in the client. (And then the server returns to handling
requests normally.) How is this a security risk, exactly? To be fair,
it's somewhat unideal behaviour - I would prefer to see an HTTP 500
come back if the server crashes - but I can't see that that's a
security problem. Just a QOS issue, wherein you might get a 500 rather
than a 404 for certain requests.

ChrisA


Re: Why exception from os.path.exists()?

2018-06-07 Thread Antoon Pardon
On 07-06-18 11:29, Marko Rauhamaa wrote:
> Antoon Pardon :
>
>> On 07-06-18 05:55, Steven D'Aprano wrote:
>>> As a Python programmer, how does treating NUL specially make our life
>>> better?
>> By treating possible path values differently from impossible path
>> values.
> There are all kinds of impossibility. The os.stat() reports those
> impossibilities via an OSError exception. It's just that
> os.path.exists() converts the OSError exception into a False return
> value. A ValueError is raised by the Python os.stat() wrapper to
> indicate that it can't even deliver the request to the kernel.
>
> The application programmer doesn't give an iota who determined the
> impossibility of a pathname.

So? The fact that the application programmer doesn't give an iota who
determined the impossibility of a pathname, doesn't imply he is
equally unconcerned about the specific impossibility he ran into.

> Unfortunately, os.path.exists() forces the
> distinction on the application.

No it doesn't. It forces the distinction between two different kinds
of impossibilities, but you don't have to care where they originate
from.

>  If I have to be prepared to catch a
> ValueError from os.path.exists(), what added value does os.path.exists()
> give on top of os.stat()? The whole point of os.path.exists() is
>
>   1. To provide an operating-system-independent abstraction.
>
>   2. To provide a boolean interface instead of an exception interface.

Maybe trying to provide such an interface is inherently flawed. Telling
me that a path doesn't exist because of a permission problem is IMO not
a good idea.
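
As an illustration of the finer-grained alternative, the sketch below calls os.stat() directly and keeps the different failure modes apart instead of collapsing them into a boolean. The function name and the category labels are made up for this example.

```python
import os

def classify_path(path):
    """Return 'exists', 'missing', 'denied', 'invalid' or 'error'."""
    try:
        os.stat(path)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "denied"
    except ValueError:
        # Raised by Python itself, e.g. for an embedded NUL.
        return "invalid"
    except OSError:
        # Anything else the OS reports (name too long, I/O error, ...).
        return "error"
    return "exists"
```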

--
Antoon.




Re: Why exception from os.path.exists()?

2018-06-07 Thread Marko Rauhamaa
Chris Angelico :

> On Thu, Jun 7, 2018 at 7:29 PM, Marko Rauhamaa  wrote:
>>   3. http://localhost:8000/te%00st.html
>>
>>  => The server crashes with a ValueError and the TCP connection is
>> reset
>>
>
> Actually, I couldn't even get Chrome to make that request, so it
> obviously was considered by the browser to be invalid.

Wow! Why on earth?

> it's somewhat unideal behaviour - I would prefer to see an HTTP 500
> come back if the server crashes - but I can't see that that's a
> security problem. Just a QOS issue, wherein you might get a 500 rather
> than a 404 for certain requests.

It's a demonstration of how this innocent-looking problem can lead to
surprising and even serious consequences.

The given URI is well-formed and should not give any particular trouble
to any HTTP server.


Marko


round

2018-06-07 Thread ast

Hi

round is supposed to provide an integer when
called without any precision argument.

here is the doc:

>>> help(round)

round(number[, ndigits]) -> number

Round a number to a given precision in decimal digits (default 0 digits).
This returns an int when called with one argument, otherwise the
same type as the number

but in some circumstances it provides a float

import numpy as np

M = np.array([[0, 9],[2, 7]], dtype=int)
np.linalg.det(M)
-18.000000000000004
round(np.linalg.det(M))
-18.0 # i was expecting an integer -18, not a float

# same problem with np.round
np.round(np.linalg.det(M))
-18.0


Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 8:47 PM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> On Thu, Jun 7, 2018 at 7:29 PM, Marko Rauhamaa  wrote:
>>>   3. http://localhost:8000/te%00st.html
>>>
>>>  => The server crashes with a ValueError and the TCP connection is
>>> reset
>>>
>> it's somewhat unideal behaviour - I would prefer to see an HTTP 500
>> come back if the server crashes - but I can't see that that's a
>> security problem. Just a QOS issue, wherein you might get a 500 rather
>> than a 404 for certain requests.
>
> It's a demonstration of how this innocent-looking problem can lead to
> surprising and even serious consequences.
>
> The given URI is well-formed and should not give any particular trouble
> to any HTTP server.

You haven't demonstrated a security problem. Don't claim security
risks unless you can show there's at least a possibility of that;
otherwise, it's just FUD.

ChrisA


Re: round

2018-06-07 Thread Lutz Horn

M = np.array([[0, 9],[2, 7]], dtype=int)
np.linalg.det(M)
-18.000000000000004
round(np.linalg.det(M))


np.linalg.det(M) has type numpy.float64, not float. Try this:


round(float(np.linalg.det(M)))

-18

Lutz



Re: round

2018-06-07 Thread Peter Otten
ast wrote:

> Hi
> 
> round is supposed to provide an integer when
> called without any precision argument.
> 
> here is the doc:
> 
>  >>> help(round)
> 
> round(number[, ndigits]) -> number
> 
> Round a number to a given precision in decimal digits (default 0 digits).
> This returns an int when called with one argument, otherwise the
> same type as the number

That's not the complete story. Quoting 
https://docs.python.org/dev/library/functions.html#round

"""
For a general Python object number, round delegates to number.__round__.
"""

Bogus example to make the point:

>>> class A:
...     def __round__(self): return "whatever"
... 
>>> round(A())
'whatever'
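
Along the same lines, here is a toy stand-in for the behaviour reported in this thread: a float subclass whose __round__ keeps returning its own type. Float64Like is hypothetical and is not numpy's implementation; it only shows why converting to a plain float first gives an int.

```python
class Float64Like(float):
    """Toy float subclass whose __round__ keeps returning its own type."""

    def __round__(self, ndigits=None):
        # float.__round__ returns an int when called without ndigits;
        # this class deliberately rounds to a float and wraps the result.
        return Float64Like(float.__round__(self, ndigits or 0))

x = Float64Like(-18.000000000000004)
kept_type = round(x)        # stays a Float64Like
as_int = round(float(x))    # plain float first, so round() gives an int
```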

> but in some circumstances it provides a float
> 
> import numpy as np
> 
> M = np.array([[0, 9],[2, 7]], dtype=int)
> np.linalg.det(M)
> -18.000000000000004
> round(np.linalg.det(M))
> -18.0 # i was expecting an integer -18, not a float
> 
> # same problem with np.round
> np.round(np.linalg.det(M))
> -18.0

>>> M = np.array([[0, 9],[2, 7]], dtype=int)
>>> type(np.linalg.det(M))
<class 'numpy.float64'>

So numpy.linalg.det() returns a custom type float64 which maps round() to 
float64:

>>> round(np.float64(1.23))
1.0
>>> type(_)
<class 'numpy.float64'>




Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 19:47:03 +1000, Chris Angelico wrote:

> To be fair, it's somewhat unideal behaviour - I would prefer to see an
> HTTP 500 come back if the server crashes - but I can't see that that's a
> security problem.

You think that being able to remotely crash a webserver isn't a security 
issue?


If Denial Of Service isn't a security issue in your eyes, what would it 
take? "Armed men burst into your house and shoot you"?

*only half a wink*



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 13:47:07 +0300, Marko Rauhamaa wrote:

> Chris Angelico :
> 
>> On Thu, Jun 7, 2018 at 7:29 PM, Marko Rauhamaa 
>> wrote:
>>>   3. http://localhost:8000/te%00st.html
>>>
>>>  => The server crashes with a ValueError and the TCP connection is
>>> reset
>>>
>>>
>> Actually, I couldn't even get Chrome to make that request, so it
>> obviously was considered by the browser to be invalid.
> 
> Wow! Why on earth?

It works in Firefox, but Apache truncates the URL:


Not Found
The requested URL /te was not found on this server.


instead of te%00st.html

I wonder how many publicly facing web servers can be induced to either 
crash, or serve the wrong content, this way?



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: round

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 13:09:18 +0200, ast wrote:

> Hi
> 
> round is supposed to provide an integer when called without any
> precision argument.

True, but that's really under the control of the object you feed it to.

It would be possible for round() to enforce that, but it might break code 
that relies on having __round__ return a non-int.


[...]
> round(np.linalg.det(M))
> -18.0 # i was expecting an integer -18, not a float

Yes, that's surprising, but I'm not sure if that's a bug or not, and if 
it is a bug, whether it counts as a bug in round() or in numpy.

Most Python operators and functions that call dunder methods specify the 
*expected* but not *mandatory* behaviour. There are a few exceptions:

py> class BadClass:
...     def __len__(self):
...         return "Surprise!"
...
py> x = BadClass()
py> len(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer


but generally it is up to the object itself to do the right thing, or 
else have a good reason not to.

My feeling here is that this *probably* counts as a bug in numpy, but I'm 
open to persuasion otherwise.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 10:18 PM, Steven D'Aprano
 wrote:
> On Thu, 07 Jun 2018 13:47:07 +0300, Marko Rauhamaa wrote:
>
>> Chris Angelico :
>>
>>> On Thu, Jun 7, 2018 at 7:29 PM, Marko Rauhamaa 
>>> wrote:
>>>>   3. http://localhost:8000/te%00st.html
>>>>
>>>>  => The server crashes with a ValueError and the TCP connection is
>>>> reset


>>> Actually, I couldn't even get Chrome to make that request, so it
>>> obviously was considered by the browser to be invalid.
>>
>> Wow! Why on earth?
>
> It works in Firefox, but Apache truncates the URL:
>
>
> Not Found
> The requested URL /te was not found on this server.
>
>
> instead of te%00st.html
>
> I wonder how many publicly facing web servers can be induced to either
> crash, or serve the wrong content, this way?
>

Define "serve the wrong content". You could get the exact same content
by asking for "te" instead of "te%00st.html"; what you've done is not
significantly different from this:

http://localhost:8000/te?st.html

Is that a security problem too?

ChrisA


Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 10:04:53 +0200, Antoon Pardon wrote:

> On 07-06-18 05:55, Steven D'Aprano wrote:
>> Python strings are rich objects which support the Unicode code point \0
>> in them. The limitation of the Linux kernel that it relies on NULL-
>> terminated byte strings is irrelevant to the question of what
>> os.path.exists ought to do when given a path containing NUL. Other
>> invalid path names return False.
> 
> It is not irrelevant. It makes the distinction clear between possible
> values and impossible values. 

That is simply wrong. It is wrong in principle, and it is wrong in 
practice, for reasons already covered to death in this thread.

It is *wrong in practice* because other impossible values don't raise 
ValueError, they simply return False:

- illegal pathnames under Windows, those containing special 
  characters like ? > < * etc, simply return False;

- even on Linux, illegal pathnames like "" (the empty string)
  return False;

- invalid pathnames with too many path components, or too many
  characters in a single component, simply return False;

- the os.path.exists() function is not documented as making 
  a three-way split between "exists, doesn't exist and invalid";

- and it isn't even true to say that NULL is illegal in pathnames:
  there are at least five file systems that allow either NUL bytes:
  FAT-8, MFS, HFS, or Unicode \0 code points: HFS Plus and Apple
  File System.
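
Two of the cases in the list above are easy to check from any Python 3 session. The sketch below records the results rather than asserting what they must be for the NUL case, since that depends on the Python version: older versions raise ValueError, which is the behaviour under discussion.

```python
import os

# An empty path and a single component far beyond any usual length
# limit: both simply report False.
empty = os.path.exists("")
too_long = os.path.exists("x" * 100000)

# A path with an embedded NUL: either False or a ValueError,
# depending on the Python version in use.
try:
    with_nul = os.path.exists("te\0st.html")
except ValueError:
    with_nul = "ValueError"
```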

And it is *wrong in principle* because in the most general case, there is 
no way to tell which pathnames are valid or invalid without querying an 
actual file system. In the case of Linux, any directory could be used as 
a mount point.

Is "/mnt/some?file" valid or invalid? If an NTFS file system is mounted 
on /mnt, it is invalid; if an ext4 file system is mounted there, it is 
valid; if there's nothing mounted there, the question is impossible to 
answer.


>> As a Python programmer, how does treating NUL specially make our life
>> better?
> 
> By treating possible path values differently from impossible path
> values.

But it doesn't do that. "Pathnames cannot contain NUL" is a falsehood 
that programmers wrongly believe about paths. HFS Plus and Apple File 
System support NULs in paths.

So what it does is wrongly single out one *POSSIBLE* path value to raise 
an exception, while other so-called "impossible" path values simply 
return False.

But in the spirit of compromise, okay, let's ignore the existence of file 
systems like HFS which allow NUL. Apart from Mac users, who uses them 
anyway? Let's pretend that every file system in existence, now and into 
the future, will prohibit NULs in paths.

Have you ever actually used this feature? When was the last time you 
wrote code like this?

try:
    flag = os.path.exists(pathname)
except ValueError:
    handle_null_in_path()
else:
    if flag:
        handle_file()
    else:
        handle_invalid_path_or_no_such_file()

I want to see actual, real code used in production, not made up code 
snippets, that demonstrate that this is a useful distinction to make.

Until such time that somebody shows me an actual real-world use-case for 
wanting to make this distinction for NULs and NULs alone, I call bullshit.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 10:13 PM, Steven D'Aprano
 wrote:
> On Thu, 07 Jun 2018 19:47:03 +1000, Chris Angelico wrote:
>
>> To be fair, it's somewhat unideal behaviour - I would prefer to see an
>> HTTP 500 come back if the server crashes - but I can't see that that's a
>> security problem.
>
> You think that being able to remotely crash a webserver isn't a security
> issue?
>
>
> If Denial Of Service isn't a security issue in your eyes, what would it
> take? "Armed men burst into your house and shoot you"?
>
> *only half a wink*
>

By "crash" I mean that the request handler popped out an exception.
The correct behaviour is to send back a 500 and go back to handling
requests; with the extremely simple server given in that example, it
fails to send back the 500, but it DOES go back to handling requests.
So it's not a DOS. In any real server environment, this wouldn't have
any significant impact; even in this trivially simple server, the only
way you could hurt the server is by spamming enough of these that it
runs out of file handles for sockets or something.

ChrisA


Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 22:46:09 +1000, Chris Angelico wrote:

>> I wonder how many publicly facing web servers can be induced to either
>> crash, or serve the wrong content, this way?
>>
>>
> Define "serve the wrong content". You could get the exact same content
> by asking for "te" instead of "te%00st.html"; 

Perhaps so, but maybe you can bypass access controls to te and get access 
to it even though it is supposed to be private.

This is a real vulnerability, called null-byte injection.

One component of the system sees a piece of input, truncates it at the 
NULL, and validates the truncated input; then another component acts on 
the untruncated (and unvalidated) input.
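
The mismatch can be sketched in a few lines. Everything here is hypothetical (the allow-list, the function names, the payload); it only illustrates a check that looks at the truncated view while the rest of the system sees the whole string.

```python
ALLOWED = {"public.txt"}

def c_style_view(name):
    """What a NUL-terminated (C string) consumer would see."""
    return name.split("\0", 1)[0]

def naive_is_allowed(name):
    # Flawed: validates only the part before the first NUL.
    return c_style_view(name) in ALLOWED

def strict_is_allowed(name):
    # Rejects any NUL outright, then validates the whole string.
    return "\0" not in name and name in ALLOWED

payload = "public.txt\0/../secret.txt"
```

Here naive_is_allowed(payload) is True even though the full string is not on the allow-list, while strict_is_allowed(payload) is False.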

https://resources.infosecinstitute.com/null-byte-injection-php/

https://capec.mitre.org/data/definitions/52.html

Null-byte injection attacks have led to remote attackers executing 
arbitrary code. That's unlikely in this scenario, but given that most web 
servers are written in C, not Python, it is conceivable that they could 
do anything under a null-byte injection attack.

Does the Python web server suffer from that vulnerability? I would be 
surprised if it did. But it can be induced to crash (an exception, not a 
seg fault) which is certainly a vulnerability.

Since people are unlikely to use this web server to serve mission 
critical public services over the internet, the severity is likely low. 
Nevertheless, it is still a real vulnerability.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 11:09 PM, Steven D'Aprano
 wrote:
> On Thu, 07 Jun 2018 22:46:09 +1000, Chris Angelico wrote:
>
>>> I wonder how many publicly facing web servers can be induced to either
>>> crash, or serve the wrong content, this way?
>>>
>>>
>> Define "serve the wrong content". You could get the exact same content
>> by asking for "te" instead of "te%00st.html";
>
> Perhaps so, but maybe you can bypass access controls to te and get access
> to it even though it is supposed to be private.
>
> This is a real vulnerability, called null-byte injection.
>
> One component of the system sees a piece of input, truncates it at the
> NULL, and validates the truncated input; then another component acts on
> the untruncated (and unvalidated) input.
>
> https://resources.infosecinstitute.com/null-byte-injection-php/
>
> https://capec.mitre.org/data/definitions/52.html
>
> Null-byte injection attacks have led to remote attackers executing
> arbitrary code. That's unlikely in this scenario, but given that most web
> servers are written in C, not Python, it is conceivable that they could
> do anything under a null-byte injection attack.

Fair point. So you should just truncate early and have done with it. Easy.

> Does the Python web server suffer from that vulnerability? I would be
> surprised if it did. But it can be induced to crash (an exception, not a
> seg fault) which is certainly a vulnerability.

"Certainly"? I'm dubious on that. This isn't C, where a segfault
usually comes after executing duff memory, and therefore it's
plausible to transform a segfault into a remote code execution
exploit. This is Python, where we have EXCEPTION handling. Tell me, is
this a vulnerability?

@app.route("/foo")
def foo():
    return "Kaboom", 500

What about this?

@app.route("/bar")
def bar():
    1/0
    return "Won't get here"

Put those into a Flask app and see what they do. One of them will
explicitly return a 500. The other will crash... and will return a
500. Is either of those a security problem? Now let's suppose a more
realistic version of the latter:

@app.route("/paginate/<int:size>")
def paginate(size):
    total_pages = total_data/size
    ...

Yes, it's a bug. If someone tries a page size of zero, it'll divide by
zero and bomb. Great. But how is it a vulnerability? It is a
properly-handled exception.
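
What "properly handled" means here can be sketched without any framework. The dispatch function below is a hypothetical stand-in, not Flask's actual code: it calls a handler and turns any uncaught exception into a 500 response, after which the server would carry on with the next request.

```python
def dispatch(handler, *args):
    """Call a request handler; turn any uncaught exception into a 500."""
    try:
        return handler(*args)
    except Exception:
        # A real framework would also log the traceback here.
        return "Internal Server Error", 500

def paginate(size, total_data=100):
    total_pages = total_data / size   # ZeroDivisionError when size == 0
    return "%d pages" % total_pages, 200
```

With these definitions, dispatch(paginate, 10) returns a normal response and dispatch(paginate, 0) returns the 500 pair instead of propagating the exception.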

It's slightly different with SimpleHTTPServer, as it fails to properly
send back the 500. That would be a bug IMO. Even then, though, all you
can do is clog the server with unfinished requests - and you can do
that much more easily by just connecting and being really slow to send
data. (And I doubt that people are using SimpleHTTPServer in
security-sensitive contexts anyway.)

ChrisA


Python web server weirdness

2018-06-07 Thread Steven D'Aprano
I'm following the instructions here:

https://docs.python.org/3/library/http.server.html


and running this from the command line as a regular unprivileged user:

python3.5 -m http.server 8000

What I expected was a directory listing of my current directory.

What I got was Livejournal's front page.

W.T.F.???


-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Why exception from os.path.exists()?

2018-06-07 Thread Antoon Pardon
On 07-06-18 14:47, Steven D'Aprano wrote:
> On Thu, 07 Jun 2018 10:04:53 +0200, Antoon Pardon wrote:
>
>> On 07-06-18 05:55, Steven D'Aprano wrote:
>>> Python strings are rich objects which support the Unicode code point \0
>>> in them. The limitation of the Linux kernel that it relies on NULL-
>>> terminated byte strings is irrelevant to the question of what
>>> os.path.exists ought to do when given a path containing NUL. Other
>>> invalid path names return False.
>> It is not irrelevant. It makes the distinction clear between possible
>> values and impossible values. 
> That is simply wrong. It is wrong in principle, and it is wrong in 
> practice, for reasons already covered to death in this thread.
>
> It is *wrong in practice* because other impossible values don't raise 
> ValueError, they simply return False:
>
> - illegal pathnames under Windows, those containing special 
>   characters like ? > < * etc, simply return False;
>
> - even on Linux, illegal pathnames like "" (the empty string)
>   return False;
>
> - invalid pathnames with too many path components, or too many
>   characters in a single component, simply return False;
>
> - the os.path.exists() function is not documented as making 
>   a three-way split between "exists, doesn't exist and invalid";

So? Maybe we should reconsider the above behaviour?

>
> - and it isn't even true to say that NULL is illegal in pathnames:
>   there are at least five file systems that allow either NUL bytes:
>   FAT-8, MFS, HFS, or Unicode \0 code points: HFS Plus and Apple
>   File System.

That doesn't matter much. sqrt(-1) gives a ValueError, while there
are number domains for which it has a value.


> And it is *wrong in principle* because in the most general case, there is 
> no way to tell which pathnames are valid or invalid without querying an 
> actual file system. In the case of Linux, any directory could be used as 
> a mount point.

I don't see how your first statement follows from that explanation. I don't
have a problem with needing to query the actual file system in order to find
out which pathnames are valid or invalid.

> Have you ever actually used this feature? When was the last time you?

This is irrelevant. You are now trying to argue the uselessness. The fact that
after consideration something turns out not very useful, is not a reason
to conclude that the factors that were taken into consideration were irrelevant.

Personally I don't use os.path.exists because it tries to shoehorn too many
possibilities into a boolean result. Do you think os.stat("\0") should
raise FileNotFoundError?
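
To make the "exists, doesn't exist, invalid" split concrete, here is a
minimal sketch built on os.stat. The helper name path_status and its four
labels are made up for illustration; they are not part of the standard
library. It relies on os.stat raising ValueError for an embedded NUL and
FileNotFoundError for a missing entry.

```python
import os


def path_status(path):
    """Classify a path as 'exists', 'missing', 'invalid' or 'unknown'.

    Hypothetical helper: the name and the labels are illustrative only.
    """
    try:
        os.stat(path)
    except ValueError:
        # Raised before any system call is made, e.g. for an embedded "\0".
        return "invalid"
    except FileNotFoundError:
        return "missing"
    except OSError:
        # Anything else the OS reports: permission denied, name too long, ...
        return "unknown"
    return "exists"
```

A caller who only wants a boolean can still write
path_status(p) == "exists", while a caller who cares about the difference
is no longer forced to guess.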

-- 
Antoon.




Re: Python web server weirdness

2018-06-07 Thread Grant Edwards
On 2018-06-07, Steven D'Aprano  wrote:
> I'm following the instructions here:
>
> https://docs.python.org/3/library/http.server.html
>
>
> and running this from the command line as a regular unprivileged user:
>
> python3.5 -m http.server 8000
>
> What I expected was a directory listing of my current directory.
>
> What I got was Livejournal's front page.

Looking into the crystal ball and guessing that "got" means you
pointed a browser at "http://localhost:8000/"...

Do you have a file named "index.html" in your home directory?

https://docs.python.org/3/library/http.server.html

  If the request was mapped to a directory, the directory is checked
  for a file named index.html or index.htm (in that order). If found,
  the file’s contents are returned;

-- 
Grant Edwards               grant.b.edwards        Yow! I want another
                                  at               RE-WRITE on my CEASAR
                              gmail.com            SALAD!!



Re: Python web server weirdness

2018-06-07 Thread Grant Edwards
On 2018-06-07, Steven D'Aprano  wrote:
> I'm following the instructions here:
>
> https://docs.python.org/3/library/http.server.html
>
>
> and running this from the command line as a regular unprivileged user:
>
> python3.5 -m http.server 8000
>
> What I expected was a directory listing of my current directory.
>
> What I got was Livejournal's front page.

That's very odd. What I get is the message below:

$ python3.5 -m http.server 8000
Serving HTTP on 0.0.0.0 port 8000 ...

-- 
Grant Edwards               grant.b.edwards        Yow! I invented skydiving
                                  at               in 1989!
                              gmail.com



Re: Why exception from os.path.exists()?

2018-06-07 Thread Tim Chase
On 2018-06-07 22:46, Chris Angelico wrote:
> On Thu, Jun 7, 2018 at 10:18 PM, Steven D'Aprano
>    3. http://localhost:8000/te%00st.html
> >>> Actually, I couldn't even get Chrome to make that request, so it
> >>> obviously was considered by the browser to be invalid.  

It doesn't matter whether Chrome or Firefox can make the request if
it can be made by opening the socket yourself with something as
simple as

  $ telnet example.com 80
  GET /te%00st.html HTTP/1.1
  Host: example.com

If that crashes the server, it's a problem, even if browsers try to
prevent it from happening by accident.

>> It works in Firefox, but Apache truncates the URL:
>>
>> Not Found
>> The requested URL /te was not found on this server.
>>
>> instead of te%00st.html

This is a sensible result, left up to each server to decide what to
do.

>> I wonder how many publicly facing web servers can be induced to
>> either crash, or serve the wrong content, this way?

I'm sure there are plenty. I mean, I discovered this a while back

https://mail.python.org/pipermail/python-list/2016-August/713373.html

and that's Microsoft running their own stack.  They seem to have
fixed that issue at that particular set of URLs, but a little probing
has turned it up elsewhere at microsoft.com since (for the record,
the first set of non-existent URLs return 404-not-found errors while
the second set of reserved filename URLs return
500-Server-Internal-Error pages).  Filename processing is full of
sharp edge-cases.

> Define "serve the wrong content". You could get the exact same
> content by asking for "te" instead of "te%00st.html"; what you've
> done is not significantly different from this:
> 
> http://localhost:8000/te?st.html
> 
> Is that a security problem too?

Depending on the server, it might allow injection for something like

 http://example.com/page%00cat+/etc/passwd

Or it might allow the request to be processed in an attack, but leave
the log files without the details:

 GET /innocent%00malicious_payload
 (where only the "/innocent" gets logged)

Or false data could get injected in log files

 http://example.com/innocent%00%0a23.200.89.180+-+-+%5b07/Jun/2018%3a13%3a55%3a36+-0700%5d+%22GET+/nasty_porn.mov+HTTP/1.0%22+200+2326

(`host whitehouse.gov` = 23.200.89.180)

It all depends on the server and how the request is handled.
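
A small sketch of why these requests are awkward, using only
urllib.parse.unquote from the standard library. The variable names are
mine, and the "split at NUL" line merely imitates what NUL-terminated C
string handling would see; it is not how any particular server is
implemented.

```python
from urllib.parse import unquote

# Percent-decoding turns %00 into a real NUL character.
decoded = unquote("/te%00st.html")

# Code that treats the decoded path as a NUL-terminated C string only
# sees the part before the NUL, which matches the "/te" Apache reported.
seen_by_c_code = decoded.split("\0", 1)[0]

# %0a decodes to a newline, which is what makes log injection possible
# if the decoded path is written to a log file unescaped.
log_path = unquote("/innocent%00%0afake+log+line")
```

Whether any of this is exploitable still depends on what the server does
with the decoded string afterwards.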

-tkc






Re: Python web server weirdness SOLVED

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 13:32:10 +0000, Steven D'Aprano wrote:

[...]
> python3.5 -m http.server 8000
> 
> What I expected was a directory listing of my current directory.
> 
> What I got was Livejournal's front page.

Never mind -- it turned out I had an "index.html" file in the directory 
which had been wget'ed from LiveJournal. When I deleted that, it worked 
as expected.
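
For anyone who wants to reproduce the effect, here is a minimal sketch. It
assumes Python 3.7 or later (for the handler's directory argument and
ThreadingHTTPServer, neither of which exists in 3.5), and fetch_root is a
made-up helper name. It serves a temporary directory on an ephemeral
localhost port, fetches "/" once without and once with an index.html in
place.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading


def fetch_root(directory):
    """Serve `directory` on a free localhost port and return the body of GET /."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    with http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler) as server:
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            conn = http.client.HTTPConnection(
                "127.0.0.1", server.server_address[1], timeout=5
            )
            conn.request("GET", "/")
            body = conn.getresponse().read().decode("utf-8", "replace")
            conn.close()
            return body
        finally:
            server.shutdown()
            thread.join()


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "notes.txt").write_text("hello")
    listing = fetch_root(root)  # no index.html yet: generated listing

    (root / "index.html").write_text("<h1>Not a listing</h1>")
    page = fetch_root(root)     # index.html present: its contents are served
```

The first response is the generated directory listing; the second is
whatever index.html happens to contain, which in my case was LiveJournal.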




-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Python web server weirdness

2018-06-07 Thread Ed Kellett
On 2018-06-07 14:32, Steven D'Aprano wrote:
> I'm following the instructions here:
> 
> https://docs.python.org/3/library/http.server.html
> 
> 
> and running this from the command line as a regular unprivileged user:
> 
> python3.5 -m http.server 8000
> 
> What I expected was a directory listing of my current directory.
> 
> What I got was Livejournal's front page.
> 
> W.T.F.???
> 
> 

Do you have LiveJournal's index.html in your current directory?





Re: Python web server weirdness

2018-06-07 Thread Tim Chase
On 2018-06-07 13:32, Steven D'Aprano wrote:
> I'm following the instructions here:
> 
> https://docs.python.org/3/library/http.server.html
> 
> and running this from the command line as a regular unprivileged
> user:
> 
> python3.5 -m http.server 8000
> 
> What I expected was a directory listing of my current directory.
> 
> What I got was Livejournal's front page.

A couple things to check:

1) you don't mention which URL you pointed your browser at.  I
*presume* it was http://localhost:8000 but without confirmation, it's
hard to tell.  Also, you don't mention if you had anything in the
{path} portion of the URL such as
"http://localhost:8000/livejournal_homepage.html"

2) you don't mention whether your command succeeded with "Serving
HTTP on 0.0.0.0 port 8000" or if it failed because perhaps something
else was listening on that port ("OSError: [Errno 98] Address already
in use").

3) when your browser made the request to that localhost URL, did that
command produce output logging the incoming requests?

4) do you have any funky redirection for localhost in your /etc/hosts
file (or corresponding file location on Windows)

-tkc





Re: Python web server weirdness

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 13:32:10 +0000, Steven D'Aprano wrote:

> python3.5 -m http.server 8000
[...]


Thank you to everyone who responded, pointing out that I should check for 
an index.html file. That was exactly the problem.

And yes, I acknowledge that my original post was lacking in some 
necessary debugging detail. Mea culpa.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Problem finding my folder via terminal

2018-06-07 Thread T Berger
On Wednesday, June 6, 2018 at 12:19:35 PM UTC-4, T Berger wrote:
> I’m learning Python on my own and have been stuck for two days trying to get 
> modules I created into site-packages. As a trial step, we were asked to 
> change directly into the folder containing our modules. I typed “cd 
> mymodules” per instructions, but got this error message: “-bash: cd: 
> mymodules: No such file or directory.” I saved mymodules to my documents. 
> What is going wrong here?
> 
> When I tried to create a distribution file, I typed “192:~ TamaraB$ 
> mymodules$ python3 setup.py sdist.” I got this error message: “-bash: 
> mymodules$: command not found.” What should I do?

~
To answer your questions in order:
  “Who asked you to do that?”
I’m teaching myself python with the book, Head First Python. In the exercise 
I’m having trouble with, we’re supposed to install a module we created into 
site-packages. 
 “We'll need some more information about the computer you are using: what OS 
are you using (Mac, Linux, Windows, something else), what shell are you using, 
perhaps a file listing of your home directory. “
I’m using Terminal in Mac Sierra (10.12.6).
“(I'm not sure what the 192 part means. Does that increase each time you type a 
command?) “
I'm new to Terminal, but that 192 looked weird to me too. It doesn’t increase, 
just stays at 192. There is also a thin gray left bracket in front of the “192” 
which didn’t copy into my email. Is there some way to restore the default 
prompt in Terminal? What is the default prompt?
Back to my problem. Your email helped me get into the mymodules folder, but I’m 
still stuck at the next step of the exercise I’m working on, which is to 
get the module I created into site-packages. The folder, mymodules, 
contains three files: the module we created, a setup file (setup.py), and a 
readme file. The line of text we were instructed to type into our terminal was: 
“python3 setup.py sdist.” In response, I got this error message: 
“/Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python:
 can't open file 'setup.py': [Errno 2] No such file or directory”.  What is 
going wrong here? 
Thanks to any replies


Re: Problem finding my folder via terminal

2018-06-07 Thread T Berger
On Wednesday, June 6, 2018 at 12:19:35 PM UTC-4, T Berger wrote:
> I’m learning Python on my own and have been stuck for two days trying to get 
> modules I created into site-packages. As a trial step, we were asked to 
> change directly into the folder containing our modules. I typed “cd 
> mymodules” per instructions, but got this error message: “-bash: cd: 
> mymodules: No such file or directory.” I saved mymodules to my documents. 
> What is going wrong here?
> 
> When I tried to create a distribution file, I typed “192:~ TamaraB$ 
> mymodules$ python3 setup.py sdist.” I got this error message: “-bash: 
> mymodules$: command not found.” What should I do?


To answer your questions in order: 

  “Who asked you to do that?” 

I’m teaching myself python with the book, Head First Python. In the exercise 
I’m having trouble with, we’re supposed to install a module we created into 
site-packages. 

 “We'll need some more information about the computer you are using: what OS 
are you using (Mac, Linux, Windows, something else), what shell are you using, 
perhaps a file listing of your home directory. “ 

I’m using Terminal in Mac Sierra (10.12.6). 

“(I'm not sure what the 192 part means. Does that increase each time you type a 
command?) “ 

I'm new to Terminal, but that 192 looked weird to me too. It doesn’t increase, 
just stays at 192. There is also a thin gray left bracket in front of the “192” 
which didn’t copy into my email. Is there some way to restore the default 
prompt in Terminal (and what is the default prompt)? 

Back to my problem. Your email helped me get into the mymodules folder, but I’m 
still stuck at the next step of the exercise, which is to get the module I 
created into site-packages. mymodules contains three files: the module we 
created, a setup file (setup.py), and a readme file. The line of text we were 
instructed to type into our terminal was: “python3 setup.py sdist.” In 
response, I got this error message: 
“/Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python:
 can't open file 'setup.py': [Errno 2] No such file or directory”.  

Why is this not working for me?

Thanks to any replies


FULLSCREEN and DOUBLEBUF

2018-06-07 Thread Paul St George
This is both a narrow question about some code and a more general 
question about syntax in Python


Using the Pygame modules, I want to set both FULLSCREEN and DOUBLEBUF

I can use
screen = 
pygame.display.set_mode((screen_width,screen_height),pygame.FULLSCREEN)

to set a FULLSCREEN display

Or, I can use
screen = 
pygame.display.set_mode((screen_width,screen_height),pygame.DOUBLEBUF)

to set DOUBLEBUF

But how do I set both FULLSCREEN and DOUBLEBUF?

And, how can I test or check that DOUBLEBUF is set?

--
Paul St George
http://www.paulstgeorge.com
http://www.devices-of-wonder.com



Re: Why exception from os.path.exists()?

2018-06-07 Thread MRAB

On 2018-06-07 08:45, Chris Angelico wrote:
> On Thu, Jun 7, 2018 at 1:55 PM, Steven D'Aprano wrote:
>> On Tue, 05 Jun 2018 23:27:16 +1000, Chris Angelico wrote:
>>
>>> And an ASCIIZ string cannot contain a byte value of zero. The parallel
>>> is exact.
>>
>> Why should we, as Python programmers, care one whit about ASCIIZ strings?
>> They're not relevant. You might as well say that file names cannot
>> contain the character "π" because ASCIIZ strings don't support it.
>>
>> No they don't, and yet nevertheless file names can and do contain
>> characters outside of the ASCIIZ range.
>
> Under Linux, a file name contains bytes, most commonly representing
> UTF-8 sequences. So... an ASCIIZ string *can* contain that character,
> or at least a representation of it. Yet it cannot contain "\0".

I've seen a variation of UTF-8 that encodes U+0000 as 2 bytes so that a 
zero byte can be used as a terminator.


It's therefore not impossible to have a version of Linux that allowed a 
(Unicode) "\0" in a filename.



Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Fri, Jun 8, 2018 at 3:10 AM, MRAB  wrote:
> On 2018-06-07 08:45, Chris Angelico wrote:
>> Under Linux, a file name contains bytes, most commonly representing
>> UTF-8 sequences. So... an ASCIIZ string *can* contain that character,
>> or at least a representation of it. Yet it cannot contain "\0".
>>
> I've seen a variation of UTF-8 that encodes U+0000 as 2 bytes so that a zero
> byte can be used as a terminator.
>
> It's therefore not impossible to have a version of Linux that allowed a
> (Unicode) "\0" in a filename.

Considering that Linux treats filenames as raw bytes, that's not
surprising. The mangled encoding you refer to is a horrendous cheat,
though, and violates several of the design principles of UTF-8, so I
do not recommend it EVER. The correct way for Python to handle and
represent such a file name would be to use the U+DCxx range to carry
the bytes through unchanged - not using "\0".
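
A minimal sketch of what "carry the bytes through" looks like, using the
surrogateescape error handler that Python applies to file names on POSIX.
The sample byte strings are invented for illustration; the second one uses
the two-byte C0 80 form of the mangled encoding mentioned above.

```python
# A file name in Latin-1 bytes, which is not valid UTF-8.
raw = b"caf\xe9.txt"

# Each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = name.encode("utf-8", "surrogateescape")

# The two-byte "NUL" of the mangled encoding is not valid UTF-8 either,
# so it is carried through as two surrogates rather than becoming "\0".
mangled = b"a\xc0\x80b"
mangled_name = mangled.decode("utf-8", "surrogateescape")
```

The round trip is lossless, and at no point does a "\0" appear in the
Python string.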

ChrisA


Re: FULLSCREEN and DOUBLEBUF

2018-06-07 Thread Chris Angelico
On Fri, Jun 8, 2018 at 3:12 AM, Paul St George  wrote:
> This is both a narrow question about some code and a more general question
> about syntax in Python
>
> Using the Pygame modules, I want to set both FULLSCREEN and DOUBLEBUF
>
> I can use
> screen =
> pygame.display.set_mode((screen_width,screen_height),pygame.FULLSCREEN)
> to set a FULLSCREEN display
>
> Or, I can use
> screen =
> pygame.display.set_mode((screen_width,screen_height),pygame.DOUBLEBUF)
> to set DOUBLEBUF
>
> But how do I set both FULLSCREEN and DOUBLEBUF?
>
> And, how can I test or check that DOUBLEBUF is set?

This is definitely a pygame question. So let's grab the docos for that
set_mode function.

https://www.pygame.org/docs/ref/display.html#pygame.display.set_mode

You're passing two parameters in each of your examples. The first is a
tuple of (w,h) for the dimensions; the second is a constant for the
mode you want. The name of the second argument is "flags", according
to the documentation. That usually means that you can provide
multiple. The exact mechanism for combining flags is given in the last
paragraph of the docs. I'll let you take the credit for figuring out
the details yourself :)

ChrisA


logging with multiprocessing

2018-06-07 Thread jenil . desai25
Hello,

I am new to the logging module and want to use it with multiprocessing. 
Can anyone help me understand how to do that? Any help would be appreciated.

Thank you.



Re: FULLSCREEN and DOUBLEBUF

2018-06-07 Thread Mark Lawrence

On 07/06/18 18:12, Paul St George wrote:
> This is both a narrow question about some code and a more general 
> question about syntax in Python
>
> Using the Pygame modules, I want to set both FULLSCREEN and DOUBLEBUF
>
> I can use
> screen = 
> pygame.display.set_mode((screen_width,screen_height),pygame.FULLSCREEN)
>
> to set a FULLSCREEN display
>
> Or, I can use
> screen = 
> pygame.display.set_mode((screen_width,screen_height),pygame.DOUBLEBUF)
>
> to set DOUBLEBUF
>
> But how do I set both FULLSCREEN and DOUBLEBUF?
>
> And, how can I test or check that DOUBLEBUF is set?



Pure guesswork but how about:-

screen = pygame.display.set_mode((screen_width,screen_height), 
pygame.FULLSCREEN | pygame.DOUBLEBUF)
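
On the more general syntax question: flags like these are integers with
one bit each, so "|" combines them and "&" tests for one. Below is a
pure-Python sketch of the idea. DisplayFlag and its values are stand-ins
invented for illustration and are not pygame's real constants. With pygame
itself the analogous check would presumably be
screen.get_flags() & pygame.DOUBLEBUF, bearing in mind that the display
may not grant every flag that was requested.

```python
import enum


class DisplayFlag(enum.IntFlag):
    # Stand-in values for illustration; pygame's constants differ.
    FULLSCREEN = 0x1
    DOUBLEBUF = 0x2
    RESIZABLE = 0x4


# "|" sets several bits in one value.
requested = DisplayFlag.FULLSCREEN | DisplayFlag.DOUBLEBUF

# "&" masks out a single bit; the result is non-zero only if it was set.
has_doublebuf = bool(requested & DisplayFlag.DOUBLEBUF)
has_resizable = bool(requested & DisplayFlag.RESIZABLE)
```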


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Stefan's headers [was:Names and identifiers]

2018-06-07 Thread Peter Pearson
On Thu, 7 Jun 2018 01:23:31 +0000 (UTC), Steven D'Aprano wrote:
> Disclaimer: I do not see Stefan's original post. I recall that he has set 
> some sort of header on his posts which means they are not processed by 
> Gmane, but unfortunately I no longer have any of his posts in my cache 
> where I can check.
>
> If anyone else is getting Stefan's posts, can you inspect the full 
> headers and see if there is a relevant header?

Here's the full header, as received by slrn from news.individual.net:

Path: uni-berlin.de!not-for-mail
From: r...@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.python
Subject: Names and identifiers
Date: 6 Jun 2018 18:37:46 GMT
Organization: Stefan Ram
Lines: 26
Expires: 1 Aug 2018 11:59:58 GMT
Message-ID: 
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de y/fWNNOsUiR8l3NiIdt8QQKv4EmIpqY+4EFRjM4L9WcmK0
X-Copyright: (C) Copyright 2018 Stefan Ram. All rights reserved.
  Distribution through any means other than regular usenet channels is
  forbidden. It is forbidden to publish this article in the Web, to
  change URIs of this article into links, and to transfer the body
  without this notice, but quotations of parts in other Usenet posts
  are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is only set, because this prevents
  some services to mirror the article via the web (HTTP). But Stefan Ram
  hereby allows to keep this article within a Usenet archive server
  with only NNTP access without any time limitation.
X-No-Html: yes
Content-Language: en
Xref: uni-berlin.de comp.lang.python:794657


Re: Sorting NaNs

2018-06-07 Thread Peter Pearson
On Thu, 07 Jun 2018 19:02:42 +1200, Gregory Ewing wrote:
> Steven D'Aprano wrote:
>> But if it were (let's say) 1 ULP greater or less 
>> than one half, would we even know?
>
> In practice it's probably somewhat bigger than 1 ULP.
> A typical PRNG will first generate a 32-bit integer and
> then map it to a float, giving a resolution coarser than
> the 52 bits of an IEEE double.
>
> But even then, the probability of getting exactly 0.5
> is only 1/2^32, which you're not likely to notice.

But gosh, if there are only 2**32 different "random" floats, then
you'd have about a 50% chance of finding a collision among any
set of 2**16 samples.  Is that really tolerable?


Re: Stefan's headers [was:Names and identifiers]

2018-06-07 Thread Chris Angelico
On Fri, Jun 8, 2018 at 6:36 AM, Peter Pearson  wrote:
> Here's the full header, as received by slrn from news.individual.net:
>
> X-Copyright: (C) Copyright 2018 Stefan Ram. All rights reserved.
>   Distribution through any means other than regular usenet channels is
>   forbidden. It is forbidden to publish this article in the Web, to
>   change URIs of this article into links, and to transfer the body
>   without this notice, but quotations of parts in other Usenet posts
>   are allowed.
> X-No-Archive: Yes
> Archive: no
> X-No-Archive-Readme: "X-No-Archive" is only set, because this prevents
>   some services to mirror the article via the web (HTTP). But Stefan Ram
>   hereby allows to keep this article within a Usenet archive server
>   with only NNTP access without any time limitation.

Yeah, if I were a sysadmin carrying this kind of traffic, I'd just
block all those posts rather than risk any sort of legal liability.
Not worth any sort of risk. A simple ban is easy and effective, and
fully compliant with the copyright notice.

(I don't understand this paranoia about HTTP, frankly.)

ChrisA


Valid encodings for a Python source file

2018-06-07 Thread Daniel Glus
I'm trying to figure out the entire list of possible encodings for a Python
source file - that is, encodings that can go in a PEP 263 encoding
specification, like # -*- encoding: foo -*-.

Is this list the same as the list given in the documentation for the codecs
library, under "Standard Encodings"? If not, where can I find the actual
list?

(I know that list is the same as the set of unique values in CPython's
/Lib/encodings/aliases.py, or equivalently, the set of filenames in
/Lib/encodings/, but again I'm not sure.)
-Daniel
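
For comparison, here is a rough sketch of how the two candidate sets could
be computed from a running interpreter rather than from the source tree.
The variable names are mine, and the two sets are not guaranteed to match:
the sketch only lists the differences and does not settle which names are
actually accepted in a source-file declaration.

```python
import encodings
import encodings.aliases
import pkgutil

# Unique codec names that the alias table points at.
alias_targets = set(encodings.aliases.aliases.values())

# Module names found in the encodings package directory.
module_names = {info.name for info in pkgutil.iter_modules(encodings.__path__)}
module_names.discard("aliases")  # the alias table itself is not a codec

# Names that appear in one set but not the other.
only_in_aliases = sorted(alias_targets - module_names)
only_in_modules = sorted(module_names - alias_targets)
```

Printing only_in_aliases and only_in_modules shows where the two
definitions disagree on a given Python version.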


Re: Sorting NaNs

2018-06-07 Thread Michael Lamparski
On Thu, Jun 7, 2018 at 4:43 PM, Peter Pearson 
wrote:

> But gosh, if there are only 2**32 different "random" floats, then
> you'd have about a 50% chance of finding a collision among any
> set of 2**16 samples.  Is that really tolerable?

In any case, it's verifiably not true for CPython.

> >>> import random
> >>> def birthday(b):
> ...   rand = random.random
> ...   xs = [rand() for _ in range(2**b)]
> ...   return len(xs) - len(set(xs))
> ...
> >>> birthday(24)
> 0
> >>> [birthday(24) for _ in range(10)]
> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Michael


Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 15:38:39 -0400, Dennis Lee Bieber wrote:

> On Fri, 1 Jun 2018 23:16:32 +0000 (UTC), Steven D'Aprano
>  declaimed the following:
> 
>>It should either return False, or raise TypeError. Of the two, since
>>3.14159 cannot represent a file on any known OS, TypeError would be more
>>appropriate.
>>
>   I wouldn't be so sure of that...

I would.

There is no existing file system which uses floats instead of byte- or 
character-strings for file names. If you believe different, please name 
the file system.


> Xerox CP/V allowed for embedding
> non-printable characters into file names

Just like most modern file systems.

Even FAT-16 supports a range of non-ASCII bytes with the high-bit set 
(although not the control codes with the high-bit cleared). Unix file 
systems typically support any byte except \0 and /. Most modern file 
systems outside of Unix support any Unicode character (or almost any) 
including ASCII control characters.

https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits



[...]
>   With some work, one could probably generate a file name containing the
> bytes used for storing a floating point value.

Any collection of bytes can be interpreted as any thing we like. 
(Possibly requiring padding or splitting to fit fixed-width data 
structures.) Sounds. Bitmaps. Coordinates in three-dimensional space. 
Floating point numbers are no challenge. A Python float is represented by 
an eight-byte C double. Provided we agree on a convention for splitting 
byte strings into eight-byte chunks, adding padding, and agree on big- or 
little-endianness, it is trivial to convert file names to one or more 
floats:

/etc is equivalent to 2.2617901550715974e-80

(big endian, padding added to the right)

But just because I can do that conversion, doesn't mean that the file 
system uses floats for file names.
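
For the curious, the conversion can be sketched with the struct module.
The helper names are made up for illustration, and the convention is the
one stated above: big endian, NUL padding on the right, and (to keep it
short) names of at most eight bytes.

```python
import struct


def name_to_float(name):
    """Interpret a byte-string name of up to 8 bytes as a big-endian double."""
    padded = name.ljust(8, b"\0")      # pad on the right with NUL bytes
    (value,) = struct.unpack(">d", padded)
    return value


def float_to_name(value):
    """Inverse of name_to_float for names that do not end in NUL bytes."""
    return struct.pack(">d", value).rstrip(b"\0")


etc_as_float = name_to_float(b"/etc")
```

Going the other way, float_to_name(etc_as_float) gives back b"/etc", which
is the whole point: the bytes are the file name, and the float is just one
way of looking at them.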



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Django-hotsauce/ZODB 5.4.0/PyPy nightly sprint!!

2018-06-07 Thread Etienne Robillard
Yo people I'm doing a nightly hacking sprint for django-hotsauce on pypy 
and got some cool bugs I would like to share:


Traceback (most recent call last):
  File "/usr/local/bin/schevo", line 11, in <module>
    load_entry_point('libschevo', 'console_scripts', 'schevo')()
  File "/home/erob/src/libschevo/lib/schevo/script/command.py", line 32, in __call__
    return self.main(arg0, args)
  File "/home/erob/src/libschevo/lib/schevo/script/command.py", line 74, in main
    return command()(*args)
  File "/home/erob/src/libschevo/lib/schevo/script/command.py", line 32, in __call__
    return self.main(arg0, args)
  File "/home/erob/src/libschevo/lib/schevo/script/command.py", line 74, in main
    return command()(*args)
  File "/home/erob/src/libschevo/lib/schevo/script/command.py", line 32, in __call__
    return self.main(arg0, args)
  File "/home/erob/src/libschevo/lib/schevo/script/db_evolve.py", line 86, in main
    db = schevo.database.open(url)
  File "/home/erob/src/libschevo/lib/schevo/database.py", line 371, in open
    db = Database(backend)
  File "/home/erob/src/libschevo/lib/schevo/database2.py", line 95, in __init__
    self._update_extent_maps_by_name()
  File "/home/erob/src/libschevo/lib/schevo/database2.py", line 1633, in _update_extent_maps_by_name
    for extent in self._extent_maps_by_id.itervalues():
  File "/usr/local/lib/python2.7/dist-packages/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "/usr/local/lib/python2.7/dist-packages/ZODB/mvccadapter.py", line 143, in load
    r = self._storage.loadBefore(oid, self._start)
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/ClientStorage.py", line 520, in loadBefore
    return self._server.load_before(oid, tid)
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/client.py", line 783, in load_before
    return self.__call(self.client.load_before_threadsafe, oid, tid)
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/client.py", line 748, in call
    return self.wait_for_result(result, self.timeout)
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/client.py", line 756, in wait_for_result
    return future.result(timeout)
  File "/usr/local/lib/python2.7/dist-packages/futures-3.0.5-py2.7.egg/concurrent/futures/_base.py", line 405, in result
    return self.__get_result()
  File "/usr/local/lib/python2.7/dist-packages/futures-3.0.5-py2.7.egg/concurrent/futures/_base.py", line 357, in __get_result
    raise type(self._exception), self._exception, self._traceback
ZEO.Exceptions.ClientDisconnected: connection lost
erob@marina:/home/www/isotopesoftware.ca/trunk$


Not sure about this first one! :)

The command I'm trying to run is:

% schevo db evolve --app blogengine2 zodb://127.0.0.1:4545 31

The ZODB 5.4.0 server then produces the following traceback:

2018-06-07T21:14:55 INFO ZEO.asyncio.base Connected server protocol
--
2018-06-07T21:14:55 INFO ZEO.asyncio.server received handshake 'Z5'
--
2018-06-07T21:14:55 ERROR ZEO.asyncio.marshal can't decode message: '((ccopy_reg\n_reconstructor\n(czodbpickle\nbinary\nc__b...'
--
2018-06-07T21:14:55 ERROR ZEO.asyncio.server Can't deserialize message
Traceback (most recent call last):
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/server.py", line 89, in message_received
    message_id, async, name, args = self.decode(message)
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/marshal.py", line 114, in pickle_server_decode
    return unpickler.load() # msgid, flags, name, args
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/marshal.py", line 164, in server_find_global
    raise ImportError("import error %s: %s" % (module, msg))
ImportError: import error copy_reg:
--
2018-06-07T21:14:55 ERROR ZEO.asyncio.base data_received 4 0 True
Traceback (most recent call last):
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/base.py", line 128, in data_received
    self.message_received(collected)
  File "/home/erob/work/ZEO-5.1.0/src/ZEO/asyncio/server.py", line 94, in message_received
    if message_id == -1:
UnboundLocalError: local variable 'message_id' referenced before assignment
--
2018-06-07T21:14:55 INFO ZEO.StorageServer (127.0.0.1:4545) disconnected
--
2018-06-07T21:14:55 INFO ZEO.asyncio.base Connected server protocol
--
2018-06-07T21:14:55 INFO ZEO.asyncio.server received handshake 'Z5'
--
2018-06-07T21:14:55 INFO ZEO.StorageServer (127.0.0.1:4545) disconnected

Please hit me up if you know how to fix these errors! :)

I'm using PyPy 5.9 and 5.10 for dev and Python 2.7.13 for production 
with Cython bindings!



Cheers,

Etienne





Re: Sorting NaNs

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 20:43:10 +0000, Peter Pearson wrote:

> On Thu, 07 Jun 2018 19:02:42 +1200, Gregory Ewing wrote:
>> Steven D'Aprano wrote:
>>> But if it were (let's say) 1 ULP greater or less than one half, would
>>> we even know?
>>
>> In practice it's probably somewhat bigger than 1 ULP. A typical PRNG
>> will first generate a 32-bit integer and then map it to a float, giving
>> a resolution coarser than the 52 bits of an IEEE double.
>>
>> But even then, the probability of getting exactly 0.5 is only 1/2^32,
>> which you're not likely to notice.
> 
> But gosh, if there are only 2**32 different "random" floats, then you'd
> have about a 50% chance of finding a collision among any set of 2**16
> samples.  Is that really tolerable?

Why wouldn't it be? It would be shocking if a sufficiently large sequence 
of numbers contained no collisions at all: that would imply the values 
were very much NON random.

If you truly were limited to 2**32 different values (we're not), then it 
would be exactly right and proper to expect a collision in 2**16 samples. 
Actually, a little more than that for a 50% chance: more like 77000.

https://en.wikipedia.org/wiki/Birthday_problem

(For 2**60 values, we'd need about 1.2 billion samples.)
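
These figures come from the usual birthday-problem approximation, which
can be sketched as follows. The function names are mine, and the formulas
are the standard large-N approximation rather than an exact count.

```python
import math


def collision_probability(n_samples, n_values):
    """Approximate chance that n_samples uniform draws from n_values repeat."""
    return -math.expm1(-n_samples * (n_samples - 1) / (2 * n_values))


def samples_for_collision(n_values, p=0.5):
    """Approximate number of draws needed for a collision with probability p."""
    return math.sqrt(2 * n_values * math.log(1 / (1 - p)))


p_at_2_16 = collision_probability(2**16, 2**32)
n_for_32_bits = samples_for_collision(2**32)
n_for_60_bits = samples_for_collision(2**60)
```

With 2**32 possible values this gives roughly a 39% chance of a collision
at 2**16 samples and even odds at roughly 77,000 samples; with 2**60
values the even-odds point is somewhat over a billion samples.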

Greg's statement about "a typical PRNG" may or may not be true, depending 
on what counts as "typical". The standard C language PRNG is notoriously 
awful, with a tiny period and lots and lots of correlations between 
values. It's barely random-ish.

But like many other languages, Python uses a more modern random number 
generator, the Mersenne Twister, which passes a battery of statistical 
tests for randomness (including DieHard) and has a very long period of 
2**19937 - 1. I understand that Python's Mersenne Twister implementation 
is based on 64-bit ints.

https://en.wikipedia.org/wiki/Mersenne_Twister




-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Problem finding my folder via terminal

2018-06-07 Thread Cameron Simpson

Hi,

Replies inline below, which is the style we prefer on this list. (And to reply, 
please reply to the specific message, not your original post. This will let you 
pick up that branch of the conversation directly and not confuse your readers.)


On 07Jun2018 08:39, T Berger  wrote:

On Wednesday, June 6, 2018 at 12:19:35 PM UTC-4, T Berger wrote:

I’m learning Python on my own and have been stuck for two days trying to get 
modules I created into site-packages. As a trial step, we were asked to change 
directly into the folder containing our modules. I typed “cd mymodules” per 
instructions, but got this error message: “-bash: cd: mymodules: No such file 
or directory.” I saved mymodules to my documents. What is going wrong here?

When I tried to create a distribution file, I typed “192:~ TamaraB$ mymodules$ 
python3 setup.py sdist.” I got this error message: “-bash: mymodules$: command 
not found.” What should I do?

[...snip...]
“We'll need some more information about the computer you are using: what OS 
are you using (Mac, Linux, Windows, something else), what shell are you 
using, perhaps a file listing of your home directory. “


I’m using Terminal in Mac Sierra (10.12.6).


Cool.


“(I'm not sure what the 192 part means. Does that increase each time you type a 
command?) “

I'm new to Terminal, but that 192 looked weird to me too. It doesn’t increase, 
just stays at 192. There is also a thin gray left bracket in front of the “192” 
which didn’t copy into my email. Is there some way to restore the default 
prompt in Terminal (and what is the default prompt)?


On a Mac, it tends to be like this: "{hostname}:~ {username}$ " where 
{hostname} is your Mac's name and {username} is your login name; that is called 
the "shell prompt", and "the shell" is the command line interpreter running the 
commands you type. On a Mac, this is usually bash, a UNIX Bourne shell.


There is a secondary prompt like this "> ". That indicates that you're typing a 
compound command, or at least that the shell believes you're typing a compound 
command, which is just a command which extends to more than one line. The 
common way to confuse the shell about this is to forget to close a quote - the 
shell expects that string to continue until it sees a closing quote.


You can leave the secondary prompt by typing Control-C (often denoted "^C").  
That will cancel the incomplete command and get you back to a clean empty 
primary prompt.


Note that if you start some interactive command, such as the interactive Python 
interpreter, you will then be dealing with _its_ prompts until you leave that 
command.



Back to my problem. Your email helped me get into the mymodules folder, but I’m 
still stuck at the next step of the exercise, which is to get the module I 
created into site-packages. mymodules contains three files: the module we 
created, a setup file (setup.py), and a readme file. The line of text we were 
instructed to type into our terminal was: “python3 setup.py sdist.” In 
response, I got this error message: 
“/Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python:
 can't open file 'setup.py': [Errno 2] No such file or directory”.

Why is this not working for me?


I would expect that your shell was not actually in the "mymodules" directory 
when you typed "python3 setup.py sdist". Usually your shell prompt includes the 
current working directory (the "~" in my example above, which is your home 
directory), which is a useful contextual clue.


You can also find out your current working directory by running the "pwd" 
command (the "print working directory" command).


The "ls" (list) command without arguments will list what is in the current 
directory, so you can now check (a) whether you're where you thought you were, 
and (b) what is in the current directory (in case it doesn't contain what you 
expected).


The "ls -la" command will provide a longer and more detailed listing too.
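
The same checks can also be made from inside Python, which is handy if you
are already at the interpreter prompt. This is only a sketch; "setup.py" is
the file name from your exercise.

```python
import os

here = os.getcwd()                  # same information as "pwd"
print("current directory:", here)

names = sorted(os.listdir(here))    # roughly what "ls -a" shows, minus "." and ".."
for name in names:
    print(name)

# python3 can only open setup.py by that bare name if it is in the
# current directory
print("setup.py present:", os.path.isfile(os.path.join(here, "setup.py")))
```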

Let us know what you find out.

Cheers,
Cameron Simpson 


Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 23:25:54 +1000, Chris Angelico wrote:

[...]
>> Does the Python web server suffer from that vulnerability? I would be
>> surprised if it were. But it can be induced to crash (an exception, not
>> a seg fault) which is certainly a vulnerability.
> 
> "Certainly"? I'm dubious on that. This isn't C, where a segfault usually
> comes after executing duff memory, and therefore it's plausible to
> transform a segfault into a remote code execution exploit.

I just said that I would be surprised if you could get remote code 
execution from the Python web server, for exactly the reason you state: 
it's an exception, not a segfault.

Stop agreeing with me when we're trying to have an argument! *wink*


[...]
> Yes, it's a bug. If someone tries a page size of zero, it'll divide by
> zero and bomb. Great. But how is it a vulnerability? It is a
> properly-handled exception.

Causing a denial of service is a vulnerability.

Security vulnerabilities are not just about remote code execution. Can 
remote attackers bring your service down? If so, you are vulnerable to 
having remote attackers bring your service down.

Can remote attackers overwhelm your server with so many errors that they 
fill your disks with error logs and either stop logging, or crash? Then 
you are vulnerable to having remote attackers crash your server, or hide 
their tracks by preventing logging.

Can remote attackers induce your server to serve files it shouldn't? Then 
you are vulnerable to attacks that leak sensitive or private information.

There's far more to security vulnerabilities than just "oh well, they 
can't get a shell or execute code on my server, so it's all cool" *wink*


In this specific case:

> It's slightly different with SimpleHTTPServer, as it fails to properly
> send back the 500. That would be a bug IMO. 

There seems to be some weird interaction occurring on my system between 
the SimpleHTTPServer, Firefox, and my web proxy, so I may have 
misinterpreted the precise nature of the crash. What I initially saw was 
that although the SimpleHTTPServer remained running, it stopped responding 
to requests and Firefox would repeatedly respond:

Firefox can't find the server at www.localhost.com

even though the process was still running. But when I tried with a 
different browser (links), I didn't get that same behaviour. links is 
using the web proxy, Firefox isn't, but I'm not quite sure why that makes 
a difference.

> Even then, though, all you
> can do is clog the server with unfinished requests - and you can do that
> much more easily by just connecting and being really slow to send data.
> (And I doubt that people are using SimpleHTTPServer in
> security-sensitive contexts anyway.)

Again, you're just repeating what I said in different words. I already 
said that *this specific* issue is probably low severity, because people 
are unlikely to use SimpleHTTPServer for mission critical services 
exposed to the internet. 



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 17:45:06 +1000, Chris Angelico wrote:

> On Thu, Jun 7, 2018 at 1:55 PM, Steven D'Aprano
>  wrote:
>> On Tue, 05 Jun 2018 23:27:16 +1000, Chris Angelico wrote:
>>
>>> And an ASCIIZ string cannot contain a byte value of zero. The parallel
>>> is exact.
>>
>> Why should we, as Python programmers, care one whit about ASCIIZ
>> strings? They're not relevant. You might as well say that file names
>> cannot contain the character "π" because ASCIIZ strings don't support
>> it.
>>
>> No they don't, and yet nevertheless file names can and do contain
>> characters outside of the ASCIIZ range.
> 
> Under Linux, a file name contains bytes, most commonly representing
> UTF-8 sequences.

The fact that user-space applications like the shell and GUI file 
managers sometimes treat file names as UTF-8 Unicode is not really 
relevant to what the file system allows. The most common Linux file 
systems are fundamentally bytes, not Unicode characters, and while I'm 
willing to agree to call the byte 0x41 "A", there simply is no such byte 
that means "π" or U+10902 PHOENICIAN LETTER GAML.

File names under typical Linux file systems are not necessarily valid 
UTF-8 Unicode. That's why Python still provides a bytes-interface as well 
as a text interface.


> So... an ASCIIZ string *can* contain that character, or
> at least a representation of it. Yet it cannot contain "\0".

You keep saying that as if it made one whit of difference to what 
os.path.exists should do. I completely agree that ASCIIZ strings cannot 
contain NUL bytes. What does that have to do with os.path.exists()?

NTFS file systems use UTF-16 encoded strings. For typical mostly-ASCII 
pathnames, the bytes on disk are *full* of NUL bytes. If the 
implementation detail that ASCIIZ strings cannot contain NUL is important 
to you, it should be equally important that UTF-16 strings typically have 
many NULs.

They're actually both equally implementation details and utterly 
irrelevant to the behaviour of os.path.exists.
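
To see the NUL bytes for yourself, here is a small sketch. It assumes
little-endian UTF-16, which is what I would expect on NTFS, and the sample
file name is arbitrary.

```python
name = "README.txt"

utf8 = name.encode("utf-8")
utf16 = name.encode("utf-16-le")   # little-endian UTF-16, no byte order mark

print(utf8)               # no zero bytes for plain ASCII text
print(utf16)              # every other byte is zero
print(utf16.count(0))     # one NUL byte per ASCII character
```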



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Why exception from os.path.exists()?

2018-06-07 Thread Chris Angelico
On Fri, Jun 8, 2018 at 12:16 PM, Steven D'Aprano
 wrote:
> On Thu, 07 Jun 2018 23:25:54 +1000, Chris Angelico wrote:
>> Yes, it's a bug. If someone tries a page size of zero, it'll divide by
>> zero and bomb. Great. But how is it a vulnerability? It is a
>> properly-handled exception.
>
> Causing a denial of service is a vulnerability.

Yes, but remember, anyone can build a botnet and send large numbers of
entirely legitimate requests to your server. Since no server has
infinite capacity, a DOS is inherently unavoidable. So to call
something a "DOS vulnerability", you have to show that it makes you
*more vulnerable* than simply getting overloaded with requests. For
example:

1) If the kernel allocates resources for half-open socket connections,
a malicious client can SYN-flood the server, causing massive resource
usage from relatively few packets.

2) If the language can be induced to build a hashtable using values
that all have the same hash, the CPU load required for the O(n²)
operations can easily exceed the cost of making the requests.

3) If the app inefficiently performs many database transactions for a
simple request, a plausible number of such requests could slow the
database to a crawl.

4) If a small request results in an inordinately large response, the
server's outgoing bandwidth can be saturated by a small number of
requests.
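
To make point 2 concrete, here is a toy sketch: a made-up class (BadKey,
purely for illustration) whose instances all share one hash value, so every
insertion into a dict has to be compared against the keys already there.

```python
class BadKey:
    """Hypothetical key type: every instance hashes to the same value."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42                      # deliberate worst case

    def __eq__(self, other):
        BadKey.comparisons += 1
        return self.value == other.value

def count_comparisons(n):
    """Insert n distinct colliding keys into a dict, return the __eq__ call count."""
    BadKey.comparisons = 0
    table = {}
    for i in range(n):
        table[BadKey(i)] = i
    return BadKey.comparisons

for n in (100, 200, 400):
    print(n, count_comparisons(n))     # grows roughly fourfold per doubling
```

Real attacks of this kind rely on finding colliding strings rather than a
custom class, but the quadratic growth is the same idea.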

Where in this is a simple HTTP 500 from the os.stat() call worse than
a legitimate request for an actual page?

The response is small (far smaller than many legit files - consider a
web app with a large JavaScript bundle, easily multiple megabytes). It
required zero disk operations, so it's as fast as returning a file
from cache. The only way it's more expensive is the actual exception
handling code itself, and if you reckon someone can DOS a server via
the cost of throwing and catching exceptions, I'm going to have to ask
for some serious measurements.

Apart from the one odd bug with SimpleHTTPServer not properly sending
back 500s, I very much doubt that the original concern - namely that
os.path.exists() and os.stat() raise ValueError if there's a %00 in
the URL - can be abused effectively.
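
For reference, the behaviour being argued about is easy to reproduce under
Python 3. A minimal sketch; the file name is invented, and what
os.path.exists() does with it depends on the Python version.

```python
import os

path = "index\0.html"   # what a URL containing %00 decodes to

try:
    os.stat(path)
    stat_raised = False
except ValueError as exc:
    stat_raised = True
    print("os.stat raised ValueError:", exc)

# Depending on the Python version, os.path.exists() either raises the
# same ValueError or quietly returns False.
try:
    print("os.path.exists returned", os.path.exists(path))
except ValueError as exc:
    print("os.path.exists raised ValueError:", exc)
```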

ChrisA


Re: Why exception from os.path.exists()?

2018-06-07 Thread Richard Damon
On 6/7/18 9:17 PM, Steven D'Aprano wrote:
> On Thu, 07 Jun 2018 15:38:39 -0400, Dennis Lee Bieber wrote:
>
>> On Fri, 1 Jun 2018 23:16:32 + (UTC), Steven D'Aprano
>>  declaimed the following:
>>
>>> It should either return False, or raise TypeError. Of the two, since
>>> 3.14159 cannot represent a file on any known OS, TypeError would be more
>>> appropriate.
>>>
>>  I wouldn't be so sure of that...
> I would.
>
> There is no existing file system which uses floats instead of byte- or 
> character-strings for file names. If you believe different, please name 
> the file system.
>
>
>> Xerox CP/V allowed for embedding
>> non-printable characters into file names
> Just like most modern file systems.
>
> Even FAT-16 supports a range of non-ASCII bytes with the high-bit set 
> (although not the control codes with the high-bit cleared). Unix file 
> systems typically support any byte except \0 and /. Most modern file 
> systems outside of Unix support any Unicode character (or almost any) 
> including ASCII control characters.
>
> https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
>
>
>
This does bring up an interesting point. Since the Unix file system
really has file names that are collections of bytes instead of really
being strings, and the Python API to it wants to treat them as strings,
then we have an issue that we are going to be stuck with problems with
filenames. If we assume they are utf-8 encoded, then there exist
filenames that will trap with invalid encodings (if for example the
name were generated on a system that was using Latin-1 as an 8 bit
character set for file names). On the other hand, if we treat the file
names as 8 bit characters by themselves, if the system was using utf-8
then we are mangling any characters outside the basic ASCII set.
Basically we hit the old problem of confusing bytes and strings.
Ultimately we have a fundamental limitation with trying to abstract out
the format of filenames in the API, and we need a back door to allow us
to define what encoding to use for filenames (and be able to detect that
it doesn't work for a given file, and change it on the fly to try
again), or we need an alternate API that lets us pass raw bytes as file
names and the program needs to know how to handle the raw filename for
that particular file system.
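
Both failure modes can be shown with plain bytes, without touching a real
file system. A small sketch; the file name is invented for illustration.

```python
# The same name, as a Latin-1 system and a UTF-8 system would store it.
latin1_name = "caf\u00e9.txt".encode("latin-1")   # b'caf\xe9.txt'
utf8_name = "caf\u00e9.txt".encode("utf-8")       # b'caf\xc3\xa9.txt'

# Assuming UTF-8 when the bytes are really Latin-1 traps:
try:
    latin1_name.decode("utf-8")
except UnicodeDecodeError as exc:
    print("invalid as UTF-8:", exc)

# Treating UTF-8 bytes as 8 bit characters silently mangles the name:
mangled = utf8_name.decode("latin-1")
print(ascii(mangled))                             # 'caf\xc3\xa9.txt'
```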

-- 
Richard Damon



Re: Valid encodings for a Python source file

2018-06-07 Thread Ben Finney via Python-list
Daniel Glus  writes:

> I'm trying to figure out the entire list of possible encodings for a Python
> source file - that is, encodings that can go in a PEP 263
>  encoding specification, like #
> -*- encoding: foo -*-.

What if the answer is not an enumerated set of encodings? That is, I am
pretty sure the set isn't specified, to allow the encoding to be
negotiated. Whatever the interpreter recognises as an encoding can be
the encoding of the source.
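
One way to ask a particular interpreter whether it recognises a name is
codecs.lookup(). A small sketch; the helper name is mine, and being a
registered codec is, as far as I know, a necessary condition rather than a
guarantee that the name works in a source file.

```python
import codecs

def is_known_encoding(name):
    """Return True if this interpreter has a codec registered under `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

for name in ("utf-8", "latin-1", "cp1252", "foo"):
    print(name, is_known_encoding(name))
```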

So, I guess that leads to the question: Why do you need it to be an
exhaustive set (rather than deliberately unspecified)? What are you
hoping to do with that information?

-- 
 \   “Good design adds value faster than it adds cost.” —Thomas C. |
  `\  Gale |
_o__)  |
Ben Finney



Re: Why exception from os.path.exists()?

2018-06-07 Thread Ben Finney
Richard Damon  writes:

> This does bring up an interesting point. Since the Unix file system
> really has file names that are collection of bytes instead of really
> being strings, and the Python API to it want to treat them as strings,
> then we have an issue that we are going to be stuck with problems with
> filenames.

I agree with the general statement “we are going to be stuck with
problems with filenames”; the world of filesystems is messy, which will
always cause problems.

With that said, I don't agree that “the Python API wants to treat
[file paths] as strings”. The ‘os’ module explicitly promises to treat
bytes as bytes, and text as text, in filesystem paths:

Note: All of these functions accept either only bytes or only string
objects as their parameters. The result is an object of the same
type, if a path or file name is returned.

<https://docs.python.org/3/library/os.path.html>
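
That promise is easy to check. A minimal sketch using os.path.join(); the
path components are arbitrary.

```python
import os.path

as_text = os.path.join("docs", "notes.txt")       # str in, str out
as_bytes = os.path.join(b"docs", b"notes.txt")    # bytes in, bytes out
print(type(as_text).__name__, type(as_bytes).__name__)

try:
    os.path.join("docs", b"notes.txt")            # mixing the two is refused
    mixed_ok = True
except TypeError as exc:
    mixed_ok = False
    print("TypeError:", exc)
```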

There is a *preference* for text, it's true. The opening paragraph
includes this:

Applications are encouraged to represent file names as (Unicode)
character strings.

That is immediately followed by more specific advice that says when to
use bytes:

Unfortunately, some file names may not be representable as strings
on Unix, so applications that need to support arbitrary file names
on Unix should use bytes objects to represent path names. Vice
versa, using bytes objects cannot represent all file names on
Windows (in the standard mbcs encoding), hence Windows applications
should use string objects to access all files.

(That needs IMO a correction, because as already explored in this
thread, it's not Unix or Windows that makes the distinction there. It's
the specific *filesystem type* which records either bytes or text, and
that is true no matter what operating system happens to be reading the
filesystem.)

> Ultimately we have a fundamental limitation with trying to abstract out
> the format of filenames in the API, and we need a back door to allow us
> to define what encoding to use for filenames (and be able to detect that
> it doesn't work for a given file, and change it on the fly to try
> again), or we need an alternate API that lets us pass raw bytes as file
> names and the program needs to know how to handle the raw filename for
> that particular file system.

Yes, I agree that there is an unresolved problem to explicitly declare
the encoding for filesystem paths on ext4 and other filesystems where
byte strings are used for filesystem paths.
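
For what it's worth, Python 3 does offer a partial workaround in the form of
the "surrogateescape" error handler (PEP 383), which os.fsdecode() and
os.fsencode() use on POSIX systems. It does not declare an encoding, but it
lets undecodable bytes survive a round trip through text. A sketch using the
codec machinery directly, so that it does not depend on the platform's
filesystem encoding; the file name is invented.

```python
raw = b"caf\xe9.txt"                      # Latin-1 bytes, not valid UTF-8

text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))                        # the bad byte becomes a lone surrogate

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", "surrogateescape")
print(restored == raw)
```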

-- 
 \   “Give a man a fish, and you'll feed him for a day; give him a |
  `\religion, and he'll starve to death while praying for a fish.” |
_o__)   —Anonymous |
Ben Finney



Re: Python web server weirdness SOLVED

2018-06-07 Thread Gregory Ewing

Steven D'Aprano wrote:
Never mind -- it turned out I had an "index.html" file in the directory 
which had been wget'ed from LiveJournal.


That's okay, then. The other possibility was that your computer
had been recruited into an evil botnet set up by LiveJournal
to create backup servers for their site...

--
Greg


Re: Sorting NaNs

2018-06-07 Thread Gregory Ewing

Michael Lamparski wrote:


In any case, it's verifiably not true for CPython.


Yes, CPython uses a particularly good PRNG. You may not be
as lucky using libraries that come with other languages.
A great many PRNG algorithms have been proposed, and a
good proportion of them produce 32-bit ints as their
basic output.

Of course, this isn't necessarily a problem, since
you can always stick two of them together if you need
more bits. But for the majority of purposes, 32 bits
is probably all you need anyway.
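
Sticking two of them together could look something like the sketch below,
which uses random.getrandbits(32) as a stand-in for a 32-bit generator. The
function name and the 27/26 bit split are illustrative choices, although as
far as I know CPython's own random() does something similar.

```python
import random

def random_53bit(rng=random):
    """Build a float in [0, 1) with 53 random bits from two 32-bit draws."""
    a = rng.getrandbits(32) >> 5      # keep the top 27 bits
    b = rng.getrandbits(32) >> 6      # keep the top 26 bits
    return (a * 2**26 + b) / 2**53    # 27 + 26 = 53 bits

x = random_53bit()
print(x)
```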

--
Greg


Re: Valid encodings for a Python source file

2018-06-07 Thread Terry Reedy

On 6/7/2018 4:40 PM, Daniel Glus wrote:

I'm trying to figure out the entire list of possible encodings for a Python
source file - that is, encodings that can go in a PEP 263
 encoding specification, like #
-*- encoding: foo -*-.


For new code for python 3, don't use an encoding cookie.  Use an editor 
that can save in utf-8 and tell it to do so if it does not do so by default.
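
If you want to see which encoding Python will assume for a given file, the
tokenize module can tell you. A short sketch; the helper name and the sample
source lines are made up.

```python
import io
import tokenize

def source_encoding(source_bytes):
    """Return the encoding Python would use to decode this source code."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

print(source_encoding(b"print('hello')\n"))                          # utf-8
print(source_encoding(b"# -*- coding: latin-1 -*-\nprint('hi')\n"))  # iso-8859-1
```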


--
Terry Jan Reedy



Re: Stefan's headers [was:Names and identifiers]

2018-06-07 Thread Thomas Jollans
On 07/06/18 22:36, Peter Pearson wrote:

> X-Copyright: (C) Copyright 2018 Stefan Ram. All rights reserved.
>   Distribution through any means other than regular usenet channels is
>   forbidden. It is forbidden to publish this article in the Web, to
>   change URIs of this article into links, and to transfer the body
>   without this notice, but quotations of parts in other Usenet posts
>   are allowed.

As discussed previously [1], this arguably means that it's best if you
don't even quote his messages if your posts are mirrored on the mailing
list (as most people's are).

[1]. https://mail.python.org/pipermail/python-list/2017-November/728635.html


Re: Why exception from os.path.exists()?

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 22:56:49 -0400, Richard Damon wrote:

> or we need an alternate API that lets us pass raw bytes as file names

Guido's Time Machine strikes again.

All the path-related functions, including open(), take arguments as either 
bytes or strings.
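
A quick sketch to demonstrate, using a temporary directory so nothing is
left behind; the file name is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    str_path = os.path.join(tmp, "example.txt")
    bytes_path = os.fsencode(str_path)         # the same path, as bytes

    with open(bytes_path, "w") as f:           # open() accepts a bytes path
        f.write("hello\n")

    exists_as_str = os.path.exists(str_path)
    exists_as_bytes = os.path.exists(bytes_path)
    listing = os.listdir(os.fsencode(tmp))     # bytes in, list of bytes out

print(exists_as_str, exists_as_bytes, listing)
```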



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson
