[Python-Dev] embedding Python interpreter in non-console windows application

2010-02-16 Thread stephen
Hello,

THE PROBLEM:
  I am having a problem that I have seen asked about quite a bit on the web,
with little to no follow-up. The problem is essentially this: when embedding
(via LoadLibraryA()) the Python interpreter DLL in a non-console Windows
application, the developer must first create a console for Python to do
input/output with. I properly initialize the CRT and call AllocConsole() to
do this. I then call GetStdHandle() for stdin and stdout and open those
handles with the requisite flags ("read" for stdin and "write" for stdout).
This all works, as verified by printf() and fgets(). The issue appears when
calling PyRun_InteractiveLoop() and PyRun_SimpleString(). A
PyRun_SimpleString("print 'test'") displays nothing in my freshly allocated
console window. Similarly, PyRun_InteractiveLoop(stdin, NULL); yields nothing
either, even though the line printf("testing"); directly ahead of it works
just fine. Does anyone have insight on how I can make this work with the
freshly allocated console's stdin/stdout/stderr?

SPECULATION:
That is the question, so now on to the speculation. I suspect that something
in the Python runtime doesn't "get handles" correctly for stdin and stdout
upon initialization. I have perused the source code to find out exactly how
this is done, and I suspect that it starts in Py_InitializeEx with calls to
PySys_GetObject("stdin") and "stdout". However, I don't actually see where
this translates into the Python runtime checking with the C runtime for the
"real" handles to stdin and stdout. I never see the Python runtime "ask the
system" where its handles to stdin and stdout are.

SUBSEQUENT QUESTION:
Is there anything I can do to initialize the Python interpreter (running as
a DLL) so that it points at the appropriate stdin and stdout handles?
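
One workaround that may be worth trying is to rebind the sys streams from
Python itself after the console exists, for example by passing code like the
following to PyRun_SimpleString(). This is only a sketch, not a confirmed fix
for the setup described above: it assumes AllocConsole() has already
succeeded, and the helper name rebind_stdio is hypothetical. CONIN$ and
CONOUT$ are the Windows console device names.

```python
import sys

def rebind_stdio(in_path="CONIN$", out_path="CONOUT$"):
    """Point sys.stdin/stdout/stderr at explicitly opened files.

    Hypothetical helper. The defaults are the Windows console device
    names; they can only be opened once a console is attached.
    """
    sys.stdin = open(in_path, "r")
    # Buffering of 1 requests line buffering, so each line shows up promptly.
    sys.stdout = open(out_path, "w", 1)
    sys.stderr = sys.stdout
```

The paths are parameters so the same helper can be exercised against ordinary
files when no console is available.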
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python Doc problems

2006-09-28 Thread stephen
Josiah Carlson writes:

 > fine).  While I have heard comments along the lines of "the docs could
 > be better", I've never heard the claim that the Python docs are "lousy".

FYI, I have heard this, recently, from Tom Lord (aka developer of
Arch, rx, guile, etc).  Since he also took a swipe at Emacsen, I
pressed him on what he meant.  He immediately backtracked on "(all)
Python docs" and "lousy", but did say that in his opinion scripting
languages that provide docstrings have lost a fair amount of coherence
in their documentation, and that Python's are consistent with the
general trend.  (He's started using Python relatively recently and
does not claim a historical perspective.)

What is lost according to him is information about how the elements of
a module work together.  The docstrings tend to be narrowly focused on
the particular function or variable, and too often discuss
implementation details.  On the other hand, manuals tend to become
either tutorials or compendia of the docstrings.

 > If there are "rampant criticisms" of the Python docs, then those that
 > are complaining should take specific examples of their complaints to the
 > sourceforge bug tracker and submit documentation patches for the
 > relevant sections.

What they *should* do, but don't, is not necessarily a reflection on
the accuracy of what they say.

FWIW ... I find the documentation for the language, the standard
library, and the Python applications I use quite adequate for my own
use.



[Python-Dev] Python Doc problems

2006-09-28 Thread stephen
xah lee writes:

 > anyway, i've rewrote the Python's RE module documentation, at:
 >   http://xahlee.org/perl-python/python_re-write/lib/module-re.html

-1

The current docs could be improved (but not by me, at least not
today), but I don't consider the general direction of Xah's edits
desirable.  Eg, the current table of contents is just as accurate and
more precise than Xah's top node, which makes navigation faster for
someone who knows what he forgot.  In general his changes
improve the "narrative flow", but for me that's a very low priority in
a reference manual, while the cost in loss of navigability of his
changes is pretty high for me.



Re: [Python-Dev] PEP 355 status

2006-10-24 Thread stephen
Talin writes:
 > (one additional postscript - One thing I would be interested in is an 
 > approach that unifies file paths and URLs so that there is a consistent 
 > locator scheme for any resource, whether they be in a filesystem, on a 
 > web server, or stored in a zip file.)

+1

But doesn't file:/// do that for files, and couldn't we do something
like zipfile:///nantoka.zip#foo/bar/baz.txt?  Of course, we'd want to
do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too.  That
way leads to madness.
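
As a rough illustration of how such locators decompose, here is a sketch
using the standard urllib.parse module. The zipfile and ziphttp schemes are
the hypothetical ones from the paragraph above, not registered schemes, and
the file:/// path and the helper name locate are made up for the example.

```python
from urllib.parse import urlsplit

def locate(url):
    """Split a locator into (scheme, host, container path, member)."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path, parts.fragment

for url in (
    "file:///tmp/baz.txt",                                    # made-up path
    "zipfile:///nantoka.zip#foo/bar/baz.txt",                 # hypothetical scheme
    "ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt",  # hypothetical scheme
):
    print(locate(url))
```

The fragment carries the member inside the archive, while the authority and
path locate the archive itself.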



Re: [Python-Dev] PEP 355 status

2006-10-25 Thread stephen
Scott Dial writes:
 > [EMAIL PROTECTED] wrote:
 > > Talin writes:
 > >  > (one additional postscript - One thing I would be interested in is an 
 > >  > approach that unifies file paths and URLs so that there is a consistent 
 > >  > locator scheme for any resource, whether they be in a filesystem, on a 
 > >  > web server, or stored in a zip file.)
 > > 
 > > +1

 > It would make more sense to register protocol handlers to this magical 
 > unification of resource manipulation.

I don't think it's that magical, and it's not manipulation, it's
location.

The question is, register where and on what?  For example on my Mac
there are some PDFs I want to open in Preview and others in Acrobat.
To the extent that I have some classes which are one or the other, I
might want to register the handler to a wildcard path object.

 > But allow me to perform my first channeling of Guido.. YAGNI.

True, but only because when I do need that kind of stuff I'm normally
writing Emacs Lisp, not Python.  We have a wide variety of functions
for manipulating path strings, and they make exactly the distinction
between path and inode/content that Talin does (where a path is being
manipulated, the function has "filename" in its name, where a file or
its metadata is being accessed, the function's name contains "file").
Nonetheless there are two or three places where programmers I respect
have chosen to invent path classes to handle hairy special cases.
These classes are very useful in those special cases.

One place where this gets especially hairy is in the TRAMP package,
which allows you to construct "remote paths" involving (for example)
logging into host A by ssh, from there to host B by ssh, and finally a
"relay download" of the content from host C to the local host by scp.
The net effect is that you can specify the path in your "open file"
dialog, and Emacs does the rest automatically; the only differences
the user sees between that and a local file are the length of the path
string and the time it takes to actually access the contents.

Once you've done that, that process is embedded into Emacs's notion of
the "current directory", so you can list the directory containing the
resource, or access siblings, very conveniently.

I don't expect to reproduce that functionality in Python personally,
but such use cases do exist.  Whether a general path class can be
invented that doesn't accumulate cruft faster than use cases is
another issue.



Re: [Python-Dev] Path object design

2006-11-05 Thread stephen
Michael Urman writes:

 > Ah, but how do you know when that's wrong? At least under ftp:// your
 > root is often a mid-level directory until you change up out of it.
 > http:// will tend to treat the targets as roots, but I don't know that
 > there's any requirement for a /.. to be meaningless (even if it often
 > is).

ftp and http schemes both have authority ("host") components, so the
meaning of ".." path components is defined in the same way for both by
section 5 of RFC 3986.

Of course an FTP server is not bound to interpret the protocol so as
to mimic URL semantics.  But that's a different question.
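
For a concrete view of that resolution rule, here is a small sketch using
urllib.parse.urljoin from the standard library. The example.com URLs and the
helper name resolve_parent are made up for illustration.

```python
from urllib.parse import urljoin

def resolve_parent(base, reference="../d"):
    """Resolve a relative reference against a base URL."""
    return urljoin(base, reference)

# The ".." segment is handled the same way for both schemes.
print(resolve_parent("http://example.com/a/b/c"))  # http://example.com/a/d
print(resolve_parent("ftp://example.com/a/b/c"))   # ftp://example.com/a/d
```

This only shows how the URL reference is resolved on the client side; what
the server then does with the resulting path is the separate question
mentioned above.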



Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Stephen Hansen
On Mon, Aug 4, 2014 at 12:12 AM, Larry Hastings  wrote:

>
> Several people have said they found the name "nullable" surprising,
> suggesting I use another name like "allow_none" or "noneable".  I, in turn,
> find their surprise surprising; "nullable" is a term long associated with
> exactly this concept.  It's used in C# and SQL, and the term even has its
> own Wikipedia page:
>

The thing is, "null" in these languages is not the same thing. If you look
at the various database wrappers, there's a lot of controversy about just
how to map the SQL NULL to Python: simply mapping it to Python's None
becomes strange because the semantics of a SQL NULL or a NULL pointer and
Python's None don't exactly match. Not all that long ago someone was making
an argument on this list to add a SQLNULL type object to better map SQL
NULL semantics (with regard to sorting, as I recall, but it's been a while).

Python has None. Its definition and understanding in a Python context are
clear. Why introduce some other concept? In Python it's very common to pass
None instead of another argument.


> Before you say "the term 'nullable' will confuse end users", let me remind
> you: this is not user-facing.  This is a parameter for an Argument Clinic
> converter, and will only ever be seen by CPython core developers.  A group
> which I hope is not so easily confused
>

Yet my lurking observation of Argument Clinic is that it is all about clearly
defining the C side of how things are done in Python APIs. It may not
confuse 'end users', but it may confuse possible contributors, and simply
add a lack of clarity to the situation.

Passing None in place of another argument is a very Pythonic thing to do;
why confuse that by using other words which imply other semantics? None is
a Python thing with clear semantics in Python; allow_none quite accurately
describes the Pythonic thing described here, while 'nullable' expects
domain knowledge beyond Python and makes assumptions about semantics.
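
To make the idiom concrete, here is a minimal pure-Python sketch of an
argument that accepts either a real value or None. The function name and
parameters are hypothetical, and this is not Argument Clinic syntax.

```python
def describe_timeout(timeout=None):
    """Hypothetical function: `timeout` may be a number or None."""
    if timeout is None:
        # None stands for "no value supplied", not for zero.
        return "wait forever"
    return "wait {0} seconds".format(timeout)

print(describe_timeout())      # wait forever
print(describe_timeout(None))  # wait forever
print(describe_timeout(2.5))   # wait 2.5 seconds
```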

/re-lurk

--S


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-20 Thread Stephen Hansen
>
> Sounds great right?  Everybody will be happy!  So let's nail it down! If I
> was in charge, here's what I'd do:
>
> * standardise the syntax for type hints in 3.5, as per PEP484
> * but: recommend the use of stub files as the preferred place to store
> hints
> * and: deprecate function annotations in the core language
> * remove them from the core language altogether in 3.6
>

Personally, I'm not all that keen on the verbosity of the syntax; I'm sad
that List[int] has to be how you spell things instead of just [int], but
I'm sure there are reasons that can't work.

That said, I hate stub files. I understand why they may be needed sometimes,
so I accept them as a necessary evil, but that doesn't make them a goal to
me. Stub files become "optional headers" which I'll have to keep in sync
with the actual code -- which, in my opinion, just about guarantees a
maintenance burden that will fall by the side of the road. If I have to
look at another file to know or change the function arguments in the code
I'm working on, that hurts readability, too.

Will it take some getting used to, this syntax? Yes. At one point I thought
comprehensions and ternary expressions were unreadable. I use them all the
time now and find them very elegant. I'm doubtful I'll ever find type hints
/elegant/, but I'm pretty sure they won't be "ugly" forever. Ugly has a lot
to do with familiarity. It's also deeply subjective.

But it's an objective reality, imho, that having to maintain and sync up
function definitions in *two different files* is a burden. And that is a
burden I really don't want to deal with.
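
A small sketch of the difference, using the typing module from PEP 484. The
function total is made up for illustration, and the stub shown in the comment
is what a separate file would have to repeat.

```python
from typing import List, get_type_hints

def total(values: List[int]) -> int:
    """Inline hints sit next to the code they describe."""
    return sum(values)

# With a stub file, the same signature would be written a second time
# in a separate file and kept in sync by hand:
#     def total(values: List[int]) -> int: ...

print(total([1, 2, 3]))                 # 6
print(get_type_hints(total)["return"])  # <class 'int'>
```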

--Stephen


[Python-Dev] Python vulnerabilities

2017-09-11 Thread Stephen Michell
I am new to this list. 

Skip suggested that I join. 

I convene the ISO/IEC/JTC 1/SC 22/WG 23 Programming Languages Working Group. We produce
a suite of international technical reports that document vulnerabilities in 
programming that can lead to serious safety and security breaches. 

We published TR 24772 "Guidance to avoiding programming language 
vulnerabilities through language selection and use" in 2010 and again in 2013. 
Edition one was a language independent look at such vulnerabilities. Edition 
two added new vulnerabilities plus language specific annexes for Ada, C, 
Python, PHP, Ruby, and Spark. 

For this round, we have split the document into parts and are publishing the 
language specific parts separately. We have added a few new vulnerabilities, 
mostly associated with concurrency and object orientation for this iteration. 

We target the team lead that guides and writes coding standards for an 
organization, as opposed to the general programmer. 

We plan to ballot and publish in 2018 TR 24772-1, the language independent 
Part, as well as -2 Ada, -3 C, -4 Python and -8 Fortran. 

Our Python Part needs completion to address the new vulnerabilities documented. 
We want to do justice to all languages that we work with. We need experts to 
help us complete the document, and then to review it. I have had initial 
conversations with one expert. We hope for a bit more if possible.

If interested, please contact me as listed below. 

Our document list is at www.open-std.org/JTC1/sc22/wg23. 

Thank you. 

Stephen Michell
Maurya Software
stephen dot michell at maurya dot on dot ca
Phone: 1-613-299-9047


[Python-Dev] Python possible vulnerabilities in concurrency

2017-11-13 Thread Stephen Michell
I am looking for one or two experts to discuss with me how Python concurrency 
features fit together, and possible vulnerabilities associated with that.

TR 24772 lists 5 vulnerabilities associated with:

1. Activating threads, tasks or pico-threads
2. Directed termination of threads, tasks or pico-threads
3. Premature termination of threads, tasks or pico-threads
4. Concurrent access to data shared between threads, tasks or pico-threads
5. Lock protocol errors for concurrent entities

I need to document how these appear (or don’t appear) in Python. The writeups 
would possibly swamp this email reflector, so I am looking for a small number 
of people to review these sections of our language-independent document and 
discuss with me how these are handled in Python. 

I have a good background in these issues, but no relevant experience with 
Python. 

Please contact me at [email protected] to respond directly.

Thank you

…stephen michell
Convenor
ISO/IEC/JTC 1/SC 22/WG 23 Programming Language Vulnerabilities Working Group


Re: [Python-Dev] Sets, Dictionaries

2018-03-29 Thread Stephen Hansen
On Wed, Mar 28, 2018, at 9:14 PM, Julia Kim wrote:
> My suggestion is to change the syntax for creating an empty set and an 
> empty dictionary as following.
> 
> an_empty_set = {}
> an_empty_dictionary = {:}
> 
> It would seem to make more sense.

The amount of code this would break is astronomical. 
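
As a sketch of the kind of code at stake: today {} is an empty dict, and a
great deal of existing code relies on that. The variable names are made up
for illustration.

```python
counts = {}          # today: an empty dict
counts["spam"] = 1   # item assignment works on a dict

empty = set()        # today's spelling of an empty set
try:
    empty["spam"] = 1
except TypeError as exc:
    # If {} produced a set, the assignment above would fail like this.
    print("a set rejects item assignment:", exc)

print(type(counts).__name__, type(empty).__name__)  # dict set
```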

-- 
Stephen Hansen
  m e @ i x o k a i  . i o


Re: [Python-Dev] More optimisation ideas

2016-02-06 Thread Stephen Hansen
On Fri, Feb 5, 2016, at 10:33 AM, Emile van Sebille wrote:
> On 2/5/2016 9:37 AM, Alexander Walters wrote:
> >
> > On 2/5/2016 12:27, Emile van Sebille wrote:
> >> On 2/1/2016 9:20 AM, Ethan Furman wrote:
> >>> On 02/01/2016 08:40 AM, R. David Murray wrote:
> >> 
> >>>> On the other hand, if the distros go the way Nick has (I think) been
> >>>> advocating, and have a separate 'system python for system scripts' that
> >>>> is independent of the one installed for user use, having the
> >>>> system-only
> >>>> python be frozen and sourceless would actually make sense on a
> >>>> couple of
> >>>> levels.
> >>>
> >>> Agreed.
> >>
> >> Except for that nasty licensing issue requiring source code.
> >>
> >> Emile
> > Licensing requires, in the GPL at least, that the *modified* sources be
> > made *available*, not that they be shipped with the product. Looking at
> > the Python license, and what tools already do, there is zero need to
> > ship the source to stay compliant.
> 
> Hmm, the annotated Open Source Definition explicitly states "The program 
> must include source code" -- how did I misinterpret that?

Couple things.

First, the OSD is not authoritative. Python's license establishes the
rules of its distribution: that Python's license is considered
compatible with the OSD doesn't actually mean your reading of anything
on the OSD page has any binding meaning.

Second, the OSD's Rule 2 means that those who are distributing Python -- the
PSF, originally -- must provide source code if they're distributing it
under Python's license, but it doesn't actually mean it must be packaged
with it in every download. In fact, it's not today. The standard library
source is included in normal downloads, but the C source of Python
isn't. You can download it readily, though, so that's fine. It's fully
compliant with the OSD.

But! If Debian (pulling them out of a hat randomly) is distributing
Python, they aren't the PSF, and notably are not bound by the OSD rules,
only by Python's license terms. The PSF satisfied their requirements to
the licensing terms when releasing Python, but now Debian has Python,
and they are distributing it-- that's an entirely separate act, and you
must look at them as a separate actor in terms of the license. They
don't have to distribute it in the same license. They must be ABLE to
(as OSD's Rule 3 says), but they don't HAVE to. Some random person can
take Python, rename it Snakey, and release it under almost any license
they want and give no one the source code at all. 

Python has from the beginning allowed this: it's actually in quite a few
closed source / proprietary products that never advertise it and
provide no source, entirely legally and ethically -- Python's gone out
of its way to support this sort of use-case.

As it happens, Debian usually distributes something very close to the
official release (sometimes they backport patches and such), and always
does so under the same license as Python (AFAICT), but they don't *have*
to. 

GPL is copyleft and requires its derivative works to be GPL'd (or at
least, no more restrictive than GPL) -- so in GPL, to distribute it you
MUST distribute it under GPL-compatible terms. Python has a permissive
license that allows anyone to do basically anything, INCLUDING produce
closed source releases if someone wanted to, or just release
modifications or modules that are available under different licenses.

The OSD encompasses both ends of the spectrum: the GPL's mandate of
source access and the OSD's mandate of the receiver to be able to
distribute in the same terms they received (notably, NOT the same terms
it was originally released under).

-- 
Stephen Hansen
  m e @ i x o k a i  . i o


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Stephen Hansen
For not caring much, your own stubbornness is quite notable throughout this
discussion. Stones and glass houses. :)

That said:

Twisted and Mercurial aren't the only ones who are hurt by this, at all.
I'm aware of at least two other projects who are actively hindered in their
support or migration to Python 3 by the bytes type not having some basic
functionality that "strings" had in 2.0.

The purity crowd in here has brought up that it was an important and
serious decision to split Text from Bytes in Py3, and I actually agree with
that. However, it is missing some very real and very concrete use-cases --
there are multiple situations where there are byte streams which have a
known text-subset which they really, really do need to operate on.

There's been a number of examples given: PDF, HTTP, network streams that
switch inline from text-ish to binary and back-again.. But, we can focus
that down to a very narrow and not at all uncommon situation in the latter.

Look at the HTTP Content-Length header. HTTP headers are fuzzy. My
understanding is, per the RFCs, their body can be arbitrary octets to the
exclusion of line feeds and DELs-- my understanding may be a bit off here,
and please feel free to correct me -- but the relevant specifications are a
bit fuzzy to begin with.

To my understanding of the spec, the header field name is essentially an
ASCII text field (sans separator), and the body is... anything, or nearly
anything. This is HTTP, which is surely one of the most used protocols in
the world.

The need to be able to assemble and disassemble such streams is a
real, valid use-case.

But looking at it, now look to the Content-Length header I mentioned. It
seems those who are declaring a purity priority in bytes/string separation
think it reasonable to do things like:

  headers.append((b"Content-Length", ("%d" % len(content)).encode("ascii")))

Or something. In the middle of processing a stream, you need to convert
this number into a string then encode it into bytes to just represent the
number as the extremely common, widely-accessible 7-bit ascii subset of its
numerical value. This isn't some rare, grandiose or fiendish undertaking,
or trying to merge Strings and Bytes back together: this is the simple
practical recognition that representing a number as its ascii-numerical
value is actually not at all uncommon.

This position seems utterly astonishing in its ridiculousness to me. The
recognition that the number "123" may be represented as b"123" surprises me
as a controversial thing, considering how often I see it in real life.

There is a LOT of code out there which needs a little bit of a middle
ground between bytes and strings; it doesn't mean you are giving way and
allowing strings and bytes to merge and giving up on the Edict of
Separation. But there are real world use-cases where you simply need to be
able to do many basic "String" like operations on byte-streams.

The removal of the ability to use interpolation to construct such byte
strings was a major regression in Python 3 and is a big hurdle for more
than a few projects to upgrade.

I mean, it's not like the "bytes" type lacks knowledge of the subset of
bytes that happen to be 7-bit ascii-compatible and can't perform text-ish
operations on them--

  Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 bit (Intel)] on win32
  Type "help", "copyright", "credits" or "license" for more information.
  >>> b"stephen hansen".title()
  b'Stephen Hansen'

How is this not a practical recognition that yes, while bytes are byte
streams and not text, a huge subset of bytes are text-y, and as long as we
maintain the barrier between higher characters and implicit conversion
therein, we're fine?

I don't see the difference here. There is a very real, practical need to
interpolate bytes. This very real, practical need includes the very real
recognition that converting 12345 to b'12345' is not something weird,
unusual, and subject to the thorny issues of Encodings. It is not violating
the doctrine of separation of powers between Text and Bytes.
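
For comparison, here is a sketch of both spellings of the Content-Length
case. The second form relies on %-interpolation for bytes, which Python 3.5
and later provide (PEP 461); it is not available in the 3.3 interpreter shown
above. The content value is made up for illustration.

```python
content = b"hello, world"  # made-up payload

# The spelling discussed above: format as text, then encode to ASCII.
via_str = ("%d" % len(content)).encode("ascii")

# On Python 3.5+ (PEP 461), bytes support %-interpolation directly.
via_bytes = b"%d" % len(content)

headers = [(b"Content-Length", via_bytes)]
print(via_str == via_bytes)  # True
print(headers)               # [(b'Content-Length', b'12')]
```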

Personally, I won't be converting my day job's codebase to Python 3 anytime
soon (where 'soon' is defined as 'within five years', assuming a best-case
scenario in which a number of third-party issues are resolved). But! I'm aware
of and involved with other projects, and this has bitten two of them
specifically. I'm sure there are others who are not aware of this list or
don't feel comfortable talking on it (as it is, I encouraged one of the
projects' coders to speak up, but they thought the question was a lost one
due to previous responses on the original issue ticket and gave up).

On Fri, Jan 10, 2014 at 6:04 PM, Antoine Pitrou  wrote:

> On Fri, 10 

[Python-Dev] The curious case of 255 function arguments

2018-08-05 Thread Stephen McDowell
Hello Python Gurus,

TL;DR: 3.7 lifted the limit of 255 function arguments.  Despite explicit
checks for this limit in the 2.x source, 2.x does not actually enforce it -- why?

In the 3.7 release notes "Other Language Changes" section (
https://docs.python.org/3.7/whatsnew/3.7.html#other-language-changes), the
first bullet point denotes

> More than 255 arguments can now be passed to a function, and a function
> can now have more than 255 parameters. (Contributed by Serhiy Storchaka in
> bpo-12844 <https://bugs.python.org/issue12844> and bpo-18896
> <https://bugs.python.org/issue18896>.)

Now let's get something straight: unless I want to exclusively support
Python 3.7 or higher, I must make sure I obey the 255 limit.  Use *args //
**kwargs, etc.  I'm totally ok with that, 2020 is already here in my mind ;)

Curiosity is the reason I'm reaching out.  Upon further investigation and
some discussion with like-minded Python enthusiasts, the code being patched
by Serhiy Storchaka is present in e.g., Python 2.7 (
https://github.com/python/cpython/blob/2.7/Python/ast.c#L2013-L2016)

if (nargs + nkeywords + ngens > 255) {
  ast_error(n, "more than 255 arguments");
  return NULL;
}

Despite that code, as demonstrated with the supplemental output in the post
script, *no 2.x versions fail with >255 arguments*.  In contrast, 3.x where
x<7 all do fail (as expected) with a SyntaxError.  To test this, I tried
every minor release of python (excluding v1, arbitrarily choosing the
latest patch release of a minor version) with the following snippet via the
-c flag

/path/to/pythonX.Y -c 'exec("def foo(" + ", ".join(["a" + str(i) for i in range(1, 300)]) + "): pass")'

Which tries to construct a function

def foo(a1, a2, ..., a299): pass

I've looked at the C code for a while and it is entirely non-obvious what
would lead to python *2* *allowing* >255 arguments.  Anybody happen to know
how / why the python *2* versions *succeed*?
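
The same check can be written as a small script rather than a one-liner. This
is a sketch; the helper name define_with_params is hypothetical, and the
success branch assumes Python 3.7 or later.

```python
def define_with_params(count):
    """Build and exec a 'def' with `count` positional parameters."""
    params = ", ".join("a" + str(i) for i in range(1, count + 1))
    namespace = {}
    exec("def foo(" + params + "): pass", namespace)
    return namespace["foo"]

try:
    foo = define_with_params(299)
    print(foo.__code__.co_argcount, "parameters accepted")
except SyntaxError as exc:
    # The outcome reported for 3.0 through 3.6.
    print("rejected:", exc)
```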

Thank you for reading, this is not a problem, just a burning desire for
closure (even if anecdotal) as to how this can be.  I deeply love python,
and am not complaining!  I stumbled across this and found it truly
confounding, and thought the gurus here may happen to recall what changed
in 3.x that led to the error condition actually being asserted :)

Sincerely,

Stephen McDowell

P.S. On a Fedora 25 box using GCC 6.4.1, I lovingly scripted the
installation of all the python versions just to see if it truly was a 2.x /
3.x divide.  The results of running `python -V` followed by the `python -c
'exec("def foo...")'` described above, with some extra prints for clarity
are as follows (script hackily thrown together in ~30minutes not included,
so as not to make your eyes bleed):


Python 2.0.1
==> Greater than 255 Arguments supported

Python 2.1.3
==> Greater than 255 Arguments supported

Python 2.2.3
==> Greater than 255 Arguments supported

Python 2.3.7
==> Greater than 255 Arguments supported

Python 2.4.6
==> Greater than 255 Arguments supported

Python 2.5.6
==> Greater than 255 Arguments supported

Python 2.6.9
==> Greater than 255 Arguments supported

Python 2.7.15
==> Greater than 255 Arguments supported

Python 3.0.1
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1
SyntaxError: more than 255 arguments

Python 3.1.5
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1
SyntaxError: more than 255 arguments

Python 3.2.6
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1
SyntaxError: more than 255 arguments

Python 3.3.7
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1
SyntaxError: more than 255 arguments

Python 3.4.9
Traceback

Re: [Python-Dev] The curious case of 255 function arguments

2018-08-07 Thread Stephen McDowell
Hi Andrea and Serhiy,

Thank you for your responses and for clarifying that it is specifically the
CALL_FUNCTION opcode.  I tested this in my megascript: calling the functions
directly with all arguments does produce an error (Py 2.x: fails at call
invocation, Py 3.y w/ y<7: fails at function definition).

@Serhiy I looked through the commits and had found
https://github.com/python/cpython/commit/5bb8b9134b0bb35a73c76657f41cafa3e4361fcd#diff-4d35cf8992b795c5e97e9c8b6167cb34
but the commit that removed the 255 checks also explains that this is
specifically about the call function (
https://github.com/python/cpython/commit/214678e44bf7773c0ed9c3684818354001d8f9ca#diff-4d35cf8992b795c5e97e9c8b6167cb34
), so indeed I should have been able to answer this myself.

The reason why I originally had encountered this was (as discussed in one
of the bug reports) from code that was generating a class hierarchy to
represent Doxygen's XML schema.  The class constructors had >255 arguments,
but in executing the code it actually does still work in python 2.x.  The
reason is because all of the arguments are defaulted to None, and during
execution of typical sample XML files, the explicit construction with all
>255 arguments virtually never happens.

f.write("def foo_2({0}):\n".format(
    ", ".join(["a{0}=None".format(i) for i in range(300)])))
# The body of the generated function must be indented.
f.write("    print('foo_2 executed')\n\n")
# ... in generated __main__ ...
f.write("foo_2()\n\n")

foo_2() will succeed in python 2.x because the CALL_FUNCTION is not
explicitly getting more than 255 parameters.  Very interesting!
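
A self-contained sketch of the two call styles, assuming Python 3.7 or later
so that both succeed. The generated function returns a string instead of
printing so the result is easy to check. Per the explanation in this thread,
earlier versions would be expected to reject the second, fully explicit call.

```python
# Build a function whose 300 parameters all default to None.
params = ", ".join("a{0}=None".format(i) for i in range(300))
source = "def foo_2({0}):\n    return 'foo_2 executed'\n".format(params)

namespace = {}
exec(source, namespace)
foo_2 = namespace["foo_2"]

# No explicit arguments: the call site passes nothing at all.
print(foo_2())

# 300 explicit arguments written out at the call site.
call = "foo_2({0})".format(", ".join(str(i) for i in range(300)))
print(eval(call, namespace))
```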

Thank you both again for your responses, I am grateful to finally
understand the way in which success / failure works here :)

-Stephen


On Mon, Aug 6, 2018 at 2:17 AM, Serhiy Storchaka 
wrote:

> 06.08.18 08:13, Stephen McDowell wrote:
>
>> I've looked at the C code for a while and it is entirely non-obvious what
>> would lead to python *2* /allowing/ >255 arguments.  Anybody happen to know
>> how / why the python *2* versions *succeed*?
>>
>
> The error message is misleading. It should be "more than 255 parameters".
> This limitation is due to the optimization used in Python 3 for call
> variables (see https://bugs.python.org/issue12399 for details).
>
> In all versions <3.7 there is a limitation on the number of explicit
> function arguments because of the limitation of the CALL_FUNCTION opcode.
>
>> Thank you for reading, this is not a problem, just a burning desire for
>> closure (even if anecdotal) as to how this can be.  I deeply love python,
>> and am not complaining!  I stumbled across this and found it truly
>> confounding, and thought the gurus here may happen to recall what changed
>> in 3.x that led to the error condition actually being asserted :)
>>
>
> Read the history of the code. Commit messages usually contain explanations
> or references to issues.
>
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] The end of 2.7

2013-04-08 Thread Stephen Hansen
On Sun, Apr 7, 2013 at 6:53 AM, Christian Tismer wrote:

>  On 07.04.13 14:10, Skip Montanaro wrote:
>
> Where I work (a trading firm that uses Python as just one of many
> different pieces of technology, not a company where Python is the core
> technology upon which the firm is based) we are only just now
> migrating from 2.4 to 2.7. I can't imagine we'll have migrated to
> Python 3 in two years.  It's not like we haven't seen this coming, but
> you can only justify moving so fast with technology that already
> works, especially if, like Python, you use it with lots of other
> packages (most/all of which themselves have to be ported to Python 3)
> and in-house software.
>
> I think the discussion should focus on who's left on 2.x and why, not,
> "yeah, releases every six months for the next couple years ought to do
> it."
>
>
>
> when I read this, I was slightly shocked. You know what?
> """
> We are pleased to announce the release of *Python 2.4, final* on November
> 30, 2004.
> """
>
> I know that companies try to save (time? money?) something by not upgrading
> software, and this is extremely annoying.
>

I'm in the same boat as Skip (just now moving from 2.4 to 2.7), and Python
*is* a core technology for us. It has nothing really to do with saving time
or money, it's about priorities. The transition from 2.3 to 2.4 was actually
fairly painful (don't ask me why, I don't even remember anymore), but we
got stuck on 2.4 not by any specific decision -- it simply worked, and our
time was always focused upon solving problems and improving our software
itself.

Could we have solved our problems more easily if we upgraded Python and had
new tools? Some, yes. (Some features we have added had me actually walking
through third party code bases and backporting them -- converting "with"
statements to try/finally is an amusing big one, for example.)

For one thing, even with this relatively ancient Python, we almost never
ran into bugs. It just worked and worked fine, so when we looked at our
development plan the list of feature requests and issues for various
customers (especially those that were potential new clients) overrode
"infrastructure" upgrades as priorities.

However, in a huge system that has many tens of thousands of lines of code,
doing a platform upgrade is a serious endeavor -- and it's often not even
Python's fault, but the reality that we're going to be upgrading
*everything*, which means a much longer QA cycle and often runs into
third-party software. We are finally upgrading now because the time spent
working around certain bugs in both Python and third-party libraries that
no longer support 2.4 is enough for us to say, okay, we finally really do
need to get this done.

Migration to Python 3 ... IF it ever happens is more of a question than
when.

That's not an indictment of Python 3 or a problem with the current plan (for
what it's worth, the bugfix every 6 months until 5 years is up seems totally
reasonable).

Any new product we do, I'd seriously consider starting from Python 3.
(Though PyPy supporting Py3 would help that argument a lot) The case for
migrating existing products is a lot harder to make.


> But I think every employee (including you) can quite easily put some
> pressure on his company by claiming that Python 2.x is a dead end, and
> everybody is about to move on to 3.x.
> This does not have to be true, I just recognize that by claiming it and
> doing it with your projects, the movement becomes a reality. Just say
> that we all need to move on and cannot care about companies that ignore
> this necessity.
>

The thing is, 2.7 works. Some third-party libraries we rely upon have no
clear sign for when they will be ported (such as wxPython), and though we
are transitioning away from certain others (omniORB for Apache Thrift for
example), that process itself is planned to be a gradual thing for the next
year, at least.

My concern is for the health of my company, and happiness of my customers;
I love Python and am an advocate for it, but in my day job, pushing things
forward is just about at the bottom of my list of concerns. (Though, our
migration to 2.7 is actually part of a long term strategic plan to embrace
pypy)

And now I go back to lurking.

--Stephen


Re: [Python-Dev] Tracker archeology

2009-02-10 Thread Stephen Thorne
On 2009-02-10, Tarek Ziadé wrote:
> On Tue, Feb 10, 2009 at 2:23 PM, Daniel (ajax) Diniz  wrote:
> 
> >
> > If anyone is interested in being added as nosy for any category of
> > bugs, let me know and I'll do that as I scan the tracker.
> 
> I'll take Distutils related issues,

If you could look at a solution for http://bugs.python.org/issue1533164
I would be eternally grateful.

-- 
Regards,
Stephen Thorne
Development Engineer
NetBox Blue - 1300 737 060



Re: [Python-Dev] what Windows and Linux really do Re: PEP 383 (again)

2009-04-30 Thread Stephen Hansen
>
> You can't even print them without getting an error from Python.  In fact,
> you also can't print strings containing the proposed half-surrogate
> encodings either: in both cases, the output encoder rejects them with a
> UnicodeEncodeError.   (If not even Python, with its generally lenient
> attitude, can print those things, some other libraries probably will fail,
> too.)
>

I think you may be confusing two completely separate things; it's a
long-known issue that the Windows console is simply not a Unicode-aware
display device naturally. You have to manually set the codepage (by typing
'chcp 65001' -- that's UTF-8) *and* manually make sure you have a
Unicode-enabled font chosen for it (the choice of console fonts is extremely
limited to none, and last I looked the default font didn't support Unicode)
before you can even try to successfully print valid Unicode. The default
codepage is 437 (for me at least; I think it depends on which language of
Windows you're using) which is ASCII-/ish/.

You have to do your test in an environment which actually supports
displaying Unicode at all, or it's meaningless.
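
To make the codepage point concrete, here is a small sketch (Python 3,
standard codecs only) that checks whether a string can be encoded for a given
output encoding. The helper name `can_encode` is my own, and this only models
the encoder step; it says nothing about whether the console font can draw the
glyph.

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

euro = "\u20ac"      # EURO SIGN, not present in code page 437
e_acute = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, present in code page 437

print(can_encode(euro, "cp437"))     # rejected by the cp437 encoder
print(can_encode(e_acute, "cp437"))  # accepted by the cp437 encoder
print(can_encode(euro, "utf-8"))     # UTF-8 can encode any valid code point
```

So a UnicodeEncodeError on a codepage-437 console tells you about the console
encoding, not about whether the string itself is valid.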

Personally and for all the use cases I have to deal with at work, I would
/love/ to see this PEP succeed. Being able to query a list of files in a
directory and get them -all-, display them all to a user
(which necessitates it being converted to unicode one way or the other. I
don't care if certain characters don't display: as long as any arbitrary
file will always end up looking like a distinct series of readable and
unreadable glyphs so the user can select it clearly), and then perform
operations on any selected file regardless of whatever nonsense may be going
on underneath with confused users and encodings... in a cross-platform way,
would be a tremendous boon to future py3k porting efforts. I ramble.

If there are inconsistent encodings used by users on a POSIX system so that
they can only make sense of half of what the names really are... that's for
other programs to deal with. I just want to be able to access the files they
tell me they want.

For anyone who is doing something low-level, they can use the bytes API.

--Stephen


[Python-Dev] Python Library Support in 3.x (Was: email package status in 3.X)

2010-06-18 Thread Stephen Thorne
Steve Holden Wrote:
> We are also attempting to enable tax-deductible fund raising to increase
> the likelihood of David's finding support. Perhaps we need to think
> about a broader campaign to increase the quality of the python 3
> libraries. I find it very annoying that the #python IRC group still has
> "Don't use Python 3" in its topic.  They adamantly refuse to remove it
> until there is better library support, and they are the guys who see the
> issues day in day out so it is hard to argue with them (and I don't
> think an autocratic decision-making process would be appropriate).

Yes, #python keeps the text "It's too early to use Python 3.x" in its topic.
Library support is the only reason.

-- 
Regards,
Stephen Thorne
Development Engineer


Re: [Python-Dev] Python Library Support in 3.x (Was: email package status in 3.X)

2010-06-20 Thread Stephen Thorne
On 2010-06-19, Arc Riley wrote:
> You mean Twisted support,

No. I don't.

Often, on #python, we get the situation where someone approaches us saying, "I
have this problem in my python code, why does this not work for me?" and
usually very quickly we establish the programmer has followed a tutorial or
attempted to use a library that depends on python 2, but the programmer is
running python 3.

Queried on why they are using python 3, the answer is frequently, "Because I
downloaded the latest version."

For those people, we believe it is too early to use python 3. When talking to
these people with a world view of "why shouldn't I use the latest version",
having a concrete preexisting statement in the topic we can point to is
invaluable.

We don't always ask those who are having python 3 problems to go to python 2.
Often we simply explain about all strings being unicode or print now being a
function, and the conversation dies.

There are also programmers who definitely should be using python 3 for their
work. They know who they are. They do receive support in #python.

--

In writing this email to python-dev, I have reviewed my logs of #python
specifically looking for the phrase 'python 3'. Here are some packages that
were named in the conversations:

 - py2exe
 - cx_Freeze
 - twisted 
 - PIL
 - ctypes
 - email

I present this list because they are what programmers are coming to #python to
ask about, and that may be relevant to your discussion about python 3 ports.

-- 
Regards,
Stephen Thorne


[Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Stephen Thorne
Steve Holden Wrote:
> Given the amount of interest this thread has generated I can't help
> wondering why it isn't more prominent in python.org content. Is the
> developer community completely disjoint with the web content editor
> community?
> 
> If there is such a disconnect we should think about remedying it: a
> large "Python 2 or 3?" button could link to a reasoned discussion of the
> pros and cons as evinced in this thread. That way people will end up
> with the right version more often (and be writing Python 2 that will
> more easily migrate to Python 3, if they cannot yet use 3).
> 
> There seems to be a perception that the PSF can help fund developments,
> and indeed Jesse Noller has made a small start with his sprint funding
> proposal (which now has some funding behind it). I think if it is to do
> so the Foundation will have to look for substantial new funding. I do
> not currently understand where this funding would come from, and would
> like to tap your developer creativity in helping to define how the
> Foundation can effectively commit more developer time to Python.
> 
> GSoC and GHOP are great examples, but there is plenty of room for all
> sorts of initiatives that result in development opportunities. I'd like
> to help.

I am extremely keen for this to happen. Does anyone have ownership of this
project? There was some discussion of it up-list but the discussion fizzled.

-- 
Regards,
Stephen Thorne
Development Engineer
Netbox Blue


Re: [Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Stephen Thorne
On 2010-06-25, "Martin v. Löwis" wrote:
> Am 25.06.2010 01:28, schrieb Stephen Thorne:
> > Steve Holden Wrote:
> >> Given the amount of interest this thread has generated I can't help
> >> wondering why it isn't more prominent in python.org content. Is the
> >> developer community completely disjoint with the web content editor
> >> community?
> >>
> >> If there is such a disconnect we should think about remedying it: a
> >> large "Python 2 or 3?" button could link to a reasoned discussion of the
> >> pros and cons as evinced in this thread. That way people will end up
> >> with the right version more often (and be writing Python 2 that will
> >> more easily migrate to Python 3, if they cannot yet use 3).
> >>
> >> There seems to be a perception that the PSF can help fund developments,
> >> and indeed Jesse Noller has made a small start with his sprint funding
> >> proposal (which now has some funding behind it). I think if it is to do
> >> so the Foundation will have to look for substantial new funding. I do
> >> not currently understand where this funding would come from, and would
> >> like to tap your developer creativity in helping to define how the
> >> Foundation can effectively commit more developer time to Python.
> >>
> >> GSoC and GHOP are great examples, but there is plenty of room for all
> >> sorts of initiatives that result in development opportunities. I'd like
> >> to help.
> > 
> > I am extremely keen for this to happen. Does anyone have ownership of this
> > project? There was some discussion of it up-list but the discussion fizzled.
> 
> Can you please explain what "this project" is, in the context of your
> message? GSoC? GHOP?

Oh, I thought this was quite clear. I was specifically meaning the large
"Python 2 or 3" button on python.org. It would help users who want to know
what version of python to use if they had a clear guide as to what version
to download.

It doesn't help if someone goes to do greenfield development in python
if a library they depend upon has yet to be ported, and they're trying to
use python 3.

(As an addendum add pygtk to the list of libs that python 3 users on #python
are alarmed to find haven't been ported yet)

-- 
Regards,
Stephen Thorne
Development Engineer
Netbox Blue


Re: [Python-Dev] "2 or 3" link on python.org

2010-06-25 Thread Stephen Thorne
On 2010-06-25, "Martin v. Löwis" wrote:
> > What page were we suggesting linking to?
> 
> I don't think anybody proposed anything specific. Steve Holden
> suggested it should go to "reasoned discussion of the
> pros and cons as evinced in this thread". Stephen Thorne didn't
> propose anything specific but to have a large button.

I didn't propose anything, I heard a good idea that I'd like to see followed
through.

-- 
Regards,
Stephen Thorne


Re: [Python-Dev] Removing IDLE from the standard library

2010-07-10 Thread Stephen Hansen
On Sat, Jul 10, 2010 at 9:23 PM, Guilherme Polo  wrote:

> By "never had a problem" do you mean using some of the latest versions
> ? Here, running "idle" from a mac terminal and trying to type: print
> "hi" crashes when entering the quotation mark.


Huh? Works fine for me. Python 2.6.1, OSX 10.6.3, intel.

From the lurking crowd-- Please don't consider removing IDLE until there is
a compelling replacement ready. It's better to have a limited IDE that works
everywhere (even if in a limited fashion-- people are free to try out one of
the many excellent full-featured Python IDEs out there after they advance
to that point) than not.

--Stephen


Re: [Python-Dev] Removing IDLE from the standard library

2010-07-11 Thread Stephen Hansen
On Sun, Jul 11, 2010 at 4:53 PM, Steve Holden  wrote:

> Stephen Hansen wrote:
> > On Sat, Jul 10, 2010 at 9:23 PM, Guilherme Polo wrote:
> >
> > By "never had a problem" do you mean using some of the latest
> > versions? Here, running "idle" from a mac terminal and trying to
> > type: print "hi" crashes when entering the quotation mark.
> >
> >
> > Huh? Works fine for me. Python 2.6.1, OSX 10.6.3, intel.
> >
> One of the good things about the python-dev community is its commitment
> to test-driven development. If you are prepared to define "fine" as
> 'successfully runs \'print "hello"\'' then I guess we should be
> perfectly happy about IDLE.
>

Er, how hostile.

My point is, the poster made an assertion-- that you couldn't do the simple
act of launching idle from a command line, and printing Hi. Maybe they
can't, I have no idea.

I know I can. I know that  I have also opened random python files, saved
them, and ran them with IDLE. I don't use IDLE beyond that though: I live in
TextMate on my mac.

My point was not, "IDLE is perfect". My point was, "You've claimed you can't
even print out a word in IDLE, so it's utterly and completely non-functional"
-- and that assertion surprises me, and I challenge it.

I don't define IDLE as "fine", because I'm not qualified to speak to its
larger aspects-- as I only rarely use it. But the level of utter brokenness
that the poster I was replying to spoke of, I've never seen. Across multiple
versions of Python, IDLE, and OSX.

> > From the lurking crowd-- Please don't consider removing IDLE until there
> > is a compelling replacement ready. It's better to have a limited IDE
> > that works everywhere (even if in a limited fashion-- people are free to
> > try out one of the many excellent full-featured Python IDE's out there
> > after they advance to that point) then not.
> >
> 1: I refuse to see why we need a "compelling replacement" for a piece of
> software whose performance might be actively deterring people from
> taking up the language. ["Have you thought about Python?" "Yeah, but I
> tried it {meaning "I downloaded some random Python release and tried
> IDLE, which by modern standards appears completely lame"} and it
> sucked". If this is our standard for "compelling" then it appears the
> command-line interpreter is the competition.
>

The claim that IDLE is "actively deterring" people from taking up Python is
in my opinion unsupported. I know a lot of people who have and do use it,
and I am personally (in my own experience) unaware of anyone who is actively
deterred from using Python because of it. Therefore, I see no negative, and
only a positive of IDLE's presence-- and so I'd want a compelling
replacement available before that positive was wiped out.

Perhaps your experience is different.

So be it: but -- uh, really, Hostile.  I was just sharing my own experience
with using and talking to people who use IDLE. I've found it -- on the mac,
but on other platforms as well -- an adequate but limited sort of IDE. I've
found more issues with it among the people I know who use Windows than Mac
(in particular, details of when the subprocess runs). But my comment was
simply: it has constantly worked for me in the limited use I make of it, and
I have a positive experience with the people I know that have used it.

If your experience is different, that's fine. Perhaps your experience is
more broad, more compelling, and representative of more people.

But I, personally, would consider it a significant loss if IDLE went the way
of the dodo or a third-party module.

-- Stephen


Re: [Python-Dev] [Pythonmac-SIG] sad state of OS X Python testing...

2010-10-05 Thread Stephen Hansen
On Sat, Oct 2, 2010 at 1:37 PM, "Martin v. Löwis" wrote:

> > I'm already running a Jython buildslave on an Intel Mac Pro which is
> > pretty underused - I'd be happy to run a CPython one there too, if
> > it'd be worthwhile.
>
> I think Bill was specifically after Snow Leopard - what system are you
> using?
>

I have a fairly recent MacPro on Snow Leopard, which I keep consistently up
to date, and it's connected all the time. It has more capacity than I can
really find use for.

If it's still needed, I can set up buildbot to run on it today. Is it all
pull/poll oriented, or does the slave need to be connected to by the master?
Meaning, do I need to poke a hole in the firewall to allow any external
access? The BuildBot page only mentions outgoing access (or I'm
misunderstanding it).

IIUC, I just need a name/password to tell buildbot to connect to, right?

-- Stephen


Re: [Python-Dev] [Pythonmac-SIG] sad state of OS X Python testing...

2010-10-08 Thread Stephen Hansen
On Fri, Oct 8, 2010 at 2:42 AM, Antoine Pitrou  wrote:

> On Tue, 5 Oct 2010 10:08:59 -0700
> Stephen Hansen  wrote:
> > On Sat, Oct 2, 2010 at 1:37 PM, "Martin v. Löwis" wrote:
> >
> > > > I'm already running a Jython buildslave on an Intel Mac Pro which is
> > > > pretty underused - I'd be happy to run a CPython one there too, if
> > > > it'd be worthwhile.
> > >
> > > I think Bill was specifically after Snow Leopard - what system are you
> > > using?
> > >
> >
> > I have a fairly recent MacPro on Snow Leopard, which I keep consistently
> > up to date and its connected all the time. It has more capacity then I
> > can really find use for.
>
> Now that the buildbot is up, it is recommended that you try to
> investigate the failures (and the test_ttk_guionly crash), and that you
> create bugs reports on http://bugs.python.org for them.
>

I shall, I just got busy yesterday. :)

The failure is happening just because it can't possibly succeed: I set up
the account for the buildbot in such a way that it has no access to a GUI
context. I'm going to rectify that today so I can properly test TK. The hard
crash I'll report as soon as I have a few minutes to isolate where it's
happening, and thus who should get the report.

--Stephen


Re: [Python-Dev] [Pythonmac-SIG] sad state of OS X Python testing...

2010-10-08 Thread Stephen Hansen
On Fri, Oct 8, 2010 at 8:00 AM, Antoine Pitrou  wrote:

>
> Hi,
>
> > The failure is happening just because it can't possibly succeed, I set
> > up the account for the buildbot in such a way that it has no access to
> > a GUI context. I'm going to rectify that today so I can properly test
> > TK.
>
> Well, a nice thing would be for tests to be properly skipped in this
> situation, rather than fail or crash :) Do you think you can try to
> write a patch for this?
>

Absolutely, that's on my TODO list. First, figuring out the buildslave
control process; then isolating where that crash is happening and reporting
it to whomever (which, I think in a cursory look, is actually the TK folks);
then figuring out how to convert no-GUI-context-possible situations into a
skip.
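
As a rough sketch of that last step (not the approach the CPython test suite
actually takes, which I haven't looked at closely yet), a test class can be
skipped when a Tk root window can't be created. The helper `gui_available`
and the class name `TkSmokeTest` are hypothetical, and this only helps when
Tk fails with a Python exception; a hard crash inside Tk itself cannot be
caught this way.

```python
import unittest


def gui_available():
    """Return True if a Tk root window can be created and destroyed."""
    try:
        import tkinter
        root = tkinter.Tk()
    except Exception:
        # Covers a missing tkinter module and TclError from a missing display.
        return False
    root.destroy()
    return True


HAVE_GUI = gui_available()


@unittest.skipUnless(HAVE_GUI, "no GUI context available")
class TkSmokeTest(unittest.TestCase):
    def test_create_label(self):
        import tkinter
        root = tkinter.Tk()
        self.addCleanup(root.destroy)
        label = tkinter.Label(root, text="hi")
        self.assertEqual(label.cget("text"), "hi")
```

With the decorator on the class, every test in it is reported as skipped
rather than failed when there is no GUI context.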

--S


[Python-Dev] Build failure in test_cmd_line on OSX-x86

2010-10-08 Thread Stephen Hansen
I'm sure this has to be my configuration somehow, but I'm getting a build
failure that I don't quite know how to debug, because I can't reproduce it
when I run the test manually. Any advice would be appreciated-- I'm a
buildslave newbie :-)

I'm referring to
http://www.python.org/dev/buildbot/builders/x86%20Snow%20Leopard%203.x/builds/13/steps/test/logs/stdio

Now, when I saw this the first thing I assumed I should do is to try to
reproduce it, so I did:

  Top-2:~ pythonbuildbot$ cp -R buildarea/3.x.hansen-osx-x86/ ~/test
  Top-2:~ pythonbuildbot$ cd ~/test/build/
  Top-2:build pythonbuildbot$ ./configure --with-pydebug --with-computed-gotos
  [snip]
  Top-2:build pythonbuildbot$ make all
  [snip]

I made sure to go check out the stdio for configure/compile so the compiling
I'd do would be with the same options as the buildslave did. Then:

  Top-2:build pythonbuildbot$ ./python.exe -m test.regrtest -uall test_cmd_line.py
  [1/1] test_cmd_line
  1 test OK.
  [84022 refs]

Doh. So, next I try running the whole test suite:

  Top-2:build pythonbuildbot$ ./python.exe -m test.regrtest -uall
  [snip]
  [ 44/348] test_cmd_line

And it passes too: but I notice it's run way earlier in the test process than
it was on the buildbot, and I remember reading a while ago about a test
failure that happened only when the tests ran in a certain order. So, I
go grab the randseed from the buildbot run and use it:

  Top-2:build pythonbuildbot$ ./python.exe -m test.regrtest -uall -r --randseed=9634655
  == CPython 3.2a2+ (py3k:85321, Oct 8 2010, 08:54:05) [GCC 4.2.1 (Apple Inc. build 5664)]
  ==   Darwin-10.4.0-i386-64bit little-endian
  ==   /Users/pythonbuildbot/test/build/build/test_python_86644
  Using random seed 9634655
  [snip]

And long story short, it gets to 201 and runs test_cmd_line in the same
order as the buildbot did, and it succeeds too, and I curse the gods of the
netherworld, and am stumped as to how to proceed. The same failure happened
in two separate buildbot runs, yet for me, no. Or I'm doing something
differently than the buildbot is, and I can't see what.
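
For what it's worth, the reason I expected the seed to reproduce the order is
the usual seeded-shuffle idea, sketched below. This is not regrtest's actual
code; `shuffled_order` and the short test list are just illustrative.

```python
import random


def shuffled_order(tests, seed):
    """Return a copy of tests shuffled deterministically by seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    order = list(tests)
    rng.shuffle(order)
    return order


tests = ["test_cmd_line", "test_os", "test_sys", "test_ttk_guionly"]
first = shuffled_order(tests, 9634655)
second = shuffled_order(tests, 9634655)
print(first == second)
```

Same seed and same list give the same order, which is why a failure that
survives a matching order points at something outside the test sequence,
such as the environment.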

--Stephen


Re: [Python-Dev] Build failure in test_cmd_line on OSX-x86

2010-10-08 Thread Stephen Hansen
On Fri, Oct 8, 2010 at 10:28 AM, Antoine Pitrou  wrote:

> On Fri, 8 Oct 2010 10:02:59 -0700
> Stephen Hansen wrote:
> >
> > And long story short, it gets to 201 and runs test_cmd_line in the same
> > order as the buildbot did, and it succeeds too, and I curse the gods of
> > the netherworld, and am stumped with how to proceed. Two separate
> > buildbot runs and this same failure happened, yet for me, no. Or I'm
> > doing something differently then the buildbot is, and I can't see what.
>
> The buildbot user probably has different locale settings. I can
> simulate the failure with:
>

I'd find that very surprising: the buildslave is running as the same user I
am running the test under, and the LANG is en_US.UTF-8 -- the default.
Granted, the slave's running under launchd, and so is launching twisted with
the tac directly -- but I can't see any part of that process which would
cause the default LANG to change.

Interestingly enough, I can't reproduce the failure with:

  Top-2:build pythonbuildbot$ PYTHONFSENCODING=latin1 ./python.exe -m test.regrtest -uall test_cmd_line.py
  [1/1] test_cmd_line
  1 test OK.
  [84024 refs]

(and just to test--)

  Top-2:build pythonbuildbot$ PYTHONFSENCODING="utf-8" ./python.exe -m test.regrtest -uall test_cmd_line.py
  [1/1] test_cmd_line
  1 test OK.
  [84024 refs]

But I don't think that environment variable does anything on the Mac; I'm
pretty sure the fs encoding is set as utf-8 and mandated as such in the OS.


> You should therefore see what the locale settings of the buildbot are
> (the LANG and LC_* environment variables). Of course, the test is also
> buggy so you should open an issue on the tracker.
>

I'm just not sure what to say about it or in what way it's being buggy yet,
so can't open an issue :)


> (and the fact that the test doesn't print the actual error message of
> the spawned interpreter is unhelpful)
>

Agreed.

--S


Re: [Python-Dev] Build failure in test_cmd_line on OSX-x86

2010-10-08 Thread Stephen Hansen
On Fri, Oct 8, 2010 at 11:09 AM, Stephen Hansen wrote:

> On Fri, Oct 8, 2010 at 10:28 AM, Antoine Pitrou wrote:
>
>> On Fri, 8 Oct 2010 10:02:59 -0700
>> Stephen Hansen wrote:
>> >
>> > And long story short, it gets to 201 and runs test_cmd_line in the same
>> > order as the buildbot did, and it succeeds too, and I curse the gods of
>> > the netherworld, and am stumped with how to proceed. Two separate
>> > buildbot runs and this same failure happened, yet for me, no. Or I'm
>> > doing something differently then the buildbot is, and I can't see what.
>>
>> The buildbot user probably has different locale settings. I can
>> simulate the failure with:
>>
>
> I'd find that very surprising: the buildslave is running as the same user I
> am running the test under, and the LANG is en_US.UTF-8 -- the default.
> Granted, the slave's running under launchd, and so is launching twisted with
> the tac directly -- but I can't see any part of that process which would
> cause the default LANG to change.
>

I edited the launchd config to force LANG = "en_US.UTF-8" and the test
suddenly passes, which is good. I have no idea why the LANG would end up
different when the app is launched from launchd -- even though it was
running as the same user as I was doing the testing against -- but,
apparently, it was.
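
In hindsight, the quick way to spot this would have been to dump the
locale-related settings from inside both environments and compare them. A
minimal sketch follows; the function name `locale_report` and the choice of
variables to print are my own, and only standard library calls are used.

```python
import locale
import os
import sys


def locale_report(environ=None):
    """Collect the locale-related settings the interpreter is running with."""
    if environ is None:
        environ = os.environ
    report = {name: environ.get(name) for name in ("LANG", "LC_ALL", "LC_CTYPE")}
    report["filesystemencoding"] = sys.getfilesystemencoding()
    report["preferredencoding"] = locale.getpreferredencoding(False)
    return report


for key, value in sorted(locale_report().items()):
    print("{0} = {1!r}".format(key, value))
```

Running this once from a terminal and once from the launchd-started process
should show whether LANG is missing in the latter.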

But, issue4388 and issue9992 seem to already be in, and I commented on them.

Thanks for the help in figuring this out. :)

--Stephen/Ixokai


Re: [Python-Dev] Stable build slaves authority

2010-10-13 Thread Stephen Hansen
On 10/13/10 2:47 PM, Antoine Pitrou wrote:
> (you'll notice that we have currently no 64-bit Windows machine although
> 64-bit support under Windows has specific issues)

Provided it's not a problem that it's a VM, I have a hefty 64-bit Win7
Professional instance that I can put a buildslave on. Despite being a VM
it gets ownership of two cores and 4 gigs of RAM, so should be plenty
fast to handle the load. And I do run it 24/7.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/



___


Re: [Python-Dev] Stable build slaves authority

2010-10-13 Thread Stephen Hansen
On 10/13/10 3:14 PM, "Martin v. Löwis" wrote:
> Am 14.10.2010 00:08, schrieb Stephen Hansen:
>> On 10/13/10 2:47 PM, Antoine Pitrou wrote:
>>> (you'll notice that we have currently no 64-bit Windows machine although
>>> 64-bit support under Windows has specific issues)
>>
>> Provided it's not a problem that it's a VM, I have a hefty 64-bit Win7
>> Professional instance that I can put a buildslave on. Despite being a VM
>> it gets ownership of two cores and 4 gigs of RAM, so should be plenty
>> fast to handle the load. And I do run it 24/7.
> 
> So far, we didn't have problems with VMs.
> 
> Please be aware that Windows poses its own challenges. Often, builds
> or testsuite runs end up with popup windows, which then hang subsequent
> builds. You often get dozens of them to click away. So operating a
> Windows slave is much more tedious than a Unix one.

Windows always poses its own challenges. :) That's why I have the VM
(and three others for older versions of Windows, that just aren't on all
the time like that one is) to begin with*, for testing out my day-job work.

I'll give it a go; I have all the software needed to run the buildbot on
it already besides VC Express, which I'm installing now. If ultimately
it becomes too much of a pain, I'll go back to just providing the mac.
But, I actually have a vested interest in upgrading our Python to 64-bit
in the next few months, so! I'm motivated.

I'll let you know when I have everything installed so you can add a
buildslave account.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/



___


Re: [Python-Dev] Stable build slaves authority

2010-10-13 Thread Stephen Hansen
On 10/13/10 3:42 PM, "Martin v. Löwis" wrote:
>> I'll give it a go; I have all the software needed to run the buildbot on
>> it already besides VC Express, which I'm installing now. If ultimately
>> it becomes too much of a pain, I'll go back to just providing the mac.
>> But, I actually have a vested interest in upgrading our Python to 64-bit
>> in the next few months, so! I'm motivated.
> 
> That won't work, will it? VC Express doesn't come with an AMD64
> compiler (I *think* it's possible to use the SDK one, but this again
> is more complicated).

Oh! Well if it takes a paid version of VS, then I won't be able to do
it. I'll experiment with getting the SDK and using that and seeing if I
can make it work.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/



___


Re: [Python-Dev] Stable build slaves authority

2010-10-13 Thread Stephen Hansen
On 10/13/10 10:28 PM, Jeroen Ruigrok van der Werven wrote:
> -On [20101014 00:55], Brian Curtin ([email protected]) wrote:
>> Correct. There are a few hacky ways to get Express to use the x64 SDK, or
>> so I read.
> 
> I think Martin meant that you wouldn't need VS Express if you install the
> Windows SDK, since it provides all the tools in the SDK to build Python.

There are mixed signals here, and I'm not sure what they all mean. I have
a Win7-64bit box that I am willing to use to run a buildslave, if it's
possible to do so.

#python-dev thought that VS Express was all that was needed; then here,
it seemed to me that Martin said that you needed the full version of VS
or perhaps a complex setup with the SDK compiler; but you seem to be
interpreting Martin as saying that the SDK provides everything and
nothing else is needed.

Then again, on top of that, my offer may be moot -- if Brian Curtin is
going to host an x86_64 Windows slave then I don't need to worry about
this because it's being provided otherwise.

I'm willing to put up with the particular Windows-specific difficulties
that go with running a buildslave (especially with David Bolen's AutoIt
scripts which may ease things): but I'm not entirely sure from these
varied results if it's even possible or needed.

So, my questions are:
  1. Is someone else (Hi, Brian) providing a 64-bit Windows slave, so
there isn't actually any need for me to go through the effort of it?
  2. If not, is the SDK all that's needed to build 64-bit Python?
  3. Or, does one have to use a combination of VS Express + the SDK in a
"hack"y way (as some seem to claim, but this last mail seems to indicate
otherwise) to get it done?

Basically, it comes down to: 'it' being a 64-bit Windows slave, is it
actually needed from me (i.e., is a more apt expert not providing it),
and can anyone actually say what the requirements are for making it
happen? At the moment I'm uncertain if it's even needed or worthwhile to
go through the effort to get the whole Visual Studio environment set up.

I have computing resources, cycles, and time that are free to offer up:
but the differing responses here make me unsure if I'm being useful or
not in trying here :)

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/



___


Re: [Python-Dev] SSH access against buildbot boxes

2010-11-06 Thread Stephen Hansen
On 11/6/10 10:53 AM, Giampaolo Rodolà wrote:
> Personally, I would find this particularly useful for OSX since it's
> one of the few OSes I can't manage to virtualize and which often
> causes me problems.

Although I said this on IRC, I'll repeat the offer to the list for those
not present -- I'm operating the Leopard and Snow Leopard buildslaves,
and although I try to be proactive watching for failures, if someone
wants to test something out before committing they can poke me and I'd
be happy to help.

I can either run a test or two and report back to you, or if you need it
I can open up SSH or even VNC access on a temporary/as-needed basis.
Heck, if you're doing some longer-term work that is more than just
debugging a certain issue and would need access over a longer period of
time, I can probably work something out for you.

I'm just not comfortable opening up such access except on a
person-by-person/case-by-case basis.

I idle on #python-dev as "ixokai" -- you can ping me there and I
generally wake up rather promptly. That, or email works too.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/



___


[Python-Dev] list.__init__() vs. dict.__init__() behaviour

2006-07-15 Thread Stephen Thorne
Hi,

When testing some 'real world' code using pypy, an inconsistency turned up
in the way __init__ works between lists and dicts.

The assumption was made when implementing __init__ for pypy that
list.__init__ and dict.__init__ would both wipe the contents of the
objects, but it seems that in cpython, this isn't precisely the case.

>>> l = [2,3]
>>> list.__init__(l)
>>> l
[]

>>> d = {2: 3}
>>> dict.__init__(d)
>>> d
{2: 3}

dict.__init__(mydict) does not wipe the keys. list.__init__(mylist)
wipes the list's contents.
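
The difference can be reproduced in a few lines on CPython. The `Registry` class below is a hypothetical stand-in for the kind of dict subclass described in this message (one that seeds a key before calling the base constructor); it is not the Twisted code itself.

```python
class Registry(dict):
    """Hypothetical dict subclass that seeds a key before dict.__init__ runs."""

    def __init__(self, *args, **kwargs):
        self["created_by"] = "Registry"
        dict.__init__(self, *args, **kwargs)


items = [2, 3]
list.__init__(items)            # re-initialising a list empties it
print(items)                    # []

mapping = {2: 3}
dict.__init__(mapping)          # re-initialising a dict keeps existing keys
print(mapping)                  # {2: 3}

dict.__init__(mapping, {4: 5})  # with arguments it behaves like update()
print(mapping)                  # {2: 3, 4: 5}

registry = Registry(a=1)
print(sorted(registry))         # ['a', 'created_by']
```

Code that needs the wiping behaviour on both types can call `mapping.clear()` explicitly before `dict.__init__`.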

https://codespeak.net/issue/pypy-dev/issue240

Is there a good reason for this behaviour? It has broken my code (a
subclass of dict that populates a key before calling the superclass's
constructor, in the twisted codebase).

-- 
Stephen Thorne

"Give me enough bandwidth and a place to sit and I will move the world."
  --Jonathan Lange
___


Re: [Python-Dev] GeneratorExit inheriting from Exception

2007-03-07 Thread Stephen Warren
Re: the discussion in:

http://mail.python.org/pipermail/python-dev/2006-March/062823.html

Just as an FYI, the tlslite package (http://trevp.net/tlslite/) got
broken in Python 2.5 and needed the exact fix quoted in the URL above.

It was an easy fix, but the argument isn't hypothetical any more! A
little late to bother changing anything, though.
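
The linked message is not reproduced here, so the following is only a sketch of the general pattern, assuming the breakage was of the usual kind: in Python 2.5 GeneratorExit was a subclass of Exception, so a broad `except Exception` around a `yield` also caught the exception raised by `close()`. Re-raising GeneratorExit explicitly works on every version; from Python 2.6 on it derives from BaseException and the extra clause is no longer required. The `worker` generator is hypothetical.

```python
def worker(log):
    """Hypothetical generator that survives errors thrown into it."""
    try:
        while True:
            try:
                yield
            except GeneratorExit:
                # Needed on Python 2.5, where the broad handler below
                # would otherwise swallow the close() request.
                raise
            except Exception as exc:
                log.append(type(exc).__name__)
    finally:
        log.append("closed")


log = []
gen = worker(log)
next(gen)                       # advance to the first yield
gen.throw(ValueError("boom"))   # handled inside the generator
gen.close()                     # GeneratorExit propagates, finally runs
print(log)
```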
___


Re: [Python-Dev] splitext('.cshrc')

2007-03-08 Thread Stephen Hansen

I'm a long-term lurker and Python coder, and although I've never really
contributed much to the list, I do make a point to keep up on it so I'm
prepared at least when changes come through. This thread's gone on forever,
so I thought I'd offer my opinion :) Mwha.

Ahem.

First of all, I think the current behavior is clearly broken; ".cshrc" is a
file without an extension and marked as 'hidden' according to the
conventions of the operating system. I totally think it should be fixed; but
with others I'm worried about backwards compatibility and more importantly
the possibility of silent failures. Although none of my company's code will
be hit (as I've always done fn.split('.')[-1] just... because it strikes me
as more natural -- then again I'm in a situation where I don't have
user-supplied filenames.), the thought that it's OK to make such changes
even in a 'major' release is a bit disconcerting.
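
To make the two approaches concrete, here is a small comparison. It is a sketch only; the helper name `compare` is hypothetical, and the results reflect how os.path.splitext behaves in the Python releases available today, where a leading dot is treated as part of the root.

```python
import os.path


def compare(name):
    """Return (naive extension via split, extension via os.path.splitext)."""
    naive = name.split(".")[-1]
    root, ext = os.path.splitext(name)
    return naive, ext


for name in (".cshrc", "archive.tar.gz", "README", "notes.txt"):
    print(name, compare(name))
```

Note that the split('.') idiom returns the whole name when there is no dot at all ("README" gives "README"), so it is only safe when every filename is known to carry an extension.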

It's not that I don't think there can be backwards-incompatible changes, but
if at all possible they should be done in such a way that the change causes
a hard failure or at least a clear warning in the offending code. I read
that someone (... No idea who) suggested an optional keyword argument, and
someone else objected to that on the grounds that it would let a second
argument be passed in to alter the signature, and it would no longer throw
an exception as people would be expecting.

Well, I think it was a great idea-- whoever said it :) And it gives the
opportunity to use the transitory period before 3.0 to loudly warn people
about this fix. I don't expect a lot of people will be hit by it, but isn't
that why this whole 2.6-to-3.0 thing is going on?

Why wouldn't this work? I could submit a patch with a doc modification and
tests even :P But it could begin the process of 'fixing' it, and warn people
of the upcoming breakage, and although it slightly complicates the
function... I think it only does it slightly :)

(BTW, it raises a TypeError unless allow_dotfile is passed by keyword,
to address someone's objection that it would alter the signature)

-

import warnings

def splitext(p, **kwargs):
    # allow_dotfile is accepted by keyword only, so a stray second
    # positional argument still raises TypeError as it always did.
    allow_dotfile = kwargs.pop('allow_dotfile', False)

    if kwargs:
        raise TypeError("splitext() takes at most 2 arguments (%s given)"
                        % (1 + len(kwargs)))

    i = p.rfind('.')
    sep = max(p.rfind('/'), p.rfind('\\'))
    if i <= sep:
        fn, ext = p, ''
    else:
        fn, ext = p[:i], p[i:]

    # A dotfile's only dot is the first character of its last path
    # component (this also covers names such as 'dir/.cshrc').
    if i == sep + 1:
        if allow_dotfile:
            return p, ''
        warnings.warn(FutureWarning('In Python 3.0, allow_dotfile '
                                    'default will become True.'))
    return fn, ext

-
___


Re: [Python-Dev] Proposal to revert r54204 (splitext change)

2007-03-15 Thread Stephen Hansen

> For example, I committed a fix for urllib that made it raise IOError
> instead of an AttributeError (which wasn't explicitly raised, of course)
> if a certain error condition occurs.
>
> This is changed behavior too, but if we are to postpone all these fixes
> to 3.0, we won't have half of the fixes in Python 2.6 that are there now.



There's a big difference between that change and this one; that change is
'loud'. It makes noise. It's raising an exception: that exception will
either be handled or will propagate up the stack and be noticed somewhere.

I *think* (ahem.. I read minds...) the problem people are having with this
particular change is the fact that the behavior of this function is being
changed in a way that is completely silent. Code written to expect one kind
of result is now getting a different kind of result... instead of having an
error thrown, a warning given, or something explicit... it's just different
now.

And it'd be so easy to do it in a way which wouldn't be silent... just throw
out a warning, and defer the actual change until the next release.
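
As an illustration of that "warn now, change later" approach, here is a minimal sketch using the standard warnings module. The function `normalize` and its `strip_dots` parameter are hypothetical and stand in for any function whose default is scheduled to change; the sentinel lets the function tell "argument omitted" apart from an explicit choice.

```python
import warnings

_UNSET = object()  # sentinel: distinguishes "not passed" from False


def normalize(value, strip_dots=_UNSET):
    """Hypothetical function whose default behaviour will change later."""
    if strip_dots is _UNSET:
        warnings.warn(
            "normalize(): the default for strip_dots will become True in a "
            "future release; pass it explicitly to keep today's behaviour",
            FutureWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        strip_dots = False
    return value.lstrip(".") if strip_dots else value


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    implicit = normalize(".cshrc")                    # warns, old behaviour
    explicit = normalize(".cshrc", strip_dots=True)   # silent, new behaviour

print(implicit, explicit, len(caught))
```

Callers who pass the argument explicitly never see the warning, so the noise lands only on code that would otherwise change silently.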

Expecting people to keep on top of Misc/NEWS and re-read the documentation
for every function in their code is a tad unreasonable. I don't personally
find it unreasonable for people to ask for a bit more of an extended
migration path when changes that are being implemented will cause *silent*
changes in behavior.

It's been very hard for my company to move from 2.3 to 2.4 as a development
platform as it is, which we're just barely doing now... for this reason I'm
paying a lot more attention to -dev lately to be prepared for 2.6 and
beyond. Not everyone has the time to do that.. there's a lot of messages :)
And Misc/NEWS is *huge*. Warnings are a very useful mechanism for
semi-painless migrations and upgrades...

(And, if I thought it'd have any chance of going in, I'd submit a patch to
add a warning and adjust docs/tests/etc... but this issue seems ever so
divided...)

--S
___


Re: [Python-Dev] Proposal to revert r54204 (splitext change)

2007-03-15 Thread Stephen Hansen

For anyone who is interested, I've submitted a patch (source + docs + tests)
to SF as 1681842, which re-establishes the previous behavior, but adds a
keyword argument to obtain the new behavior and a warning promising the new
behavior will become default in the future.

...which would be my second contribution ever. And the first one to be more
than a line and a half :P

--
Stephen Hansen
Development
Advanced Prepress Technology

[EMAIL PROTECTED]
(818) 748-9282
___


Re: [Python-Dev] Proposal to revert r54204 (splitext change)

2007-03-16 Thread Stephen Hansen

> That may actually be a genuinely useful approach:
>
> splitext(name, ignore_leading_dot=False, all_ext=False)



... that's perfect.  I updated my patch to do it that way! :)
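
The updated patch itself is not shown in this message, so the sketch below is only a guess at what such a signature could mean. It assumes that `ignore_leading_dot=True` keeps leading dots with the root, that `all_ext=True` treats everything from the first remaining dot as the extension, and that '/' is the only path separator. The name `splitext_sketch` is hypothetical.

```python
def splitext_sketch(name, ignore_leading_dot=False, all_ext=False):
    """Hypothetical reading of splitext(name, ignore_leading_dot, all_ext)."""
    slash = name.rfind("/") + 1
    head, base = name[:slash], name[slash:]
    # Optionally set leading dots aside so they never start an extension.
    stripped = base.lstrip(".") if ignore_leading_dot else base
    prefix = base[:len(base) - len(stripped)]
    # all_ext cuts at the first dot, otherwise cut at the last one.
    cut = stripped.find(".") if all_ext else stripped.rfind(".")
    if cut == -1:
        return name, ""
    return head + prefix + stripped[:cut], stripped[cut:]


print(splitext_sketch(".cshrc"))
print(splitext_sketch(".cshrc", ignore_leading_dot=True))
print(splitext_sketch("archive.tar.gz", all_ext=True))
```

With both flags left at False the sketch reproduces the old behaviour, which is the property that makes a keyword-based migration path possible.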

--S
___


Re: [Python-Dev] Get 2.5 changes in now, branch will be frozen soon

2007-03-31 Thread Stephen Hansen

I'm sure everyone remembers the big ol' honking discussion on the change to
os.path.splitext; it sorta fizzled after Guido asked if people would accept a
pronouncement on the subject. I'm not anyone in the Python world, but felt
strongly enough on the particular subject to submit a patch (and later
revisions based upon the evolving conversation), and I think this should get
resolved one way or the other. If the existing "fix" goes in for 2.5.1, then
I'm going to withdraw said patch since I think it would end up doing more
harm than good to silently change the semantics of how the function treats
and defines extensions in one version only to do so again in another.

The options, as I see them, are:
 * The patch that Martin committed to fix the behavior such that
splitext('.cshrc') returned ('.cshrc', '') remains. This would silently
alter the behavior of a function whose tests have held the existing behavior
correct; but it would make the behavior more logical in many situations.
 * The change gets simply reverted for now to return to the previous
behavior, to perhaps be addressed later. This would revert to the status quo
and its definition of 'extension' when it comes to files that have a leading
dot, which may not be long-term desirable but which would at least give time
for a plan and/or decision on the issue without a silent behavior
adjustment.
 * The introduction of a keyword parameter to determine if the function
should treat that initial dot as the start of an extension or a hidden
filename. This would (IMHO) provide a migration path from the existing
behavior which is a bit odd to the new behavior which makes more sense,
while still allowing people to use whichever they choose to be appropriate
for their domain. The question is also open on the issue of warnings; there
was some sentiment that opposed a warning in this case (or in general).

With the last option, another question is what the default should be now,
and whether it will change in the future.

I just wanted to offer a gentle prod to see if a decision can be made; if
any decision requires an adjustment to patches, tests and documentation, I'm
willing to do them. Whatever the decision ends up being.

On 3/29/07, Neal Norwitz <[EMAIL PROTECTED]> wrote:


> This is a reminder that the 2.5 branch will be frozen early next week.
> If there are changes you want to get into 2.5.1, they should be
> checked in within a few days.  Be conservative!  There will be a
> 2.5.2, it's better to wait than to have to make a new release for one
> rushed feature.  If you don't believe, just wait until Anthony shows
> up at your doorstep. :-)
___


Re: [Python-Dev] Get 2.5 changes in now, branch will be frozen soon

2007-03-31 Thread Stephen Hansen

> Anthony Baxter said that the patch wasn't making it into 2.5.1, and
> since he is the release manager, his word is just about as final as
> Guido's (at least regarding the releases he does).



Ah, oops! Work got busy, and I must have missed that in the Endless Threads.

Nevermind then. :)

--S
___


Re: [Python-Dev] PEP 11: Dropping support for ten year old systems

2010-12-06 Thread Stephen Hansen
On 12/6/10 10:55 AM, "Martin v. Löwis" wrote:
> Of course, with these old systems, I really wonder: why do they need
> current Python releases? 2.7 will remain available and maintained for
> some time, and 3.1 will at least see security fixes for some more time -
> something that the base system itself doesn't receive anymore. So
> if you needed a Python release for Solaris 8, you could just use Python
> 2.3, no? We are not going to take the sources of old releases offline.

For things like Solaris 8, I have no thoughts one way or the other-- but
considering Windows XP is hitting 10 years next year-- Personally? My
entirely theoretical timetable for when I think I'll be able to finally
upgrade to Python 3 (where I'll skip to whatever the latest Python 3
is), is actually shorter by at least a few years than my timetable of
when I think I'll be able to drop support of Windows XP. Unfortunately.

WinXP is old but *pervasive* still in my experience with small
businesses / customers. Many aren't even considering making a plan for
Win7 yet.

So if two years rolls around and Python 3.x (where 'x' is 'whatever is
current') isn't supported on Windows XP, I'll be very sad, and will have
to be stuck on Python 3.x-1 for .. awhile, where "awhile" is out of my
control and up to the Masses who are unable or can't be bothered with
fixing what works for them w/ WinXP.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/



___


[Python-Dev] Make test failed issues for phyton 3.2 on centos5.5

2011-04-10 Thread Stephen Yeng
Hello Python team,
I am new to installing Python on CentOS 5.5.
I hope you can help with the issues below, which I get when I run make test.

5 tests failed:
test_argparse test_distutils test_httpservers test_import
test_zipfile
31 tests skipped:
test_bz2 test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp
test_codecmaps_kr test_codecmaps_tw test_curses test_dbm_gnu
test_dbm_ndbm test_gdb test_gzip test_kqueue test_ossaudiodev
test_readline test_smtpnet test_socketserver test_sqlite test_ssl
test_startfile test_tcl test_timeout test_tk test_ttk_guionly
test_ttk_textonly test_urllib2net test_urllibnet test_winreg
test_winsound test_xmlrpc_net test_zipfile64 test_zlib
11 skips unexpected on linux2:
test_bz2 test_dbm_gnu test_dbm_ndbm test_gzip test_readline
test_ssl test_tcl test_tk test_ttk_guionly test_ttk_textonly
test_zlib


Below is the output for the shortest failed test ('test_zip'); if you
allow it, I will also post the full log of the 5 failed tests.

== CPython 3.2 (r32:88445, Apr 10 2011, 11:18:27) [GCC 4.1.2 20080704 (Red Hat 4.1.2-50)]
==   Linux-2.6.18-238.5.1.el5-i686-athlon-with-redhat-5.6-Final little-endian
==   /tmp/Python-3.2/build/test_python_6187
Testing with flags: sys.flags(debug=0, division_warning=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0)
[1/1] test_zipfiles
test test_zipfiles crashed -- <class 'ImportError'>: No module named test_zipfiles
Traceback (most recent call last):
  File "/tmp/Python-3.2/Lib/test/regrtest.py", line 962, in runtest_inner
    the_package = __import__(abstest, globals(), locals(), [])
ImportError: No module named test_zipfiles
1 test failed:
    test_zipfiles

How should I fix the 5 failed tests above? Please help me with that, thank
you.

-- 
If you have any other question about your web portal please contact me. At
N-Pinokyo we value our customers and will be more than happy to assist you
with any other matter related to our service.

Regards,
Stephen Yeng
___


Re: [Python-Dev] Make test failed issues for phyton 3.2 on centos5.5

2011-04-11 Thread Stephen Yeng
Hello,
Thanks for the reply.
This is one of the tests that fail; I hope you can help so I can fix the
other 4 errors. :)
--
Ran 90 tests in 9.191s

FAILED (errors=1, skipped=25)
test test_zipfile failed -- Traceback (most recent call last):
  File "/tmp/Python-3.2/Lib/test/test_zipfile.py", line 497, in test_unicode_filenames
    zipfp.open(name).close()
  File "/tmp/Python-3.2/Lib/zipfile.py", line 978, in open
    close_fileobj=not self._filePassed)
  File "/tmp/Python-3.2/Lib/zipfile.py", line 487, in __init__
    self._decompressor = zlib.decompressobj(-15)
AttributeError: 'NoneType' object has no attribute 'decompressobj'

1 test failed:
test_zipfile
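
The 'NoneType' error arises because zipfile falls back to `zlib = None` when the zlib extension module cannot be imported, which usually means the module was not built. A quick way to see which optional modules a freshly built interpreter lacks is sketched below; the helper name and the module list are illustrative choices, and the likely cause (development headers missing at build time) is an assumption, not something established in this thread.

```python
import importlib.util

# Illustrative selection of modules that depend on external libraries.
OPTIONAL_MODULES = ("zlib", "bz2", "readline", "ssl", "sqlite3", "curses", "tkinter")


def missing_optional_modules(names=OPTIONAL_MODULES):
    """Return the names from `names` that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

If zlib shows up as missing, installing the system's zlib development package and rebuilding Python would be the first thing to try.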


On Mon, Apr 11, 2011 at 4:14 PM, Victor Stinner wrote:

> > [1/1] test_zipfiles
> > test test_zipfiles crashed -- : No module named
> > test_zipfiles
>
> It means that you don't have a module named test_zipfiles. Retry with
> "test_zipfile" :-)
>
> You may open an issue (including details) for your failures.
>
> Victor
>



-- 
If you have any other question about your web portal please contact me. At
N-Pinokyo we value our customers and will be more than happy to assist you
with any other matter related to our service.

Regards,
Stephen Yeng
___


Re: [Python-Dev] Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks

2007-06-23 Thread Stephen Hansen

> The kind of errors I mentioned ("permission denied" errors that
> seem to occur without an obvious reason) have cost me at least
> two weeks of debugging the hard way (with ProcessExplorer etc)
> and caused my manager to lose his trust in Python at all...
> I think it is well worth the effort to keep this trouble away from
> the Python programmers if possible.
>
> And throughout the standard library modules, "open" is used,
> causing these problems as soon as sub-processes come into play.
>
> Apart from shutil.copyfile, other examples of using open that can cause
> trouble are in socket.py (tell me any good reason why socket handles
> should be inherited to child processes) and even in logging.py.
>
> For example, I used RotatingFileHandler for logging my daemon
> program activity. Sometimes, the logging itself caused errors,
> when a still-running child process had inherited the log file handle
> and log rotation occurred.



I just wanted to express to the group at large that these experiences aren't
just Henning's; we spent a *tremendous* amount of time and effort debugging
serious problems that arose from file handles getting shared to subprocesses
where it wasn't really expected. Specifically, the RotatingFileHandler
example above. It blatantly just breaks when subprocesses are used and it's
an extremely obtuse process to discover why.

It was very costly to the company because it came up at a bad time and was
*so* obtuse of an error. At first it looked like some sort of thread-safety
problem, so a lot of prying went into that before we got stumped... after
all, we *knew* no other process touched that file, and the logging module
(and RotatingFileHandler) claimed and looked thread-safe, so.. how could it
be having a Permission Denied error when it very clearly is closing the file
before rotating it? Eventually the culprit was found, but it was very
painful.

A couple similar issues have arisen since, and they're only slightly easier
to debug once you are expecting it. But the fact that the simple and obvious
features provided in the stdlib break as a result of you launching a
subprocess at some point sorta sucks :)

So, yeah. Anything even remotely or vaguely approaching Henning's patch
would be really, really appreciated.
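
For readers on later Python versions: since Python 3.4 (PEP 446) newly created file descriptors are non-inheritable by default, which addresses this class of problem, and inheritance has to be requested explicitly. The sketch below only inspects the flag; the helper name `describe_inheritance` and the log file name are hypothetical.

```python
import os
import tempfile


def describe_inheritance(path):
    """Return (inheritable by default?, inheritable after opting in?)."""
    with open(path, "a") as log_file:
        fd = log_file.fileno()
        default = os.get_inheritable(fd)   # False on Python 3.4 and later
        os.set_inheritable(fd, True)       # explicit opt-in for child processes
        opted_in = os.get_inheritable(fd)
    return default, opted_in


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(describe_inheritance(os.path.join(tmp, "daemon.log")))
```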

--SH
___


Re: [Python-Dev] [poll] New name for __builtins__

2007-11-28 Thread Stephen Hansen
(The lurker awakes...)

> > If not that I suggest something like __inject_builtins__.  This
> > implies it's a command to eval/exec, and doesn't necessarily reflect
> > your current builtins (which are canonically accessible as an
> > attribute of your frame.)
>
> You're misunderstanding the reason why __builtins__ exists at all. It
> is used *everywhere* as the root namespace, not just as a special case
> to inject different builtins.
>
> ATM I'm torn between __root__ and __python__.


Something with the word "global" speaks to its real effect, except that the
word already has an established meaning in Python as being 'global to the
module level', and modifying __builtins__ lets you be "global to the entire
universe of that instance".
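
A short demonstration of that "universe-wide" effect, as a sketch: a name placed in the builtins namespace becomes visible from an unrelated module that never defines or imports it. The example uses the Python 3 module name `builtins`; the helper `visible_everywhere` and the name `universal_greeting` are hypothetical.

```python
import builtins
import types


def visible_everywhere(name, value):
    """Publish `name` in builtins, read it from a fresh module, then clean up."""
    setattr(builtins, name, value)
    try:
        other = types.ModuleType("other_module")
        # The new module's namespace does not define `name`; the lookup
        # falls through module globals to the builtins namespace.
        exec("def read():\n    return " + name, other.__dict__)
        return other.read()
    finally:
        delattr(builtins, name)


print(visible_everywhere("universal_greeting", "hello"))
```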

So I would humbly suggest __universal__. The names within are available
everywhere. 'root' speaks to me too much of trees, and while namespaces may
be tree-like, __root__ alone doesn't say "root namespace"... and
__root_namespace__ is long.

(Then again, long for a feature that should only be used with care isn't a
bad thing)

--Stephen
___


[Python-Dev] rfc822_escape doing the right thing?

2008-01-23 Thread stephen emslie
I've been working on a project that renders PKG-INFO metadata in a
number of ways. I have noticed that fields with any indentation were
flattened out, which is being done in distutils.util.rfc822_escape.
This unfortunately means that you can't use reStructuredText formatting
in your long description (suggested in PEP345), or are limited to a
set that doesn't require indentation (no block quotes, etc.).

It looks like this behavior was intentionally added in rev 20099, but
that was about 7 years ago - before reStructuredText and eggs. I
wonder if it makes sense to re-think that implementation with this
sort of metadata in mind, assuming this behavior isn't required to be
rfc822 compliant. I think it would certainly be a shame to miss out on
a good thing like proper (renderable) reST in our metadata.

A quick example of what I mean:

>>> rest = """
... a literal python block::
...     >>> import this
... """
>>> print distutils.util.rfc822_escape(rest)

        a literal python block::
        >>> import this

should look something like:

        a literal python block::
            >>> import this


Is distutils being over-cautious in flattening out all whitespace? A
w3c discussion on multiple lines in rfc822 [1] seems to suggest that
whitespace can be 'unfolded' safely, so it seems a shame to be
throwing it away when it can have important meaning.
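
The two behaviours can be compared with a small stand-alone sketch. Neither function is copied from distutils: `escape_flattening` is reconstructed from the description in this message (strip every line, then indent continuation lines by 8 spaces), and `escape_preserving` and `unfold` are hypothetical alternatives that keep each line's own indentation.

```python
CONTINUATION = "\n" + 8 * " "  # newline plus the 8-space continuation indent


def escape_flattening(header):
    """Strip every line before re-indenting, losing relative indentation."""
    return CONTINUATION.join(line.strip() for line in header.split("\n"))


def escape_preserving(header):
    """Hypothetical variant: indent continuation lines but keep their own indent."""
    return CONTINUATION.join(header.split("\n"))


def unfold(escaped):
    """Undo the continuation indent added by either function."""
    return escaped.replace(CONTINUATION, "\n")


rest = "\na literal python block::\n    >>> import this\n"
print(escape_flattening(rest))
print(escape_preserving(rest))
```

With the preserving variant, unfolding returns the original text exactly, so the reST literal block survives a round trip through the metadata file.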

[1] http://www.w3.org/Protocols/rfc822/3_Lexical.html

Thanks for any comments

Stephen Emslie
___


Re: [Python-Dev] rfc822_escape doing the right thing?

2008-01-24 Thread stephen emslie
I have created issue #1923 to keep track of this.

Stephen Emslie

On Jan 23, 2008 6:00 PM, Gregory P. Smith <[EMAIL PROTECTED]> wrote:
> could you put this on bugs.python.org and follow up with a reference to the
> issue # for better tracking?
>
>
>
> On 1/23/08, stephen emslie <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > I've been working on a project that renders PKG-INFO metadata in a
> > number of ways. I have noticed that fields with any indentation were
> > flattened out, which is being done in distutils.util.rfc822_escape.
> > This unfortunately means that you cant use reStructuredText formatting
> > in your long description (suggested in PEP345), or are limited to a
> > set that doesn't require indentation (no block quotes, etc.).
> >
> > It looks like this behavior was intentionally added in  rev 20099, but
> > that was about 7 years ago - before reStructuredText and eggs. I
> > wonder if it makes sense to re-think that implementation with this
> > sort of metadata in mind, assuming this behavior isn't required to be
> > rfc822 compliant. I think it would certainly be a shame to miss out on
> > a good thing like proper (renderable) reST in our metadata.
> >
> > A quick example of what I mean:
> >
> > >>> rest = """
> > ... a literal python block::
> > ...     >>> import this
> > ... """
> > >>> print distutils.util.rfc822_escape(rest)
> >
> >         a literal python block::
> >         >>> import this
> >
> > should look something like:
> >
> >         a literal python block::
> >             >>> import this
> >
> >
> > Is distutils being over-cautious in flattening out all whitespace? A
> > w3c discussion on multiple lines in rfc822 [1] seems to suggest that
> > whitespace can be 'unfolded' safely, so it seems a shame to be
> > throwing it away when it can have important meaning.
> >
> > [1] http://www.w3.org/Protocols/rfc822/3_Lexical.html
> >
> > Thanks for any comments
> >
> > Stephen Emslie
>
>


Re: [Python-Dev] [Distutils] How we can get rid of eggs for 2.6 and beyond

2008-04-03 Thread Stephen Waterbury
Phillip J. Eby wrote:
> ... if tools exist and are distributed for such a [PEP 262] "database", 
> and *everybody* agrees to use it as an officially-blessed standard, 
> then it should be possible for setuptools to co-exist with that 
> framework, and we're all happy campers.

I like this idea and the 3 items proposed to accomplish it.

> 2. Update or replace the implementation as appropriate  ...

After some googling and digging around, I found:



Is that what you meant by "the implementation"?

> Questions, comments...  volunteers?   :)

I'll try to help, if this is agreed to and if I'm able.

Steve


Re: [Python-Dev] how to easily consume just the parts of eggs that are good for you

2008-04-10 Thread Stephen Hansen
>
>  > > IMHO, the main system without a package manager is Windows.
>  >
>  > AFAICT the MacOS platform also lacks in this area.
>
> Actually, they both have them.  Windows has Cygwin (rpm-based), while
> for MacOS Fink (deb-based), MacPorts (FreeBSD ports-like), and
> NetBSD's pkgsrc are all viable options if you want packaging support
> for 3rd-party packages.
>

Er, excuse me for cutting in but-- that's just not at all the same thing.

For people who are using a Red Hat derivative, or a Debian derivative, or ..
whatever .. the package manager isn't something they go out of their way to
add and then have to struggle with. It's simply *how* their world works. By
a large measure, everything they want is there, managed by their package
management system.

For those users, I understand well that they don't want some Python package
management system to be a second system that they have to deal with.

But I'm sorry: the world is bigger than Linux and such things.

I'm a mac user, who has had extensive experience in Linux; but on the mac?
Fink and MacPorts added on top of MacOSX is not even remotely comparable to
using an operating system which has a standard package manager that is a
part of every user's daily life. My operating system is Unixy, and comes
pre-installed with a number of things, including Python, wxWidgets, sqlite--
many things that make the programs I make for my customers easy to use.

But for those products that are *not* available standard, where are my mac
customers left? Their options are to install something like Fink, or
MacPorts... and then we come into issues of it wanting to install its *own*
version of python, or its *own* version of these third party things, on top
of what's already there? The alternative is that users have to install,
manually, these third-party requirements themselves.

I've found that it is in general far easier for me to just download and
install stuff manually than to rely on these "Add-on" package managers. At
least if I'm thinking of providing as minimal and unintrusive an experience
as possible for my users to install my product.

Power users, especially those familiar with the Linux world, may relish in
the existence of MacPorts and Fink... Regular people, even IT managers of
companies I have to deal with-- will not.

I love easy_install/setuptools because it lets me get my *Python*
applications and products out to people, regardless of OS, in a way that
just *works*.

I do think it's valuable to do so in a way that will integrate with native
package managers on those operating systems that they are a native and
integrated part of -- but to say, "Let's not re-create apt!" is a sorrowful
stance. It's saying, "Screw Windows, because it isn't as good as what we
have." and "Screw Mac, because it's not as good as we have." Or even, "Screw
the people who aren't power users and are just not going to be able to go
through the effort of adding *on* a non-standard package management system
to their operating system."

There's a whole wide world out there that simply does *not* have a
"package management"(*) system.

Python is beautiful, making Python programs is blissful. I'd be far, far
more concerned with making it easy to distribute Python-based programs to
*any* operating system than I would be concerned with partially redoing what
a *minority* of systems out there have done to make package management (with
dependencies and all) easy for its users.

Python is a cross-platform development environment. Let's not forget that
most people just... don't have Linux... and don't have the equally blissful
world of apt or rpm available to them natively.

It's very cool to *integrate* -- to make a way for those RPM and
DEB distributors to deliver an app in their own way that will satisfy their
needs. But what about the people *without* that native capability? Having a
Python-only distribution/management system like easy_install is a *huge*
boon to getting real products to real people.

I think PJE's idea here is very good. Just include certain files and such in
the RPM/DEB that will satisfy the "python-package-management" system. For
RPM/DEB users and their OS's database of packages, it's largely irrelevant--
they'll still keep using their own system. But if a product needs something
without a .deb or .rpm, or if someone's on an operating system without a
native system-- they can still gather everything they need.

Anyways.

My 2 cents.

--Stephen


Re: [Python-Dev] Proposal: add odict to collections

2008-06-15 Thread Stephen Hansen
> But my point is that we need to focus on finding real use cases and
> exploring how best to solve them.  Otherwise we might as well throw in
> my OrderedSet[1] as-is, despite that it's got no comments and no
> ratings.  Even I don't seem to use it.
>

I'm mostly lurking on these threads, so as to be semi-prepared when new
versions come out.. in fifty years, since we *just* migrated from 2.3 to 2.4
on our product, so. :)

Anyways, we've had an OrderedDictionary sort of implementation in our
library for eons. The product is commercial and a mix between a desktop
application and a web one, with an application server and cross-platform
availability... so there's a slightly bizarre and wide range of uses that
we've found for putting ordered dictionaries to. If some various use-cases
and our reasoning helps, here it is. If not, ignore :)

- One is that the system is modular, with various parts able to be activated
or deactivated in configuration. The order of module load is important, as
we've never quite bothered to put in automatic dependency checking -- that's
just overboard for us. Further, the modules can't really be referred to each
other via "import" or in code, but instead need to go through a layer of
indirection through a name-- so-- the system maintains an ordered dict of
modules, a la sys.modules, with the order determining load when it goes
through to initialize itself.

- There's several more places with a similar pattern; a mapping between
"Component Name" and "Module" for generic systems. A processing host which
is meant to be able to load and run any kind of service or task.

- There's several places where a user is configuring a slightly complex set
of actions-- he gives these actions a name, which is shown to the user, and
then we have the configuration options themselves that we use if that is chosen. It's
just natural/easy to store that in an ordered dict after we pull it out of
the DB, as we want its order to be whatever the user chooses in their setup,
and the key->value association is clear.

- In a modular application with a generic outer interface(meaning the main
app can't even fathom what all it might be asked to load), there's things
like a "Windows" menu item that's easily stored internally as a mapping
between window names and the window object itself, so the menu can be
readily re-generated at will and the window found to switch to.

- In fact, we use a lot of OrderedDictionaries as a part of that UI to
data/configuration mapping, come to think of it. We know the order of
"fields" that someone can search on in the database in advance, and have
them written into the searchUI code as an ordered dict because it just works
nicely. The keys() become a drop-down list, the value a structure
identifying to the central server what field it is they're searching on.

- Fiiinally (sorta), we find passing ordered dictionaries to our Genshi web
templating layer very lovely for making easy-designable web templates for
the web client. We even let customers edit them sometimes!
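
The load-order use case in the first bullet could be sketched roughly like
this. It uses collections.OrderedDict from the standard library, which was
only added after this thread (the odict under discussion was still a
proposal). The module names and the make_init helper are hypothetical.

```python
from collections import OrderedDict

load_log = []  # records the order in which "modules" were initialised


def make_init(name):
    """Return a stand-in initialiser for a named module (illustration only)."""
    def init():
        load_log.append(name)
    return init


# Registration order is the load order; names map to initialisers.
modules = OrderedDict()
for name in ("database", "auth", "reports"):
    modules[name] = make_init(name)

# Initialise in the order the configuration listed them.
for name, init in modules.items():
    init()

print(load_log)
```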

Basically, after looking at all of these, my impressions of an "ordered
dictionary" for the various use cases we use are:

- The ordered dictionary is used almost exclusively in situations where we
are getting the order implicitly from some other source. Be it a SQL query
(with its own ORDER BY statements), a configuration option, the order of
lines in a file, an auto-sequenced table, or hard-coded data. Thus, we've
always found "insertion order" to be important.

- Much to my surprise, we actually aren't ever using an ordered dictionary
in a situation where the value ends up being modified after the dictionary
is loaded.

- The only time we use dictionaries where we are updating them after the
fact and their order is -expected- to change is when we are using a *sorted*
dictionary.

- As such, I'd be quite surprised if I was updating the value of an ordered
dictionary and it were to change its order. Meaning:

  >>> d = odict()
  >>> d["hello"] = 1
  >>> d["there"] = 2
  >>> d["hello"] = 3
  >>> d.keys()
  ['hello', 'there']

And not: ['there', 'hello']

An ordered dictionary that does not simply preserve initial insertion order
strikes me as a *sorted* dictionary-- sorting on insertion time. I'd expect
a sorted dictionary to shift itself around as appropriate. I'd not expect an
ordered dictionary to change the order without some explicit action.

To me, "ordered dictionary" is in fact a *preordered* dictionary. The order
is defined before the data goes in, and the dictionary's job is to just preserve
it.
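
For comparison, a small sketch using collections.OrderedDict, the class
that later landed in the standard library (an assumption relative to this
thread, where odict was still being designed). Updating a value leaves the
key where it was first inserted; moving it takes an explicit call
(move_to_end needs Python 3.2 or later).

```python
from collections import OrderedDict

d = OrderedDict()
d["hello"] = 1
d["there"] = 2
d["hello"] = 3                      # update: position is unchanged

keys_after_update = list(d.keys())

d.move_to_end("hello")              # reordering is an explicit action
keys_after_move = list(d.keys())

print(keys_after_update, keys_after_move)
```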

Anyways. No idea if that'll help the discussion, but a couple people kept
talking about use cases :

[Python-Dev] How to install tile (or any other tcl module)

2004-12-09 Thread Stephen Kennedy

I've been trying to get Tile to work with python.
It can make your tkinter apps look like
http://tktable.sourceforge.net/tile/screenshots/demo-alt-unix.png
See http://tktable.sourceforge.net/tile/

Under linux I built tile from source, installed and it just works.

import Tkinter
root = Tkinter.Tk()
root.tk.call('package', 'require', 'tile')
root.tk.call('namespace', 'import', '-force', 'ttk::*')
root.tk.call('tile::setTheme', 'alt')
### Widgets are now pretty!

Under win32, I installed the binary package into python/tcl
(i.e. python/tcl/tile0.5) with all the other tcl packages, but tcl
can't find it. Any ideas?

Traceback (most recent call last):
  File "Script1.py", line 5, in ?
    root.tk.call('package', 'require', 'tile')
_tkinter.TclError: can't find package tile

Stephen.



[Python-Dev] Re: Zen of Python

2005-01-19 Thread Stephen Thorne
On Wed, 19 Jan 2005 19:03:25 -0500, Timothy Fitz <[EMAIL PROTECTED]> wrote:
> On Thu, 20 Jan 2005 09:03:30 +1000, Stephen Thorne
> <[EMAIL PROTECTED]> wrote:
> > "Flat is better than nested" has one foot in concise powerful
> > programming, the other foot in optimisation.
> >
> > foo.bar.baz.arr involves 4 hashtable lookups. arr is just one hashtable 
> > lookup.
> 
> I find it amazingly hard to believe that this is implying optimization
> over functionality or clarity. There has to be another reason, yet I
> can't think of any.

What I meant to say was, 'flat is better than nested' allows you to
write more concise code, while also writing faster code.
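
A rough illustration of the point, with made-up names (foo, nested_total,
flat_total). The nested version resolves the whole attribute chain on every
iteration; the flat version resolves it once and then uses a local name.
No timing claim is made here, only that both compute the same thing.

```python
from types import SimpleNamespace

# A small nested structure standing in for foo.bar.baz.arr.
foo = SimpleNamespace(bar=SimpleNamespace(baz=SimpleNamespace(arr=[1, 2, 3])))


def nested_total(n):
    total = 0
    for _ in range(n):
        # One name lookup plus three attribute lookups per iteration.
        total += len(foo.bar.baz.arr)
    return total


def flat_total(n):
    arr = foo.bar.baz.arr  # resolve the chain once
    total = 0
    for _ in range(n):
        total += len(arr)  # a single local lookup per iteration
    return total


print(nested_total(10), flat_total(10))
```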

Stephen.


[Python-Dev] string_join overrides TypeError exception thrown in generator

2005-08-14 Thread Stephen Thorne
Hi,

An interesting problem was pointed out to me, which I have distilled
to this testcase:
def gen():
    raise TypeError, "I am a TypeError"
    yield 1

def one(): return ''.join( x for x in gen() )
def two(): return ''.join([x for x in gen()])

for x in one, two:
    try:
        x()
    except TypeError, e:
        print e

Expected output is:
"""
I am a TypeError
I am a TypeError
"""

Actual output is:
"""
sequence expected, generator found
I am a TypeError
"""

Upon looking at the implementation of 'string_join' in
stringobject.c[1], it's quite obvious what's gone wrong: an exception
has been triggered in PySequence_Fast, and string_join overrides that
exception, assuming that the only TypeErrors thrown by PySequence_Fast
are caused by 'orig' being a value that was an invalid sequence type,
ignoring the possibility that a TypeError could be thrown by
exhausting a generator.

    seq = PySequence_Fast(orig, "");
    if (seq == NULL) {
        if (PyErr_ExceptionMatches(PyExc_TypeError))
            PyErr_Format(PyExc_TypeError,
                         "sequence expected, %.80s found",
                         orig->ob_type->tp_name);
        return NULL;
    }

I can't see an obvious solution, but perhaps generators should get
special treatment regardless. Reading over this code it looks like the
generator is exhausted all at once, instead of incrementally.
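
The masking pattern can be mimicked in pure Python. This is only a model of
the C logic quoted above, not the interpreter's code, and the function names
(join_masking, join_propagating, message_of) are invented for the sketch.
The second variant shows one possible way out: replace the error only when
the argument cannot be iterated at all, and let errors raised while
consuming it pass through.

```python
def gen():
    raise TypeError("I am a TypeError")
    yield 1


def join_masking(orig):
    """Any TypeError while materialising orig is replaced by a generic one."""
    try:
        seq = list(orig)
    except TypeError:
        raise TypeError("sequence expected, %.80s found" % type(orig).__name__)
    return "".join(seq)


def join_propagating(orig):
    """Only a failure to obtain an iterator is reported as 'sequence expected'."""
    try:
        it = iter(orig)
    except TypeError:
        raise TypeError("sequence expected, %.80s found" % type(orig).__name__)
    return "".join(list(it))  # errors from the generator body propagate


def message_of(func, arg):
    """Return the TypeError message raised by func(arg), or None."""
    try:
        func(arg)
    except TypeError as exc:
        return str(exc)
    return None


print(message_of(join_masking, gen()))
print(message_of(join_propagating, gen()))
print(message_of(join_propagating, 5))
```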
-- 
Stephen Thorne
Development Engineer

[1] 
http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/stringobject.c?rev=2.231&view=markup


Re: [Python-Dev] [Buildbot-devel] Re: buildbot

2006-01-11 Thread Stephen Davis
> The reason I want static pages is for security concerns. It is not
> easy to tell whether buildbot can be trusted to have no security flaws,
> which might allow people to start new processes on the master,
> or (perhaps worse) on any of the slaves.

I have security concerns as well, but not in buildbot itself.  My  
project is restricted even within the company I work for so I need  
the buildbot web server to only be available to certain people.   
HTTPS access would be nice too.  TwistedWeb doesn't seem to have  
support for either HTTPS or authentication so I've been forced to  
"hide" it by putting it on a non-standard port.  Very weak.

I am no networking expert so the suggestions for using a reverse  
proxy are very welcome and I will look into that right away.  Just  
wanted to add my voice to the security concerns.

stephen


[Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Stephen J. Turnbull
Andreas Maier writes:

 > The problem of the default implementation is that "x is not y"
 > implies "x != y" and that may or may not be true under a sensible
 > definition of equality.

I noticed this a long time ago and just decided it was covered by
"consenting adults".  That is, if the "sensible definition" of x == y
is such that it can be true simultaneously with x != y, it's the
programmer's responsibility to notice that, and to provide an
implementation.  But there's no issue that lack of an explicit
implementation of comparison causes a program to have ambiguous
meaning.
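
A short sketch of the default and of a programmer-provided definition, in
Python 3. The class names Plain and Point are illustrative only.

```python
class Plain:
    """No __eq__ defined: == falls back to identity."""


class Point:
    """Value-based equality supplied by the programmer."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


a, b = Plain(), Plain()
p, q = Point(1, 2), Point(1, 2)

print(a == a, a == b)   # default: equal only to itself
print(p == q, p is q)   # custom: equal but distinct objects
```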

I also consider that for "every object has a value" to make sense as a
description of Python, that value must be representable by an object.
The obvious default representation for the value of any object is the
object itself!

Now, for this purpose you don't need a "canonical representation" of
an object's value.  In particular, equality comparisons need not
explicitly construct a representative object.  Some do, some don't, I
would suppose.  For example, in comparing an integer with a float, I
would convert the integer to float and compare, but in comparing float
and complex I would check the complex for x.im == 0.0, and if true,
return the value of x.re == y.
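
The float/complex strategy described above could be written as follows.
The helper name is hypothetical; .imag and .real are the actual attribute
names on Python's complex type (x.im and x.re above are shorthand). No
intermediate "representative" object is constructed.

```python
def complex_equals_float(x, y):
    """Compare complex x with float y without building a new complex."""
    return x.imag == 0.0 and x.real == y


print(complex_equals_float(complex(2.0, 0.0), 2.0))
print(complex_equals_float(complex(2.0, 1.0), 2.0))
```

The built-in == gives the same answers for these inputs.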

I'm not sure how you interpret "value" to find the behavior of Python
(the default comparison) problematic.  I suspect you'd have a hard
time coming up with an interpretation consistent with Python's object
orientation.

That said, it's probably worth documenting, but I don't know how much
of the above should be introduced into the documentation.

Steve



[Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Stephen J. Turnbull
Andreas Maier writes:

 > A class designer can directly implement what equality means to the
 > class, but he or she cannot implement an accessor method for the
 > value.

Of course she can!  What you mean to say, I think, is that Python does
not insist on an accessor method for the value.  Ie, there is no dunder
method __value__ on instances of class object.



Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Stephen J. Turnbull
Ethan Furman writes:

 > And what would be this 'sensible definition' [of value equality]?

I think that's the wrong question.  I suppose Andreas's point is that
when the programmer doesn't provide a definition, there is no such
thing as a "sensible definition" to default to.  I disagree, but given
that as the point of discussion, asking what the definition is, is moot.

 > 2) The 'is' operator is specialized, and should only rarely be
 >    needed.

Nitpick: Except that it's the preferred way to express identity with
singletons, AFAIK.  ("if x is None: ...", not "if x == None: ...".)
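
One reason for the preference, sketched with a deliberately odd class
(Agreeable is an invented name): == can be overridden by the left-hand
operand, while "is" cannot.

```python
class Agreeable:
    """Claims to be equal to everything."""

    def __eq__(self, other):
        return True


x = Agreeable()

equal_to_none = (x == None)   # asks x.__eq__, which says yes
identical_to_none = (x is None)  # identity test, cannot be overridden

print(equal_to_none, identical_to_none)
```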



Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Stephen J. Turnbull
Rob Cliffe writes:

 > > Why? What value (pun intended) is there in adding an explicit statement
 > > of value to every single class?

 > It troubles me a bit that "value" seems to be a fuzzy concept - it has 
 > an obvious meaning for some types (int, float, list etc.) but for 
 > callable objects you tell me that their value is the object itself,

Value is *abstract* and implicit, but not fuzzy: it's what you compare
when you test for equality.  It's abstract in the sense that "inside
of Python" an object's value has to be an object (everything is an
object).  Now, the question is "do we need a canonical representation
of objects' values?"  Ie, do we need a mapping from every object
conceivable within Python to a specific object that is its value?
Since Python generally allows, even prefers, duck-typing, the answer
presumably is "no".  (Maybe you can think of Python programs you'd
like to write where the answer is "yes", but I don't have any
examples.)  And in fact there is no such mapping in Python.

So the answer I propose is that an object's value needs a
representation in Python, but that representation doesn't need to be
unique.  Any object is a representation of its own value, and if you
need two different objects to be equal to each other, you must define
their __eq__ methods to produce that result.

This (the fact that any object represents its value, and so can be
used as "the" standard of comparison for that value) is why it's so
important that equality be reflexive, symmetric, and transitive, and
why we really want to be careful about creating objects like NaN whose
definition is "my value isn't a value", and therefore "a = float('NaN');
a == a" evaluates to False.
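
The NaN case is easy to check. The sketch also shows a related CPython
behaviour: list membership and list equality test identity before falling
back to ==, so a NaN is still "found" in a list that holds that same object.

```python
a = float("nan")
b = float("nan")   # a second, distinct NaN object

reflexive = (a == a)           # False: NaN is not equal to itself
unequal = (a != a)             # True
found_same_object = a in [a]   # True: identity is checked first
found_other_nan = b in [a]     # False: different object, and b == a is False
lists_equal = ([a] == [a])     # True, again via the identity shortcut

print(reflexive, unequal, found_same_object, found_other_nan, lists_equal)
```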

I agree with Steven d'A that this rule is not part of the language
definition and shouldn't be, but it's the rule of thumb I find hardest
to imagine *ever* wanting to break in my own code (although I sort of
understand why the IEEE 754 committee found they had to).

 > How can we say if an object is mutable if we don't know what its
 > value is?

Mutability is a different question.  You can define a class whose
instances have mutable attributes but nonetheless all compare
equal regardless of the contents of those attributes.

OTOH, the test for mutability is to try to mutate it.  If that doesn't
raise, it's mutable.
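
Both points in a small sketch. AlwaysEqual and can_set_first_item are
invented names, and the mutability probe only covers non-empty sequences;
it is one narrow way of "trying to mutate", not a general test.

```python
class AlwaysEqual:
    """Instances carry mutable state but all compare equal."""

    def __init__(self, payload):
        self.payload = payload

    def __eq__(self, other):
        return isinstance(other, AlwaysEqual)


def can_set_first_item(seq):
    """Try an in-place item assignment; report whether it was allowed."""
    try:
        seq[0] = seq[0]
    except TypeError:
        return False
    return True


x, y = AlwaysEqual([1]), AlwaysEqual([2, 3])
x.payload.append(99)   # mutation does not affect equality

print(x == y)
print(can_set_first_item([1, 2]), can_set_first_item((1, 2)), can_set_first_item("ab"))
```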

Steve


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Stephen J. Turnbull
Chris Angelico writes:

 > The reason NaN isn't equal to itself is because there are X bit
 > patterns representing NaN, but an infinite number of possible
 > non-numbers that could result from a calculation.

I understand that.  But you're missing at least two alternatives that
involve raising on some calculations involving NaN, as well as the
fact that forcing inequality of two NaNs produced by equivalent
calculations is arguably just as wrong as allowing equality of two
NaNs produced by different calculations.  That's where things get
fuzzy for me -- in Python I would expect that preserving invariants
would be more important than computational efficiency, but evidently
it's not.  I assume that I would have a better grasp on why Python
chose to go this way rather than that if I understood IEEE 754 better.



Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Stephen J. Turnbull
Steven D'Aprano writes:

 > I don't think so. Floating point == represents *numeric* equality,

There is no such thing as floating point == in Python.  You can apply
== to two floating point numbers, but == (at the language level)
handles any two numbers, as well as pairs of things that aren't
numbers in the Python language.  So it's a design decision to include
NaNs at all, and another design decision to follow IEEE in giving them
behavior that violates the definition of equivalence relation for ==.

 > In an early post, you suggested that NANs don't have a value, or that 
 > they have a value which is not a value. I don't think that's a good way 
 > to look at it. I think the obvious way to think of it is that NAN's 
 > value is Not A Number, exactly like it says on the box. Now, if 
 > something is not a number, obviously you cannot compare it numerically:

And if Python can't do something you ask it to do, it raises an
exception.  Why should this be different?  Obviously, it's a question of
expedience.
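
To make the contrast concrete, here is what Python 3 does in a few nearby
cases: comparing with a NaN quietly returns False, while an ordering
comparison between an int and a str, or math.sqrt of a negative number,
raises. The helper name raises is made up for the sketch.

```python
import math

nan = float("nan")


def raises(exc_type, func):
    """Return True if calling func() raises exc_type."""
    try:
        func()
    except exc_type:
        return True
    return False


nan_less = nan < 1.0        # False, no exception
nan_greater = nan > 1.0     # False, no exception
nan_equal = nan == nan      # False, no exception

int_vs_str = raises(TypeError, lambda: 1 < "a")       # unorderable types raise
sqrt_neg = raises(ValueError, lambda: math.sqrt(-1))  # domain error raises

print(nan_less, nan_greater, nan_equal, int_vs_str, sqrt_neg)
```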

 > I'm not sure what you're referring to here. Is it that containers such 
 > as lists and dicts are permitted to optimize equality tests with 
 > identity tests for speed?

No, when I say I'm fuzzy I'm referring to the fact that although I
understand the logical rationale for IEEE 754 NaN behavior, I don't
really understand the ins and outs well enough to judge for myself
whether it's a good idea for Python to follow that model and turn ==
into something that is not an equivalence relation.

I'm not going to argue for a change, I just want to know where I stand.

 > Basically, and I realise that many people disagree with their decision 
 > (notably Bertrand Meyer of Eiffel fame, and our own Mark
 > Dickinson),

Indeed.  So "it's the standard" does not mean there is a consensus of
experts.  I'm willing to delegate to a consensus of expert opinion,
but not when some prominent local expert(s) disagree -- then I'd like
to understand well enough to come to my own conclusions.



Re: [Python-Dev] sum(...) limitation

2014-08-09 Thread Stephen J. Turnbull
Alexander Belopolsky writes:

 > Why have builtin sum at all if its use comes with so many caveats?

Because we already have it.  If the caveats had been known when it was
introduced, maybe it wouldn't have been.  The question is whether you
can convince python-dev that it's worth changing the definition of
sum().  IMO that's going to be very hard to do.  All the suggestions
I've seen so far are (IMHO, YMMV) just as ugly as the present
situation.



Re: [Python-Dev] sum(...) limitation

2014-08-10 Thread Stephen J. Turnbull
Alexander Belopolsky writes:
 > On Sat, Aug 9, 2014 at 3:08 AM, Stephen J. Turnbull 
 > wrote:
 > 
 > > All the suggestions
 > > I've seen so far are (IMHO, YMMV) just as ugly as the present
 > > situation.
 > >
 > 
 > What is ugly about allowing strings?  CPython certainly has a way to
 > make sum(x, '')

sum(it, '') itself is ugly.  As I say, YMMV, but in general last I
heard arguments that are usually constants drawn from a small set of
constants are considered un-Pythonic; a separate function to express
that case is preferred.  I like the separate function style.

And that's the current situation, except that in the case of strings
it turns out to be useful to allow for "sums" that have "glue" at the
joints, so it's spelled as a string method rather than a builtin: eg,
", ".join(paramlist).
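
The current situation in CPython, as a quick sketch: sum() with a string
start value is rejected with a TypeError, str.join handles strings (with
optional glue), and sum() still works for other sequence types such as
lists. The paramlist contents are made up.

```python
paramlist = ["a", "b", "c"]

glued = ", ".join(paramlist)   # strings: join, with glue at the joints
plain = "".join(paramlist)     # or with no glue at all

try:
    sum(paramlist, "")
    sum_of_strings_allowed = True
except TypeError:
    sum_of_strings_allowed = False

summed_lists = sum([[1], [2], [3]], [])   # allowed, though it copies repeatedly

print(glued, plain, sum_of_strings_allowed, summed_lists)
```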

Actually ... if I were a fan of the "".join() idiom, I'd seriously
propose 0.sum(numeric_iterable) as the RightThang[tm].  Then we could
deprecate "".join(string_iterable) in favor of "".sum(string_iterable)
(with the same efficient semantics).



Re: [Python-Dev] class Foo(object) vs class Foo: should be clearly explained in python 2 and 3 doc

2014-08-10 Thread Stephen J. Turnbull
Chris Angelico writes:

 > The justification is illogical. However, I personally believe
 > boilerplate should be omitted where possible;

But it mostly can't be omitted.  I wrote 22 classes (all trivial)
yesterday for a Python 3 program.  Not one derived directly from
object.  That's a bit unusual, but in the three longish scripts I have
to hand, not one had more than 30% "new" classes derived from object.

As a matter of personal style, I don't use optional positional
arguments (with a few "traditional" exceptions); if I omit one most of
the time, when I need it I use a keyword.  That's not an argument,
it's just an observation that's consistent with support for using
an explicit parent class of object "most of the time".

 > that's why we have a whole lot of things that "just work". Why does
 > Python not have explicit boolification for if/while checks?

Because it does have explicit boolification (signaled by the control
structure syntax itself).  No?  I don't think this is less explicit
than REXX, because it doesn't happen elsewhere (10 + False == 10 --
not True, and even bool(10) + False is the int 1 rather than the bool True).
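
For reference, this is what the expressions in question evaluate to in
Python 3. Truth testing happens in the if/while (and conditional
expression) syntax; arithmetic simply treats bools as the ints 0 and 1.

```python
total = 10 + False          # arithmetic: False acts as 0, result is 10
mixed = bool(10) + False    # True + False is the int 1

branch = "taken" if 10 else "skipped"   # the control syntax does the truth test

print(total, type(total).__name__)
print(mixed, type(mixed).__name__, mixed == True)
print(branch)
```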

 > So, my view would be: Py3-only tutorials can and probably should omit
 > it,

But this doesn't make things simpler.  It means that there are two
syntaxes to define some classes, and you want to make one of them
TOOWTDI for classes derived directly from object, and the other
TOOWTDI for non-trivial subclasses.  I'll grant that in some sense
it's no more complex, either, of course.
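
In Python 3 the two spellings produce the same kind of class, which can be
checked directly (Implicit, Explicit and Child are illustrative names). In
Python 2 the first form would create an old-style class instead, which is
what the thread's subject line is about.

```python
class Implicit:
    pass


class Explicit(object):
    pass


class Child(Explicit):   # non-trivial subclasses always name their parent
    pass


print(Implicit.__bases__ == Explicit.__bases__ == (object,))
print(Child.__mro__ == (Child, Explicit, object))
```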

Note that taken to extremes, your argument could be construed as "we
should define defaults for all arguments and omit them where possible".

Of course for typing in quick programs, and for trivial classes,
omitting the derivation from object is a useful convenience.  But I
don't think it's something that should be encouraged in tutorials.

Steve


Re: [Python-Dev] sum(...) limitation

2014-08-10 Thread Stephen J. Turnbull
Glenn Linderman writes:

 > On 8/10/2014 1:24 AM, Stephen J. Turnbull wrote:
 > > Actually ... if I were a fan of the "".join() idiom, I'd seriously
 > > propose 0.sum(numeric_iterable) as the RightThang[tm].  Then we could
 > > deprecate "".join(string_iterable) in favor of "".sum(string_iterable)
 > > (with the same efficient semantics).

 > Actually, there is no need to wait for 0.sum() to propose "".sum... but 
 > it is only a spelling change, so no real benefit.

IMO it's worse than merely a spelling change, because (1) "join" is a
more evocative term for concatenating strings than "sum" and (2) I
don't know of any other sums that allow "glue".

I'm overall -1 on trying to change the current situation (except for
adding a join() builtin or str.join class method).  We could probably
fix everything in a static-typed language (because that would allow
picking an initial object of the appropriate type), but without that
we need to pick a default of some particular type, and 0 makes the
most sense.

I can understand the desire of people who want to use the same syntax
for summing an iterable of numbers and for concatenating an iterable
of strings, but to me they're really not even formally the same in
practical use.  I'm very sympathetic to Steven's explanation that "we
wouldn't be having this discussion if we used a different operator for
string concatenation".  Although that's not the whole story: in
practice even numerical sums get split into multiple functions because
floating point addition isn't associative, and so needs careful
treatment to preserve accuracy.  At that point I'm strongly +1 on
abandoning attempts to "rationalize" summation.
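
A minimal sketch of the accuracy point, assuming IEEE 754 doubles (the
helper name naive_sum is only illustrative).  It shows that grouping
changes a float result, and that the standard library already keeps a
separate function, math.fsum(), for accurate float summation:

```python
import math

# Floating point addition is not associative: grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)

def naive_sum(values):
    """Plain left-to-right accumulation, one __add__ per item."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [0.1] * 10
print(naive_sum(values))    # accumulates rounding error
print(math.fsum(values))    # tracks the lost low-order bits
```

The explicit loop is used instead of the sum() builtin because the
builtin's float handling is an implementation detail that may differ
between Python versions.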

I'm not sure how I'd feel about raising an exception if you try to sum
any iterable containing misbehaved types like float.  But not only
would that be a Python 4 effort due to backward incompatibility, but
it sorta contradicts the main argument of proponents ("any type
implementing __add__ should be sum()-able").



Re: [Python-Dev] sum(...) limitation

2014-08-11 Thread Stephen J. Turnbull
Chris Barker - NOAA Federal writes:

 > Is there anything in the language spec that says string concatenation is
 > O(n^2)? Or for that matter any of the performance characteristics of built-in
 > types? Those strike me as implementation details that SHOULD be particular to
 > the implementation.

Container concatenation isn't quadratic in Python at all.  The naive
implementation of sum() as a loop repeatedly calling __add__ is
quadratic for them.  Strings (and immutable containers in general) are
particularly horrible, as they don't have __iadd__.
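
To make the "loop repeatedly calling __add__" concrete, here is a rough
sketch.  naive_sum is a hypothetical stand-in for what sum() means, not
the actual CPython implementation:

```python
def naive_sum(iterable, start):
    """Sum as a plain loop over __add__."""
    total = start
    for item in iterable:
        # For immutable containers this generally builds a new object
        # holding everything accumulated so far.
        total = total + item
    return total

words = ["spam", "eggs", "ham"]

# Works, but in general the total copying grows quadratically.
print(naive_sum(words, ""))

# The builtin refuses strings and points at the linear-time idiom.
try:
    sum(words, "")
except TypeError as exc:
    print(exc)

print("".join(words))
```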

You could argue that sum() being a function of an iterable isn't just
a calling convention for a loop encapsulated in a function, but rather
a completely different kind of function that doesn't imply anything
about the implementation, and therefore that it should dispatch on
type(it).  But explicitly dispatching on type(x) is yucky (what if
somebody wants to sum a different type not currently recognized by the
sum() builtin?) so, obviously, we should define a standard __sum__
dunder!  IMO we'd also want a homogeneous_iterable ABC, and a concrete
homogeneous_iterable_of_TYPE for each sum()-able TYPE to help users
catch bugs injecting the wrong type into an iterable_of_TYPE.

But this still sucks.  Why?  Because obviously we'd want the
attractive nuisance of "if you have __add__, there's a default
definition of __sum__" (AIUI, this is what bothers Alexander most
about the current situation, at least of the things he's mentioned, I
can really sympathize with his dislike).  And new Pythonistas and lazy
programmers who only intend to use sum() on "small enough" iterables
will use the default, and their programs will appear to hang on
somewhat larger iterables, or a realtime requirement will go
unsatisfied when least expected, or ...  If we *don't* have that
property for sum(), ugh!  Yuck!  Same old same old!  (IMHO, YMMV of
course)

It's possible that Python could provide some kind of feature that
would allow an optimized sum function for every type that has __add__,
but I think this will take a lot of thinking.  *Somebody* will do it
(I don't think anybody is +1 on restricting sum() to a subset of types
with __add__).  I just think we should wait until that somebody appears.
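
For illustration only, a sketch of what dispatching on a __sum__ dunder
might look like.  No such protocol exists in Python; generic_sum, the
__sum__ hook, and the Text class are all invented here, and the default
loop is exactly the "attractive nuisance" fallback discussed above:

```python
def generic_sum(iterable, start=0):
    """Hypothetical sum() that lets the start type pick the algorithm."""
    hook = getattr(type(start), "__sum__", None)   # invented dunder
    if hook is not None:
        return hook(start, iterable)
    # Default definition in terms of __add__ (possibly quadratic).
    total = start
    for item in iterable:
        total = total + item
    return total

class Text(str):
    """Toy str subclass that opts in to an efficient __sum__."""
    def __sum__(self, iterable):
        return Text(self + "".join(iterable))

print(generic_sum([1, 2, 3]))
print(generic_sum(["a", "b"], Text()))
```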

 > Should we cripple the performance of some operation in Cpython so that it
 > won't work better than Jython?

Nobody is crippling operations.  We're prohibiting use of a *name* for
an operation that is associated (strongly so, in my mind) with an
inefficient algorithm in favor of the *same operation* by a different
name (which has no existing implementation, and therefore Python
implementers are responsible for implementing it efficiently).  Note:
the "inefficient" algorithm isn't inefficient for integers, and it
isn't inefficient for numbers in general (although it's inaccurate for
some classes of numbers).

 > Seems the same argument [that Python language doesn't prohibit
 > optimizations in particular implementations just because they
 > aren't made in others] could be made for sum(list_of_strings).

It could.  But then we have to consider special-casing every builtin
type that provides __add__, and we impose an unobvious burden on user
types that provide __add__.

 > > It seems pretty pedantic to say: we could make this work well,
 > > but we'd rather chide you for not knowing the "proper" way to do
 > > it.

Nobody disagrees.  But backward compatibility gets in the way.

 > But sum() is not inherently quadratic -- that's a limitation of the
 > implementation.

But the faulty implementation is the canonical implementation, the
only one that can be defined directly in terms of __add__, and it is
efficient for non-container types.[1]

 > "".join _could_ be naively written with the same poor performance
 > -- why should users need to understand why one was optimized and
 > one was not?

Good question.  They shouldn't -- thus the prohibition on sum()ing
strings.

 > That is a very important lesson to learn, sure, but python is not
 > only a teaching language. People will need to learn those lessons
 > at some point, this one feature makes little difference.

No, it makes a big difference.  "If you can do something, then it's OK
to do it" is a principle Python tries to implement.  If sum() works for
everything with an __add__, given current Python language features
some people are going to end up with very inefficient code and it will
bite some of them (and not necessarily the authors!) at some time.

If it doesn't work for every type with __add__, why not?  You'll end
up playing whack-a-mole with type prohibitions.  Ugh.

 > Sure, but I think all that does is teach people about a cpython specific
 > implementation -- and I doubt naive users get any closer to understanding
 > algorithmic complexity -- all they learn is you should use string.join().
 > 
 > Oh well, not really that big a deal.

Not to Python.  Maybe not to you.  But I've learned a lot about
Pythonic ways of doing things trying to channe

Re: [Python-Dev] sum(...) limitation

2014-08-11 Thread Stephen J. Turnbull
Ethan Furman writes:
 > On 08/11/2014 08:50 PM, Stephen J. Turnbull wrote:
 > > Chris Barker - NOAA Federal writes:
 > >
 > >> It seems pretty pedantic to say: we could make this work well,
 > >> but we'd rather chide you for not knowing the "proper" way to do
 > >> it.
 > >
 > > Nobody disagrees.  But backward compatibility gets in the way.
 > 
 > Something that currently doesn't work, starts to.  How is that a
 > backward compatibility problem?

I'm referring to removing the unnecessary information that there's a
better way to do it, and simply raising an error (as in Python 3.2,
say) which is all a RealProgrammer[tm] should ever need!

That would be a regression and backward incompatible.



Re: [Python-Dev] sum(...) limitation

2014-08-12 Thread Stephen J. Turnbull
Redirecting to python-ideas, so trimming less than I might.

Chris Barker writes:
 > On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull 
 > wrote:
 > 
 > > I'm referring to removing the unnecessary information that there's a
 > >  better way to do it, and simply raising an error (as in Python 3.2,
 > > say) which is all a RealProgrammer[tm] should ever need!
 > >
 > 
 > I can't imagine anyone is suggesting that -- disallow it, but don't tell
 > anyone why?

As I said, it's a regression.  That's exactly the behavior in Python 3.2.

 > The only thing that is remotely on the table here is:
 > 
 > 1) remove the special case for strings -- buyer beware -- but consistent
 > and less "ugly"

It's only consistent if you believe that Python has strict rules for
use of various operators.  It doesn't, except as far as they are
constrained by precedence.  For example, I have an application where I
add bytestrings bytewise modulo N <= 256, and concatenate them.  In
fact I use function call syntax, but the obvious operator syntax is
'+' for the bytewise addition, and '*' for the concatenation.

It's not in the Zen, but I believe in the maxim "If it's worth doing,
it's worth doing well."  So for me, 1) is out anyway.

 > 2) add a special case for strings that is fast and efficient -- may be as
 > simple as calling "".join() under the hood --no more code than the
 > exception check.

Sure, but what about all the other immutable containers with __add__
methods?  What about mappings with key-wise __add__ methods whose
values might be immutable but have __add__ methods?  Where do you stop
with the special-casing?  I consider this far more complex and ugly
than the simple "sum() is for numbers" rule (and even that is way too
complex considering accuracy of summing floats).

 > And I doubt anyone really is pushing for anything but (2)

I know that, but I think it's the wrong solution to the problem (which
is genuine IMO).  The right solution is something generic, possibly a
__sum__ method.  The question is whether that leads to too much work
to be worth it (eg, "homogeneous_iterable").

 > > Because obviously we'd want the attractive nuisance of "if you
 > > have __add__, there's a default definition of __sum__"
 > 
 > now I'm confused -- isn't that exactly what we have now?

Yes and my feeling (backed up by arguments that I admit may persuade
nobody but myself) is that what we have now kinda sucks[tm].  It
seemed like a good idea when I first saw it, but then, my apps don't
scale to where the pain starts in my own usage.

 > > It's possible that Python could provide some kind of feature that
 > > would allow an optimized sum function for every type that has
 > > __add__, but I think this will take a lot of thinking.
 > 
 > does it need to be every type? As it is the common ones work fine already
 > except for strings -- so if we add an optimized string sum() then we're
 > done.

I didn't say provide an optimized sum(), I said provide a feature
enabling people who want to optimize sum() to do so.  So yes, it needs
to be every type (the optional __sum__ method is a proof of concept,
modulo it actually being implementable ;-).

 > > *Somebody* will do it (I don't think anybody is +1 on restricting
 > > sum() to a subset of types with __add__).
 > 
 > uhm, that's exactly what we have now

Exactly.  Who's arguing that the sum() we have now is a ticket to
Paradise?  I'm just saying that there's probably somebody out there
negative enough on the current situation to come up with an answer
that I think is general enough (and I suspect that python-dev
consensus is that demanding, too).

 > sum() can be used for any type that has an __add__ defined.

I'd like to see that be mutable types with __iadd__.

 > What I fail to see is why it's better to raise an exception and
 > point users to a better way, than to simply provide an optimization
 > so that it's a moot issue.

Because inefficient sum() is an attractive nuisance, easy to overlook,
and likely to bite users other than the author.

 > The only justification offered here is that will teach people that summing
 > strings (and some other objects?)

Summing tuples works (with appropriate start=tuple()).  Haven't
benchmarked, but I bet that's O(N^2).
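
A small sketch of the tuple case (not a benchmark).  The
itertools.chain alternative is a suggestion added here, not something
proposed in the thread:

```python
from itertools import chain

pieces = [(1, 2), (3,), (4, 5)]

# sum() accepts tuples once the start value is a tuple, but each
# addition builds a new tuple holding everything so far.
print(sum(pieces, ()))

# chain.from_iterable visits each element once instead.
print(tuple(chain.from_iterable(pieces)))
```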

 > is order(N^2) and a bad idea. But:
 > 
 > a) Python's primary purpose is practical, not pedagogical (not that it
 > isn't great for that)

My argument is that in practical use sum() is a bad idea, period,
until you book up on the types and applications where it *does* work.
N.B. It doesn't even work properly for numbers (inac

Re: [Python-Dev] Bytes path support

2014-08-19 Thread Stephen J. Turnbull
Ben Hoyt writes:

 > Fair enough. I don't quite understand, though -- why is the "official
 > policy" to kill something that's "essential" on *nix?

They're not essential on *nix.  Unix paths at the OS level are "just
bytes" (even on Mac, although the most common Mac filesystem does
enforce UTF-8 Unicode NFD).  This use case is now perfectly well
served by codecs.

However, there are a lot of applications that involve reading a file
name from a directory, and passing it verbatim to another OS
function.  This case can be handled now using the surrogateescape
error handler, but when these APIs were introduced we didn't even have
a reliable way to roundtrip filenames because a Unix filename doesn't
need to be a string of characters from *any* character set.
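
A minimal sketch of the round trip that surrogateescape makes possible,
assuming a file name that is not valid UTF-8 (the name itself is made
up).  On POSIX, os.fsdecode() and os.fsencode() apply the same handler
with the filesystem encoding:

```python
raw_name = b"caf\xe9.txt"          # Latin-1 bytes, not valid UTF-8

# Undecodable bytes become lone surrogates in U+DC80..U+DCFF ...
text_name = raw_name.decode("utf-8", errors="surrogateescape")
print(ascii(text_name))

# ... and encoding with the same handler restores the original bytes.
round_tripped = text_name.encode("utf-8", errors="surrogateescape")
print(round_tripped == raw_name)
```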

And there's the undeniable convenience of treating file names as
opaque objects in those applications.

Regards,



Re: [Python-Dev] Bytes path support

2014-08-19 Thread Stephen J. Turnbull
Greg Ewing writes:
 > Stephen J. Turnbull wrote:
 > 
 > > This case can be handled now using the surrogateescape
 > > error handler,
 > 
 > So maybe the way to make bytes paths go away is to always
 > use surrogateescape for paths on unix?

Backward compatibility rules that out, I think.  I certainly would
recommend that for new code, but even for new code there are many
users who vehemently object to using Unicode as an intermediate
representation of things they think of as binary blobs.  Not worth the
hassle to even seriously propose removing those APIs IMO.


Re: [Python-Dev] Bytes path support

2014-08-19 Thread Stephen J. Turnbull
Guido van Rossum writes:
 > On Tuesday, August 19, 2014, Stephen J. Turnbull  wrote:
 > > Greg Ewing writes:

 > >  > So maybe the way to make bytes paths go away is to always
 > >  > use surrogateescape for paths on unix?
 > >
 > > Backward compatibility rules that out, I think.  I certainly would
 > > recommend that for new code, but even for new code there are many
 > > users who vehemently object to using Unicode as an intermediate
 > > representation of things they think of as binary blobs.  Not worth the
 > > hassle to even seriously propose removing those APIs IMO.
 > 
 > But maybe we don't have to add new ones?

IMO, we should avoid it.

There may be some use cases.  Sergiy mentions two bug reports.

http://bugs.python.org/issue19997 imghdr.what doesn't accept bytes paths
http://bugs.python.org/issue20797 zipfile.extractall should accept bytes path 
as parameter

I'm very unsympathetic to these.  In both cases the bytes are coming
from outside of module in question.  Why are they in bytes?  That
question should scare you, because from the point of view of end users
there are no good answers: they all mean that the end user is going to
end up with uninterpretable bytes in their directories, for the
convenience of the programmer.

In the case of issue20797, I'd be a *little* sympathetic if the RFE
were for the *members* argument.  zipfiles evidently have no way to
specify the encodings of the name(s) of their members (and the zipfile
module doesn't have APIs for it!), so the programmer is kind of stuck,
especially if the requirement is that the extraction require no user
intervention.  But again, this is rarely what the user wants.

I would be sympathetic to an internal, bytes-based, "kids these stunts
are performed by trained professionals do NOT try this at home" API,
with a sane user-oriented str-based API for ordinary use for this
module.  I suppose it might be useful for such a multi-type API to be
polymorphic, but it would have to be a "if there are bytes anywhere,
everything must be bytes and return values will be bytes" and
similarly for str kind of polymorphism.  No mixing bytes and strings,
period.
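
A rough sketch of that "all bytes or all str" kind of polymorphism.
join_name is a hypothetical helper and the "/" separator is an
assumption; os.path.join() enforces the same no-mixing rule:

```python
def join_name(directory, name):
    """Join two path parts; both must be str or both must be bytes."""
    if isinstance(directory, str) and isinstance(name, str):
        sep = "/"
    elif isinstance(directory, bytes) and isinstance(name, bytes):
        sep = b"/"
    else:
        raise TypeError("cannot mix str and bytes path components")
    # The return type always matches the argument type.
    return directory.rstrip(sep) + sep + name

print(join_name("archive/", "member.txt"))
print(join_name(b"archive", b"member.txt"))
```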





Re: [Python-Dev] Bytes path support

2014-08-19 Thread Stephen J. Turnbull
Marko Rauhamaa writes:

 > Unix programmers, though, shouldn't be shielded from bytes.

Nobody's trying to do that.  But Python users should be shielded from
Unix programmers.


Re: [Python-Dev] Bytes path support

2014-08-20 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > One idea I had along those lines is a surrogatereplace error handler (
 > http://bugs.python.org/issue22016) that emitted an ASCII question mark for
 > each smuggled byte, rather than propagating the encoding problem.

Please, don't.

"Smuggled bytes" are not independent events.  They tend to be
correlated *within* file names, and this handler would generate names
whose human semantics get lost (and there *are* human semantics,
otherwise the name would be str(some_counter)).  They tend to be
correlated across file names, and this handler will generate multiple
files with the same munged name (and again, the differentiating human
semantics get lost).
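
To illustrate the collision problem, a sketch with two made-up Latin-1
file names.  munge() only imitates what a '?'-per-byte handler would
produce; it is not the handler proposed in issue22016:

```python
names = [b"r\xe9sum\xe9.txt", b"r\xe8sum\xe8.txt"]   # two distinct files

def munge(raw):
    """Stand-in for the proposed handler: '?' per undecodable byte."""
    text = raw.decode("utf-8", errors="surrogateescape")
    return "".join("?" if "\udc80" <= ch <= "\udcff" else ch for ch in text)

munged = [munge(n) for n in names]
print(munged)    # both names collapse to the same string
```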

If you don't know the semantics of the intended file names, you can't
generate good replacement names.  This has to be an application-level
function, and often requires user intervention to get good names.

If you want to provide helper functions that applications can use to
clean names explicitly, that might be OK.



Re: [Python-Dev] Bytes path support

2014-08-21 Thread Stephen J. Turnbull
Marko Rauhamaa writes:

 > My point is that the poor programmer cannot ignore the possibility of
 > "funny" character sets.

*Poor* programmers do it all the time.  That's why Python codecs raise
when they encounter bytes they can't handle.

 > If Python tried to protect the programmer from that possibility,

I don't understand your point.  The existing interfaces aren't going
anywhere, and they're enough to do anything you need to do.  Although
there are a few radicals (like me in a past life :-) who might like to
see them go away in favor of opt-in to binary encoding via
surrogateescape error handling, nobody in their right mind supports
us.

The question here is not about going backward, it's about whether to
add new bytes APIs, and which ones.


Re: [Python-Dev] Bytes path support

2014-08-21 Thread Stephen J. Turnbull
Chris Barker - NOAA Federal writes:

 > This brings up the other key problem. If file names are (almost)
 > arbitrary bytes, how do you write one to/read one from a text file
 > with a particular encoding? ( or for that matter display it on a
 > terminal)

"Very carefully."

But this is strictly from need.  *Nobody* (with the exception of the
crackers who like to name their programs things like "\u0007") *wants*
to do this.  Real people want to name their files in some human
language they understand, and spell it in the usual way, and encode
those characters as bytes in the usual way.

Decoding those characters in the usual way and getting nonsense is the
exceptional case, and it must be the application's or user's problem
to decide what to do.  They know where they got the file from and
usually have some idea of what its name should look like.  Python
doesn't, so Python cannot solve it for them.

For that reason, I believe that Python's "normal"/high-level approach
to file handling should treat file names as (human-oriented) text.  Of
course Python should be able to handle bytes straight from the disk,
but most programmers shouldn't have to.

 > And people still want to say posix isn't broken in this regard?

Deal with it, bro'.







Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Chris Barker writes:

 > > The third is to specify the UTF-8 with the surrogate escape error
 > > handler.  This allows non-UTF-8 codes to be loaded into
 > > memory.

Read as bytes and incrementally decode.  If you hit an Exception,
retry from that point.
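
One possible reading of "retry from that point", as a sketch.
decode_piecewise and its parameters are hypothetical, and the cp1251
fallback is just an assumption taken from the encodings mentioned in
this thread.  It is a heuristic: fallback bytes that happen to form
valid UTF-8 will be decoded as UTF-8, and the repeated slicing is for
clarity, not speed:

```python
def decode_piecewise(data, primary="utf-8", fallback="cp1251"):
    """Decode with `primary`; decode each undecodable run with `fallback`."""
    parts = []
    pos = 0
    while pos < len(data):
        try:
            parts.append(data[pos:].decode(primary))
            break
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are offsets into the slice we passed in.
            good_end = pos + exc.start
            bad_end = pos + exc.end
            parts.append(data[pos:good_end].decode(primary))
            parts.append(data[good_end:bad_end].decode(fallback, errors="replace"))
            pos = bad_end            # retry after the bad bytes
    return "".join(parts)
```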

 > Just so I'm clear here -- if you write that back out, encoded as
 > utf-8 -- you'll get the exact same binary blob out as came in?

If and only if there are no changes to the content.

 > I wonder if this would make it hard to preserve byte boundaries,
 > though.

I'm not sure what you mean by "byte boundaries".  If you mean
after concatenation of such objects, yes, the uninterpretable bytes
will be encoded in such a way as to be identifiable as lone bytes;
they won't be interpreted as Unicode characters.

 > By the way, IIUC, you can also use the python latin-1
 > decoder -- anything latin-1 will come through correctly, anything
 > not valid latin-1 will come in as garbage, but if you re-encode
 > with latin-1 the original bytes will be preserved. I think this
 > will also preserve a 1:1 relationship between character count and
 > byte count, which could be handy.

Bad idea, especially for Oleg's use case -- you can't decode those by
codec without reencoding to bytes first.  No point in abandoning
codecs just because there isn't one designed for his use case exactly.
Just read as bytes and decode piecewise in one way or another.  For
Oleg's HTML case, there's a well-understood structure that can be used
to determine retry points and a very few plausible coding systems,
which can be fairly well distinguished by the range of bytes used and
probably nearly perfectly with additional information from the
structure and distribution of apparently decoded characters.



Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Chris Angelico writes:

 > Not sure why 1251,

All of those codes have repertoires that are Cyrillic supersets,
presumably Russian-language content, based on Oleg's top domain.

 > But it's important to note that this is a method of handling junk.
 > It's not a design intention; this is for a situation where I really
 > want to cope with any byte stream and attempt to display it as text.
 > And if I get something that's neither UTF-8 nor CP-1252, I will
 > display it wrongly, and there's nothing can be done about that.

Of course there is.  It just gets more heuristic the more numerous the
potential encodings are.



Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Chris Barker writes:

 > So I write bytes that are encoded one way into a text file that's encoded
 > another way, and expect to be able to read that later?

No, not you.  Crap software does that.  Your MUD server.  Oleg's
favorite web pages with ads, or more likely the ad servers.

 > Not for me (or many other users) -- terminals are sometimes set
 > with ascii-only encoding,

So?  That means you can't handle text files in general, only those
restricted to ASCII.  That's a completely different issue.

 > Python3 supports this case very well. But it does indeed make it
 > hard to work with filenames when you don't know the encoding they
 > are in.

No, it doesn't.  Reasonably handling "text streams" in unknown,
possibly multiple, encodings is just hard.  Python 3 has nothing to do
with it, and Oleg should know that very well.

It's true that code written in Python 2 to handle these issues needs
to be ported to Python 3.  Thing is, Oleg says "another tool" -- any
non-Python-2 tool will need porting of his code too.

 > And apparently that's pretty common -- or common enough that it
 > would be nice for Python to support it well. This trick is how --
 > we'd like the "just pass it around and do path manipulations" case
 > to work with (almost) arbitrary bytes,

It does.  That's what os.path is for.

 > but everything else to work naturally with text (unicode text).

No gloss, please.  It's text, period.  The internal Unicode encoding
is *not exposed*, with a few (important) exceptions such as Han
unification.

 > I think the way to do this is to abstract the path concept, like pathlib
 > does.

You forgot to append the word "well".

 > From my personal experience, non-ascii filenames are much easier to
 > deal with if I use unicode for filenames everywhere (py2). Somehow,
 > I have yet to be bitten by mixed encoding in filenames.

.gov domain?  ASCII-only terminal settings?  It's not "somehow", it's
that you live a sheltered life.

 > So will using a surrogate-escape error handling with pathlib make
 > all this just work?

Not answerable until you define "all this" more precisely.

And that's the big problem with Oleg's complaint, too.  It's not at
all clear what he wants, except that all of his current code should
continue to work in Python 3.  Just like all of us.  The question then
is persuading him that it's worth moving to Python 3 despite the
effort of porting Python-2-specific code.  Maybe he can be persuaded,
maybe not.  Python 2 is a better than average language.



Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Oleg Broytman writes:

 >This is the core of the problem. Python2 favors Unix model but
 > Windows people pay the price. Python3 reverses that

This is certainly not true.  What is true is that Python 3 makes no
attempt to make it easy to write crappy software in the old Unix
style, that breaks when unexpected character encodings are encountered.
Python 3 is designed to make it easier to write reliable software,
even if it will only ever be used on one platform.  Nevertheless, it's
still a reasonable language for writing byte-shoveling software, with
the last piece in place as of the acceptance of PEP 461.

As of that PEP, you can use regexps for tokenizing byte streams and
%-formatting to conveniently produce them.  If you want to treat them
piecewise as character streams with different encodings, you have a
large library of codecs, which provide an incremental decoder
interface.  While AFAIK no codec implements a decode-until-error mode,
that's not all that much of a loss, as many encodings overlap.  Eg, if
you start decoding using a latin-1 codec, decoding the whole document
will succeed, even if it switches to windows-1251 in the meantime.
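
A short sketch of the byte-shoveling tools named above, assuming
Python 3.5 or later for the bytes %-formatting from PEP 461.  The
header line and the sample text are made up:

```python
import codecs
import re

# Tokenizing a byte stream with a bytes regex.
header = b"Content-Length: 42\r\n"
match = re.match(rb"([A-Za-z-]+): (\d+)", header)
name, value = match.group(1), int(match.group(2))

# %-formatting on bytes (PEP 461).
reply = b"%s: %d\r\n" % (name, value + 1)

# latin-1 maps every byte to a code point, so decoding never raises,
# even on bytes that were really written as windows-1251.
cyrillic = "\u0434\u0430".encode("cp1251")
as_latin1 = cyrillic.decode("latin-1")       # mojibake, but no exception

# Incremental decoders accept input in arbitrary chunks.
decoder = codecs.getincrementaldecoder("utf-8")()
chunks = [b"\xd0", b"\xb4\xd0\xb0"]          # U+0434 U+0430 split mid-character
text = "".join(decoder.decode(c) for c in chunks)
text += decoder.decode(b"", final=True)
```

Because the latin-1 decode is lossless, re-encoding as latin-1 gives
back the original bytes, which can then be decoded with the right codec
once it is known.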

Oleg, I gather Russian is your native language.  That's moderately
complicated, I admit.  But the Russians are a distant second to the
Japanese in self-destructive proliferation of incompatible character
coding standards and non-standard variants.  After 24 years of dealing
with the mess that is East Asian encodings (which is even bound up
with the "religion" of Japanese exceptionalism -- some Japanese have
argued that there is a spiritual superiority to Japanese JIS codes!),
I cannot believe you are going to find a better environment for
dealing with these issues than Python 3.




Re: [Python-Dev] Bytes path related questions for Guido

2014-08-25 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > "purge_surrogate_escapes" was the other term that occurred to me.

"purge" suggests removal, not replacement.  That may be useful too.

neutralize_surrogate_escapes(s, remove=False, replacement='\uFFFD')

maybe?  (Of course the remove argument is feature creep, so I'm only
about +0.5 myself.  And the name is long, but I can't think of any
better synonyms for "make safe" in English right now).

 > Either way, my use case is to filter them out when I *don't* want to
 > pass them along to other software, but would prefer the Unicode
 > replacement character to the ASCII question mark created by using the
 > "replace" filter when encoding.

I think it would be preferable to be unicodely correct here by
default, since this is a str -> str function.
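
A possible shape for such a helper, as a sketch only.  The function
does not exist in the standard library; the signature simply follows
the one floated above, and the character range assumes the escapes were
produced by the surrogateescape handler (U+DC80..U+DCFF):

```python
import re

# Lone surrogates produced by the surrogateescape error handler.
_ESCAPED = re.compile("[\udc80-\udcff]")

def neutralize_surrogate_escapes(s, remove=False, replacement="\ufffd"):
    """Replace (or drop) surrogate-escaped bytes in a str; returns a str."""
    new = "" if remove else replacement
    # A function is passed to sub() so that backslashes in `replacement`
    # are not interpreted as regex template escapes.
    return _ESCAPED.sub(lambda m: new, s)

name = b"caf\xe9".decode("utf-8", errors="surrogateescape")
print(ascii(neutralize_surrogate_escapes(name)))
print(neutralize_surrogate_escapes(name, remove=True))
```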


Re: [Python-Dev] Bytes path support

2014-08-25 Thread Stephen J. Turnbull
R. David Murray writes:

 > Also, as has been discussed in this thread previously, any program that
 > deals with filenames is dealing with human readable languages, even
 > if posix itself treats the filenames as bytes.

That's a bit extreme.  I can name two interesting applications
offhand: git's object database and the Coda filesystem's containers.

It's true that for debugging purposes bytestrings representing largish
numbers are readably encoded (in hexadecimal and decimal,
respectively), but they're clearly not "human readable" in the sense
you mean.

Nevertheless, these are the applications that prove your rule.  You
don't need the power of pathlib to conveniently (for the programmer)
and efficiently handle the file structures these programs use.
os.path is plenty.
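
As a sketch of "os.path is plenty": the os.path functions accept bytes
and return bytes, so an object-database style layout can be handled
without ever decoding.  The digest and directory names below are
illustrative, not taken from a real repository:

```python
import os.path

# A git-style object path built purely from bytes.
digest = b"3b18e512dba79e4c8300dd08aeb37f8e728b8dad"
path = os.path.join(b".git", b"objects", digest[:2], digest[2:])

print(path)
print(os.path.basename(path))
print(os.path.splitext(b"archive.tar.gz"))
```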




Re: [Python-Dev] Bytes path support

2014-08-25 Thread Stephen J. Turnbull
Isaac Morland writes:

 > I like your way of putting this - "straight face" indeed.  The third 
 > option really is a hack to allow working around nonsensical situations 
 > (and even the META tag is pretty questionable).  All this complexity 
 > because people can't be bothered to do things properly.

At least in Japan and Russia, doing things "properly" in your sense in
heterogeneous distributed systems is really hard, requiring use of
rather fragile encoding detection heuristics that break at the
slightest whiff of encodings that are unusual in the particular
locale, and in Japan requiring equally fragile transcoding programs
that break on vendor charset variations.  The META "charset" attribute
is useful in those contexts, and the "charset" attribute for external
elements may have been useful in the past as well, although I've never
needed it.

I agree that an environment where "charset" attributes on META and
other elements are needed kinda sucks, but the prerequisite for "doing
things properly" is basically Unicode[1], and that just wasn't going
to happen until at least the 1990s.  To make the transition in less
than several decades would have required a degree of monopoly in
software production that I shudder to contemplate.  Even today there
are programmers around the world grumbling about having to deal with
the Unicode coded character set.


Footnotes: 
[1]  More precisely, a universal coded character set.  TRON code or
MULE code would have done (but yuck!)  ISO 2022 won't do!



Re: [Python-Dev] Bytes path support

2014-08-26 Thread Stephen J. Turnbull
Nikolaus Rath writes:

 > In that case, maybe it'd be nice to also explain why you use the
 > term "bilingual" for codepage based encoding.

Modern computing systems are written in languages which are invariably
based on syntax expressed using ASCII, and provide by default
functionality for expressing dates etc suitable for rendering American
English.  Thus ASCII (ie, American English) is always an available
language.  Code pages provide facilities for rendering one or more
languages sharing a common coded character set, but are
unsuitable for rendering most of the rest of the world's dozens of
language groups (grouping languages by common character set).

Multilingual has come to mean "able to express (almost) any set of
languages in a single text" (see, for example, Emacs's "HELLO" file),
not just "more than two".  So code pages are closer in spirit to
"bilingual" (two of many) than to "multilingual" (all of many).
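
The "two of many" point can be made concrete with Python's own codecs.
This is only a sketch: the choice of cp1251 (a Cyrillic code page) and
the sample strings are assumptions made for illustration and do not come
from the thread.

```python
# cp1251 is a Cyrillic code page; the sample strings are arbitrary.
ascii_text = 'Hello'
russian = '\u041f\u0440\u0438\u0432\u0435\u0442'   # "Privet" in Cyrillic
japanese = '\u3053\u3093\u306b\u3061\u306f'         # "konnichiwa" in hiragana

# ASCII plus one other script fits in the code page, one byte per character.
encoded = (ascii_text + ' ' + russian).encode('cp1251')

# A third script does not fit: the code page has no slots for it.
try:
    japanese.encode('cp1251')
    fits = True
except UnicodeEncodeError:
    fits = False

# A universal coded character set handles all three in a single text.
multilingual = ' '.join([ascii_text, russian, japanese]).encode('utf-8')
```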

It's messy, analogical terminology.  But then, natural language is
messy and analogical.




Re: [Python-Dev] Bytes path related questions for Guido

2014-08-27 Thread Stephen J. Turnbull
Glenn Linderman writes:
 > On 8/26/2014 4:31 AM, MRAB wrote:
 > > On 2014-08-26 03:11, Stephen J. Turnbull wrote:
 > >> Nick Coghlan writes:

 > > How about:
 > >
 > > replace_surrogate_escapes(s, replacement='\uFFFD')
 > >
 > > If you want them removed, just pass an empty string as the
 > > replacement.

That seems better to me (I had too much C for breakfast, I think).

 > And further, replacement could be a vector of 128 characters, to do
 > immediate transcoding,

Using what encoding?  If you knew that much, why didn't you use
(write, if necessary) an appropriate codec?  I can't envision this
being useful.

OTOH, I could see using

replace_surrogate_escapes(s, replacement='&#xFFFD;')

in HTML.  (Actually, probably not; if it makes sense to use Unicode
features you're probably using Unicode as the external encoding, so a
character entity is silly.  But there might be contexts with useful
multicharacter replacements.)

 > or a single character to do wholesale replacement with some
 > gibberish character, or None to remove (or an empty string).

Not None, that means default (which should be the Unicode standard
REPLACEMENT CHARACTER U+FFFD).
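
For readers following along, here is a minimal sketch of what such a
function might look like.  The name and signature follow MRAB's
suggestion quoted above; the body is an assumption made for
illustration, not code from the tracker.  It relies on the fact that
surrogateescape maps undecodable bytes 0x80..0xFF to U+DC80..U+DCFF.

```python
import re

# surrogateescape smuggles bytes 0x80..0xFF as U+DC80..U+DCFF.
_ESCAPED_BYTE = re.compile('[\udc80-\udcff]')


def replace_surrogate_escapes(s, replacement=None):
    """Replace each surrogate-escaped byte in s with `replacement`.

    None means "use the default", U+FFFD REPLACEMENT CHARACTER.
    An empty string removes the escapes; a longer string is allowed.
    """
    if replacement is None:
        replacement = '\ufffd'
    # A callable avoids re treating backslashes in `replacement` specially.
    return _ESCAPED_BYTE.sub(lambda match: replacement, s)


# A Latin-1 byte that is not valid UTF-8 becomes a lone surrogate.
sample = b'caf\xe9'.decode('utf-8', errors='surrogateescape')
cleaned = replace_surrogate_escapes(sample)
removed = replace_surrogate_escapes(sample, '')
```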

Steve


Re: [Python-Dev] Bytes path support

2014-08-27 Thread Stephen J. Turnbull
Glenn Linderman writes:
 > On 8/27/2014 5:16 AM, Nick Coghlan wrote:

 > > Choosing UTF-8 aims to treat formatting text for communication with 
 > > the user as "just a display issue". It's a low impact design that will 
 > > "just work" for a lot of software, but it comes at a price:
 > >
 > >   * because encoding consistency checks are mostly avoided, data in
 > > different encodings may be freely concatenated and passed on to
 > > other applications. Such data is typically not usable by the
 > > receiving application.
 > 
 > I don't believe this is a necessary result of using UTF-8.

No, it's not, but if you're going to do the same kind of checks that
are necessary for transcoding UTF-8 to abstract Unicode, there's no
benefit to using UTF-8 internally, and you lose a lot.  The only
operations that you can do efficiently are concatenation and
iteration.  I've worked with a UTF-8-like internal encoding for 20
years now -- it's a huge cost.

 > Python3 could have evolved to using UTF-8 as its underlying data
 > format, and obtained equal encoding consistency as it has today.

Thank heaven it didn't!

 > One of the choices of Python3, was to retain character indexing as an 
 > underlying arithmetic implementation citing algorithmic speed, but that 
 > is a seldom needed operation,

That simply isn't true.  The negative effects of algorithmic slowness
in Emacsen are visible both as annoying user delays, and as excessive
developer concentration on optimizing a fundamentally insufficient
data structure.

 > and of limited general applicability when considering grapheme
 > clusters.  An iterator based approach can solve both problems,

On the contrary, grapheme clusters are the relatively rare use case in
textual computing, at least currently, that can be optimized for when
necessary.  There's no problem with creating iterators from arrays,
but making an iterator behave like an array ... well, that involves
creating the array.

 > Such solutions could still be implemented as options.

Sure, but the problems to be solved in that implementation are not due
to Python 3's internal representation.  A lot of painstaking (and
possibly hard?) work remains to be done.

 > A high-performance implementation would likely need to be
 > implemented at least partly in C rather than CPython,

That's how Emacs did it, and (a) over the decades it has involved an
inordinate amount of effort compared to rewriting the text-handling
functions for an array, (b) is fragile, and (c) performance sucks in
practice.

Unicode, not UTF-8, is the central component of the solution.  The
various UTFs are application-specific implementations of Unicode.
UTF-8 is an excellent solution for text streams, such as disk files
and network communication.  Fixed-width representations (ISO-8859-1,
UCS-2, UTF-32, PEP-393) are useful for applications of large buffers
that need O(1) "random" access, and can trivially be iterated for
stream applications.
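
The cost difference can be sketched as follows.  The helper name
utf8_offset_of and the sample text are assumptions made for
illustration; the point is only that finding the n-th code point in
UTF-8 needs a scan, while a Python 3 str indexes directly.

```python
def utf8_offset_of(buf: bytes, n: int) -> int:
    """Byte offset of the n-th code point (0-based) in well-formed UTF-8.

    This has to walk the buffer from the start, so it is O(len(buf)).
    """
    seen = -1
    for offset, byte in enumerate(buf):
        # Continuation bytes look like 0b10xxxxxx; anything else
        # starts a new code point.
        if byte & 0xC0 != 0x80:
            seen += 1
            if seen == n:
                return offset
    raise IndexError(n)


text = 'na\u00efve \u65e5\u672c\u8a9e text'   # mixed 1-, 2- and 3-byte characters
encoded = text.encode('utf-8')

# Array-like str: direct indexing by code point.
by_index = text[7]

# UTF-8 buffer: scan to find where code point 7 starts.
start = utf8_offset_of(encoded, 7)
```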

Steve


Re: [Python-Dev] Bytes path related questions for Guido

2014-08-27 Thread Stephen J. Turnbull
Glenn Linderman writes:
 > On 8/27/2014 6:08 PM, Stephen J. Turnbull wrote:
 > > Glenn Linderman writes:

 > >   > And further, replacement could be a vector of 128 characters, to do
 > >   > immediate transcoding,
 > >
 > > Using what encoding?
 > 
 > The vector would contain the transcoding. Each lone surrogate would map 
 > to a character in the vector.

Yes, that's obvious.  The question is where do you get the vector?

 > > If you knew that much, why didn't you use (write, if necessary)
 > > an appropriate codec?  I can't envision this being useful.
 > 
 > If the data format describes its encoding, possibly containing data from 
 > several encodings in various spots, then perhaps it is best read as 
 > binary, and processed as binary until those definitions are found.

Exactly.  That's precisely why bytes have a .decode method.

 > But an alternative would be to read with surrogate escapes, and
 > then when the encoding is determined, to transcode the data.

Not every one-line expression needs to be in the stdlib:

data[start:end] = data[start:end].encode('utf-8',
    errors='surrogateescape').decode('DTRT-now')

Note that you *do* need to know start and end, because of the
possibility of "several encodings", where once you apply this
technique to the whole text, you can't recover the surrogates when you
get the encoding wrong.
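
A runnable variant of that one-liner, as a sketch under stated
assumptions: the data is read as ASCII with surrogateescape, the slice
is known to be KOI8-R, and the offsets are hard-coded for the sample.
Since str is immutable, the slice is spliced back in by concatenation.

```python
# Sample input: an ASCII label followed by six KOI8-R encoded letters.
word = '\u041f\u0440\u0438\u0432\u0435\u0442'
raw = b'name: ' + word.encode('koi8-r') + b'\n'

# Read without knowing the encoding; undecodable bytes become escapes.
data = raw.decode('ascii', errors='surrogateescape')

# Later the encoding of data[start:end] is determined (assumed KOI8-R here).
start, end = 6, 12
fixed = data[start:end].encode('ascii', errors='surrogateescape').decode('koi8-r')

# Only the slice is transcoded; the rest of the text is untouched.
data = data[:start] + fixed + data[end:]
```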

 > Previously, a proposal was made to reverse the surrogate escapes to
 > the original bytes, and then apply the (now known) appropriate
 > codec.

Sure.  And in fact I do this kind of thing all the time in Emacs,
using the decode(encode(slice)) approach.  The only times in 25 years
of working with the insanity of digitized Japanese I've had a use for
anything other than that is when I don't have a round-tripping codec.
In that case I have to preserve the bytes or suffer lossy conversion
anyway, regardless of the method used to reconvert.

But surrogateescape is necessarily round-tripping (maybe with a few
exceptions in Chinese and a very small number in other languages, but
those failures are due to Unicode, not to surrogateescape).

 > There are not appropriate codecs that can convert directly from
 > surrogate escapes to the desired end result.

And there currently cannot be.  codecs are bytes<->str, not str->str.

 > This technique could be used instead, for single-byte, non-escaped
 > encodings.

That's pure theory, not a use case.  We have codecs for all the
encodings with significant numbers of users, and writing a new one
simply isn't that hard.

Steve


[Python-Dev] Cleaning up surrogate escaped strings (was Bytes path related questions for Guido)

2014-08-28 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > The current proposal on the issue tracker is to instead take advantage of
 > the existing error handlers:
 > 
 > def convert_surrogateescape(data, errors='replace'):
 >     return data.encode('utf-8', 'surrogateescape').decode('utf-8',
 >         errors)
 > 
 > That code is short, but semantically dense

And it doesn't implement your original suggestion of replacement with
'?' (and another possibility for history buffs is 0x1A, ASCII SUB).  At
least, AFAICT from the docs there's no way to specify the replacement
character; decoding always uses U+FFFD.  (If I knew how to do that, I
would have suggested this.)
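
The quoted function can be tried directly.  The sample string below is
an assumption made for illustration; the behaviour shown (U+FFFD with
'replace', removal with 'ignore', an exception with 'strict') is what
the standard error handlers do.

```python
def convert_surrogateescape(data, errors='replace'):
    # Turn escapes back into the original bytes, then decode again
    # with an ordinary error handler.
    return data.encode('utf-8', 'surrogateescape').decode('utf-8', errors)


# 'caf' followed by the escape for the undecodable byte 0xE9.
sample = 'caf\udce9'

replaced = convert_surrogateescape(sample)            # uses U+FFFD, never '?'
ignored = convert_surrogateescape(sample, 'ignore')   # drops the bad byte

try:
    convert_surrogateescape(sample, 'strict')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```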

 > (Added bonus: once you're alerted to the possibility, it's trivial
 > to write your own version for existing Python 3 versions.

I'm not sure that's true.  At least, to me that code was obvious -- I
got the exact definition (except for the function name) on the first
try -- but I ruled it out because it didn't implement your suggestion
of replacement with '?', even as an option.

OTOH, I think a lot of the resistance to codec-based solutions is the
misconception that en/decoding streams is expensive, or the
misconception that Python's internal representation of text as an
array of code points (rather than an array of "characters" or
"grapheme clusters") is somehow insufficient for text processing.

Steve


[Python-Dev] surrogatepass - she's a witch, burn 'er! [was: Cleaning up ...]

2014-08-28 Thread Stephen J. Turnbull
In the process of booking up for my other post in this thread, I
noticed the 'surrogatepass' handler.

Is there a real use case for the 'surrogatepass' error handler?  It
seems like a horrible break in the abstraction.  IMHO, if there's a
need, the application should handle this.  Python shouldn't provide
it on encoding as the resulting streams are not Unicode conformant,
nor on decoding UTF-16, as conversion of surrogate pairs is a
requirement of all Unicode versions since about 1995.
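
For reference, a short sketch of what the handler does with the UTF-8
codec.  The choice of U+D800 as the sample is arbitrary.

```python
lone = '\ud800'   # a lone (unpaired) surrogate code point

# Strict UTF-8 refuses to encode it.
try:
    lone.encode('utf-8')
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False

# surrogatepass encodes it anyway, producing non-conformant UTF-8.
passed = lone.encode('utf-8', 'surrogatepass')

# Strict decoding rejects those bytes; surrogatepass accepts them.
try:
    passed.decode('utf-8')
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False

roundtrip = passed.decode('utf-8', 'surrogatepass')
```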

Steve



Re: [Python-Dev] surrogatepass - she's a witch, burn 'er!

2014-08-29 Thread Stephen J. Turnbull
Greg Ewing writes:
 > M.-A. Lemburg wrote:
 > > we needed
 > > a way to make sure that Python 3 also optionally supports working
 > > with lone surrogates in such UTF-8 streams (nowadays called CESU-8:
 > > http://en.wikipedia.org/wiki/CESU-8).

Besides what Greg says, CESU-8 is a UTF, and therefore encodes valid
Unicode.  Speaking imprecisely, CESU-8 is UTF-16 with variable-width
code units (ie, each 16-bit code point is represented using the UTF-8
variable-width representation).[1]
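
That description can be turned into a few lines of Python.  Python has
no built-in CESU-8 codec, so the helper cesu8_encode below is a
hypothetical illustration that follows the description literally: take
the UTF-16 code units and write each one in UTF-8 style.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (illustrative helper, not a stdlib codec)."""
    utf16 = text.encode('utf-16-be')
    out = bytearray()
    for i in range(0, len(utf16), 2):
        unit = int.from_bytes(utf16[i:i + 2], 'big')
        # Surrogate code units need surrogatepass to get a 3-byte form.
        out += chr(unit).encode('utf-8', 'surrogatepass')
    return bytes(out)


astral = '\U00010400'             # a character outside the BMP
as_utf8 = astral.encode('utf-8')  # one 4-byte sequence
as_cesu8 = cesu8_encode(astral)   # two 3-byte sequences, one per surrogate
```

For text inside the Basic Multilingual Plane the two encodings produce
the same bytes; they differ only for characters that need a surrogate
pair in UTF-16.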

I think you are thinking of Markus Kuhn's utf-8b (which I believe is
exactly what is implemented by the surrogateescape handler).

As far as the goal of "working with lone surrogates in such UTF-8
streams", the surrogateescape handler already permits that, and does
so consistently across streams in the sense that lone surrogates in
the UTF-8 stream cannot be mixed with garbage bytes decoded by
surrogateescape in another stream, which produces an unencodable mess.

I still don't see a justification for the surrogatepass handler.  What
applications are producing (not merely passing through) UTF-8-encoded
surrogates these days?


Footnotes: 
[1]  For the curious, it's imprecise because in Unicode code units are
fixed-width by definition.



Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-08-30 Thread Stephen J. Turnbull
[email protected] writes:

 > BTW, it's patented:
 > 
 > http://www.google.de/patents/US6816900

Damn them.  I hope they never get a look at my crontab.




Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-09-02 Thread Stephen J. Turnbull
Antoine Pitrou writes:
 > On Tue, 2 Sep 2014 16:47:35 -0700
 > Glyph Lefkowitz wrote:

 > > As we keep saying, this is not a break in backwards
 > > compatibility, it's a bug fix.
 > 
 > Keeping saying it doesn't make it magically true.

It's not "magically" true, it is "just" true.  What the hardliners
fail to acknowledge is that this is *not a bug in Python, it's a bug
in the whole system*, and *mostly* in the environment.  Changing
Python will not change the environment, and applications will fail,
with unknown consequences.  Saying they "should" fail *right* now is
bogus when you don't even know what those applications are, or what
other security measures may be in place:

Now is better than never.
Although never is often better than *right* now.

On the other hand, I commend the Twisted developers for putting their
values into their code with their reputation on the line.  I hope they
win big with this move!  Shouldn't we all hope for that?

Steve


Re: [Python-Dev] Sad status of Python 3.x buildbots

2014-09-02 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > Sorry, I haven't been a very good maintainer for that buildbot (the main
 > reason it never graduated to the "stable" list). If you send me your public
 > SSH key, I can add it (I think - if not, I can ask Luke to do it).
 > Alternatively, CentOS 6 may exhibit the same problem.

I wonder how many of these buildbots could be maintained by the kind
of folks who show up on core-mentorship asking "how can I help?"

Just a thought -- I wouldn't be surprised if the reaction is universal
horror and the answer is "Are you crazy?  Zero!  Z-E-R-O!!"

And of course most want to write code, not sysadm.



Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-09-03 Thread Stephen J. Turnbull
Guido van Rossum writes:

 > lot: five years ago (when I worked at Google!) it was common to find
 > internal services that required SSL but had a misconfigured certificate,
 > and the only way to access those services was to override the browser
 > complaints. Today (working at Dropbox, a much smaller company!) I don't
 > even remember the last time I had to deal with such a browser complaint --

I would tend to discount your recent experience, then.  Smaller (and
possibly even more important in this fast-developing area, younger)
organizations are a lot more nimble about things like this.

That is not intended to express an opinion about a backport, though.


Re: [Python-Dev] Proposed schedule for 3.4.2

2014-09-08 Thread Stephen J. Turnbull
Glenn Linderman writes:

 > Well, this thread seems to be top-posted so...

Not a good enough reason for me!

 > Why not provide _urlopen_with_scary_keyword_parameter as the 
 > monkey-patch option?
 > 
 > So after the (global to the module) monkeypatch, they would _still_ have 
 > to add the keyword parameter.

I understand the hardline position, though I don't like it: "if you
don't know how to do it yourself, we won't help you do it at all."[1]

But this "defense in depth" suggestion really violates the "consenting
adults" principle.  One warning in the docs and another in the name
itself should be enough, and if it isn't, Mommy should take Jimmy's
RaspberryPi away.

Footnotes: 
[1]  Personally, I think that taken seriously, this reasoning applies
to anybody who uses computers for anything other than programming,
though.  Should anybody be allowed to use computers, given that
they're going to put their personal data on Facebook for their
stalkers to see or inadvertently install botnet software with whatever
warez they are weak for?



Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog

2014-09-11 Thread Stephen J. Turnbull
Jeff Allen writes:

 > A welcome article. One correction should be made, I believe: the area of 
 > code point space used for the smuggling of bytes under PEP-383 is not a 
 > "Unicode Private Use Area", but a portion of the trailing surrogate 
 > range.

Nice catch.  Note that the surrogate range was originally part of the
Private Use Area, but it was carved out with the adoption of UTF-16 in
about 1993.  In practice, I doubt that there are any current
implementations claiming compatibility with Unicode 1.0 (IIRC, UTF-16
was made mandatory in Unicode 1.1).

 > This is a code violation, which I imagine is why 
 > "surrogateescape" is an error handler, not a codec.

Yes.

 > I believe the private use area was considered and rejected for PEP-383. 
 > In an implementation of the type unicode based on UTF-16 (Jython), lone 
 > surrogates preclude a naive use of the platform string library. This is 
 > on my mind at the moment as I'm working several bugs in Jython's unicode 
 > type, and can see why it has been too difficult.

I've always thought that the "right" way to handle the private use
area for "platforms" like Python and Emacs, which may need to use it
for their own purposes (such as "undecodable bytes") but want to
respect its use by applications, is to create an auxiliary table
mapping the private use area to objects describing the characters
represented by the private use code points.  These objects would have
attributes such as external representation for text I/O, glyph (for
GUI display), repr (for TTY display), various Unicode properties, etc.
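
A very rough sketch of such a table, to make the idea concrete.  Every
name here (PrivateUseEntry, PrivateUseRegistry, the attributes) is
hypothetical, and only two of the attributes mentioned above are
modelled; neither Python nor Emacs provides this.

```python
from dataclasses import dataclass

# The BMP Private Use Area.
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF


@dataclass(frozen=True)
class PrivateUseEntry:
    external: bytes   # external representation for text I/O
    display: str      # repr for TTY display


class PrivateUseRegistry:
    """Local registry handing out private use code points on demand."""

    def __init__(self):
        self._by_char = {}
        self._by_external = {}
        self._next = PUA_FIRST

    def register(self, external: bytes, display: str) -> str:
        # Reuse the code point if this external form is already known.
        if external in self._by_external:
            return self._by_external[external]
        if self._next > PUA_LAST:
            raise OverflowError('private use area exhausted')
        char = chr(self._next)
        self._next += 1
        self._by_char[char] = PrivateUseEntry(external, display)
        self._by_external[external] = char
        return char

    def lookup(self, char: str) -> PrivateUseEntry:
        return self._by_char[char]


registry = PrivateUseRegistry()
undecodable = registry.register(b'\xe9', '<byte 0xE9>')
again = registry.register(b'\xe9', '<byte 0xE9>')
```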



Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog

2014-09-12 Thread Stephen J. Turnbull
Jeff Allen writes:

 > Simply having a block "for private use" seems to create an unmanaged 
 > space for conflict,

No.  The uncharted range of human language (including recently-
invented nonsense like "emoticons" and the annual "design a character"
contest run by a newspaper in Taipei, with the grand prize being your
character gets added to the national standard IIRC, but maybe it's
just that newspaper's collection of private space characters) already
contains those conflicts.  Believe me, "private use space, manage it
yourself" was the best they could do.

I've been working with the bureaucratic insanity of the Japanese
national standard -- it took almost 3 decades before every Japanese
citizen could store their names in a computer using government-
approved codes -- and the chaos of the Taiwanese national standard --
which contains hordes of characters with one known use and no known
meaning, many of them duplicates -- for twenty years now.  Neither
approach works as well as Unicode's, despite its design-by-committee
flaws overlaid with national animosities that can flare into
linguicidal vetoes and code-space-stuffing logrolling.

 > reminiscent of the "other 128 characters" in bilingual
 > programming. I wondered if the way to respect use by applications
 > might be to make it private to a particular sub-class of str, idly
 > however.

If I understand your suggestion, that's precisely the intent of PEP
383, to make undecodable bytes in a coded character stream private.
But they need to be in the stream one way or another.  So PEP 383
chose to use a non-Unicode encoding (based on the "lone surrogate"
device invented by Markus Kuhn for utf-8b) to deal with that, and that
does effectively make those elements private to Python (but of course
not in the Unicode sense, as they're not even characters in Unicode).

But I gather the "native" Unicode type in Java doesn't allow you to
use that dodge because it checks for malformed Unicode internally (ie,
at a level not controllable by Jython).  So you have to embed such
stream elements in the space of Unicode characters.  You have the
option of the private space or unallocated (reserved) space.  The
latter seems like asking for trouble, and the only way to avoid it
would be to be prepared to move that data around in case of collision.
But that's precisely what I'm suggesting doing in private space.  Same
issue, either way.  Private space with a local registry seems saner.





Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog

2014-09-15 Thread Stephen J. Turnbull
Jim J. Jewett writes:

 > In terms of best-effort, it is reasonable to treat the smuggled bytes
 > as representing a character outside of your unicode repertoire

I have to disagree.  If you ever end up passing them to something that
validates or tries to reencode them without surrogateescape, BOOM!
These things are the text equivalent of IEEE NaNs.  If all you know
(as in the stdlib) is that you have "generic text", the only fairly
safe things to do with them are (1) delete them, (2) substitute an
appropriate replacement character for them, (3) pass the text
containing them verbatim to other code, and (4) reencode them using
the same codec they were read with.
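
A small sketch of the "BOOM" and of the safe operations (1), (2) and
(4).  The sample filename and the ASCII codec are assumptions made for
illustration.

```python
raw = b'r\xe9sum\xe9.txt'   # Latin-1 bytes, but the reader does not know that
text = raw.decode('ascii', errors='surrogateescape')


def is_escape(ch):
    # surrogateescape uses U+DC80..U+DCFF for undecodable bytes.
    return '\udc80' <= ch <= '\udcff'


# BOOM: re-encoding without surrogateescape fails.
try:
    text.encode('utf-8')
    boom = False
except UnicodeEncodeError:
    boom = True

# (1) delete them, (2) substitute a replacement character.
deleted = ''.join(ch for ch in text if not is_escape(ch))
replaced = ''.join('\ufffd' if is_escape(ch) else ch for ch in text)

# (4) re-encode with the same codec and handler: the bytes round-trip.
same_bytes = text.encode('ascii', errors='surrogateescape')
```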

 > -- so it won't ever match entirely valid strings, except perhaps
 > via a wildcard.  And it should still work for .endswith(<invalid characters>).

Incorrect, I'm pretty sure, unless you know that both texts containing
<invalid characters> were read with the same codec.  Eg,
consider two filenames encoded in ISO Cyrillic and ISO Hebrew, read
with (encoding='ascii', errors='surrogateescape').
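
That case can be sketched concretely.  The filenames are invented for
illustration; the byte 0xE0 is a Cyrillic letter in ISO 8859-5 and a
Hebrew letter in ISO 8859-8.

```python
# The same byte means different characters in the two encodings.
byte = b'\xe0'
cyrillic_char = byte.decode('iso8859_5')
hebrew_char = byte.decode('iso8859_8')

# Two filenames, one from each encoding, read as ASCII with escapes.
cyrillic_name = (b'file-' + byte).decode('ascii', errors='surrogateescape')
hebrew_name = (b'file-' + byte).decode('ascii', errors='surrogateescape')

# The escaped forms are indistinguishable, so .endswith() reports a
# match even though the characters the bytes stood for are different.
suffix = cyrillic_name[-1]
false_match = hebrew_name.endswith(suffix)
```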

Apps that know the semantics of the text may DWIM/DTRT if they want
to, but FWIW-IMHO-YMMV-and-any-other-4-letter-caveat-acronyms-that-
may-apply Python and the stdlib shouldn't try to guess.

Guessing may be unavoidable, of course.


