Re: multiprocessing module and os.close(sys.stdin.fileno())

2009-02-20 Thread Joshua Judson Rosen
Jesse Noller  writes:
>
> On Tue, Feb 17, 2009 at 10:34 PM, Graham Dumpleton
>  wrote:
> > Why is the multiprocessing module, ie., multiprocessing/process.py, in
> > _bootstrap() doing:
> >
> >  os.close(sys.stdin.fileno())
> >
> > rather than:
> >
> >  sys.stdin.close()
> >
> > Technically it is feasible that stdin could have been replaced with
> > something other than a file object, where the replacement doesn't have
> > a fileno() method.
> >
> > In that sort of situation an AttributeError would be raised, which
> > isn't going to be caught as either OSError or ValueError, which is all
> > the code watches out for.
> 
> I don't know why it was implemented that way. File an issue on the
> tracker and assign it to me (jnoller) please.

My guess would be: because it's also possible for sys.stdin to be a
file that's open in read+*write* mode, and for that file to have
pending output buffered (for example, in the case of a socketfile).

There's a general guideline, inherited from C, that one should ensure
that the higher-level close() routine is invoked on a given
file-descriptor in at most *one* process after that descriptor has
passed through a fork(); in the other (probably child) processes, the
lower-level close() routine should be called to avoid a
double-flush--whereby buffered data is flushed out of one process, and
then the *same* buffered data is flushed out of the (other)
child-/parent-process' copy of the file-object.

So, if you call sys.stdin.close() in the child-process in
_bootstrap(), then it could lead to a double-flush corrupting output
somewhere in the application that uses the multiprocessing module.
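
The double-flush described above can be reproduced with a scratch file standing in for stdin. This is only a minimal sketch, assuming a Unix system with os.fork() and Python 3; the function name bytes_after_fork is hypothetical and is not part of multiprocessing.

```python
import os
import tempfile


def bytes_after_fork(child_uses_high_level_close):
    """Fork with unflushed output pending and report what reaches the file."""
    fd, path = tempfile.mkstemp()
    stream = os.fdopen(fd, "w")
    stream.write("hello\n")            # stays in the userspace buffer for now
    pid = os.fork()
    if pid == 0:                       # child: owns a *copy* of that buffer
        try:
            if child_uses_high_level_close:
                stream.close()         # flushes the copied buffer, then closes
            else:
                os.close(stream.fileno())  # closes the descriptor, no flush
        finally:
            os._exit(0)                # leave without running cleanup handlers
    os.waitpid(pid, 0)
    stream.close()                     # parent flushes its own copy
    with open(path) as result:
        data = result.read()
    os.unlink(path)
    return data
```

With the high-level close in the child, the same buffered line should be written twice; with os.close() it should be written once, by the parent only.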

You can expect similar issues with just about /any/ `file-like objects'
that might have `file-like semantics' of buffering data and flushing
it on close, also--because you end up with multiple copies of the same
object in `pre-flush' state, and each copy tries to flush at some point.

As such, I'd recommend against just using .close(); you might use
something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
`else' clause unconditionally calls sys.stdin.close(), then you still
have double-flush problems if someone's set sys.stdin to a file-like
object with output-buffering.
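
One way to write the hasattr-style guard suggested above is sketched here. It is an assumption about how such a check could look, not the code in multiprocessing/process.py, and the helper name close_descriptor_only is hypothetical. It deliberately has no fallback to stream.close(), so a file-like object without a descriptor is left untouched rather than flushed a second time.

```python
import os


def close_descriptor_only(stream):
    """Close the OS-level descriptor behind `stream` without flushing it.

    Returns True if a descriptor was closed, False if there was none to close.
    """
    try:
        fd = stream.fileno()
    except (AttributeError, OSError, ValueError):
        # No fileno() at all, or a file-like object with no real descriptor.
        return False
    try:
        os.close(fd)
    except OSError:
        return False                   # descriptor was already closed
    return True
```

Catching AttributeError covers the case raised in the original question, where the replacement for sys.stdin has no fileno() method at all.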

I guess you could try calling that an `edge-case' and seeing if anyone
screams. It'd be sort-of nice if there was finer granularity in the
file API--maybe if file.close() took a boolean `flush' argument

-- 
Don't be afraid to ask (Lf.((Lx.xx) (Lr.f(rr.
--
http://mail.python.org/mailman/listinfo/python-list


Re: `high overhead of multiple Python processes' (was: Will multithreading make python less popular?)

2009-02-21 Thread Joshua Judson Rosen
Paul Rubin  writes:
>
> Right, that's basically the issue here: the cost of using multiple
> Python processes is unnecessarily high.  If that cost were lower then
> we could more easily use multiple cores to make our apps faster.

What cost is that? At least on unix systems, fork() tends to have
*trivial* overhead in terms of both time and space, because the
processes use lazy copy-on-write memory underneath, so the real costs
of resource-consumption for spawning a new process vs. spawning a new
thread should be comparable.

Are you referring to overhead associated with inter-process
communication? If so, what overhead is that?
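
To make the question concrete, here is a minimal sketch of both pieces: a fork() that inherits the parent's data without an explicit copy step, and a pipe for getting one result back. It assumes a Unix system and Python 3; sum_in_child is a hypothetical name. It says nothing about how the costs compare in practice (CPython's reference counting writes to objects it merely reads, so some copy-on-write pages will be copied anyway).

```python
import os


def sum_in_child(numbers):
    """Fork, let the child sum `numbers`, and read the answer from a pipe."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:                       # child: sees `numbers` without a copy step
        try:
            os.close(read_end)
            os.write(write_end, str(sum(numbers)).encode("ascii"))
        finally:
            os._exit(0)
    os.close(write_end)                # parent keeps only the read end
    with os.fdopen(read_end) as pipe:
        answer = int(pipe.read())
    os.waitpid(pid, 0)
    return answer
```

The only data that crosses the process boundary here is the short result string; the input list is never serialized.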



Re: multiprocessing module and os.close(sys.stdin.fileno())

2009-02-21 Thread Joshua Judson Rosen
Graham Dumpleton  writes:
>
> On Feb 21, 4:20 pm, Joshua Judson Rosen  wrote:
> > Jesse Noller  writes:
> >
> > > On Tue, Feb 17, 2009 at 10:34 PM, Graham Dumpleton
> > >  wrote:
> > > > Why is the multiprocessing module, ie., multiprocessing/process.py, in
> > > > _bootstrap() doing:
> >
> > > >  os.close(sys.stdin.fileno())
> >
> > > > rather than:
> >
> > > >  sys.stdin.close()
> >
> > > > Technically it is feasible that stdin could have been replaced with
> > > > something other than a file object, where the replacement doesn't have
> > > > a fileno() method.
> >
> > > > In that sort of situation an AttributeError would be raised, which
> > > > isn't going to be caught as either OSError or ValueError, which is all
> > > > the code watches out for.
> >
> > > I don't know why it was implemented that way. File an issue on the
> > > tracker and assign it to me (jnoller) please.
> >
> > My guess would be: because it's also possible for sys.stdin to be a
> > file that's open in read+*write* mode, and for that file to have
> > pending output buffered (for example, in the case of a socketfile).
> 
> If you are going to have a file that is writable as well as readable,
> such as a socket, then it is likely that sys.stdout/sys.stderr are going
> to be bound to it at the same time.

Yes.

> If that is the case then one should not be using close() at all

If you mean stdin.close(), then that's what I said :)

> as it will then also close the write side of the pipe and cause
> errors when code subsequently attempts to write to
> sys.stdout/sys.stderr.
>
> 
> In the case of socket you would actually want to use shutdown() to
> close just the input side of the socket.

Sure--but isn't this "you" the /calling/ code that set the whole thing
up? What the /caller/ does with its stdio is up to /him/, and beyond
the scope of the present discourse. I can appreciate a library forking
and then using os.close() on stdio (it protects my files from any I/O
the subprocess might think it wants to do with them), but I think I
might be even more annoyed if it *shutdown my sockets* than if it
caused double-flushes (there's at least a possibility that I could
cope with the double-flushes by just ensuring that *I* flushed before
the fork--not so with socket.shutdown()!)

> What this all means is that what is the appropriate thing to do is
> going to depend on the environment in which the code is used. Thus,
> having the behaviour hard wired a certain way is really bad. There
> perhaps instead should be a way of a user providing a hook function to
> be called to perform any case specific cleanup of stdin, stdout and
> stderr, or otherwise reassign them.

Usually, I'd say that that's what the methods on the passed-in object
are for. Though, as I said--the file-object API is lacking, here :(

> > As such, I'd recommend against just using .close(); you might use
> > something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
> > `else' clause unconditionally calls sys.stdin.close(), then you still
> > have double-flush problems if someone's set sys.stdin to a file-like
> > object with output-buffering.
> >
> > I guess you could try calling that an `edge-case' and seeing if anyone
> > screams. It'd be sort-of nice if there was finer granularity in the
> > file API--maybe if file.close() took a boolean `flush' argument



Re: To unicode or not to unicode

2009-02-21 Thread Joshua Judson Rosen
Ross Ridge  writes:
>
> > It's all about declaring your charset. In Python as well as in your 
> > newsreader. If you don't declare your charset it's ASCII for you - in 
> > Python as well as in your newsreader.
> 
> Except in practice unlike Python, many newsreaders don't assume ASCII.
> The original article displayed fine for me.

Right. Exactly.

Wasn't that exact issue a driving force behind unicode's creation in
the first place? :)

To avoid horrors like this:

   http://en.wikipedia.org/wiki/File:Letter_to_Russia_with_krokozyabry.jpg

... and people getting into arguments on usenet and having to use
rebuttals like "Well, it looked fine to *me*--there's nothing wrong,
we're just using incompatible encodings!"?

But you're right--specifying your charset in usenet-posts is like
using turn-signals.

Can we get back to Python programming, now? :)



Re: datetime.time and midnight

2009-02-22 Thread Joshua Judson Rosen
"D'Arcy J.M. Cain"  writes:
>
> On Sun, 22 Feb 2009 05:20:31 -0200
> "Gabriel Genellina"  wrote:
> > On Sat, 21 Feb 2009 21:55:23 -0200, MRAB
> > wrote:
> > > I think it's because midnight is to the time of day what zero is to
> > > integers, or an empty string is to strings, or an empty container ...
> > 
> > So chr(0) should be False too...
> 
> >>> chr(0)
> '\x00'
> 
> That's not an empty string.  This is like...
> 
> >>> bool([0])
> True
> 
> Now if Python had a char type one might expect the zero char to be
> False but a collection of anything, even if the elements are False, is
> not empty and hence is True.

And, as Halmos said:

   A box that contains a hat and nothing else is not the same thing as
   a hat

:)
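
The box-and-hat distinction can be checked directly in the interpreter. A small sketch (the helper name truthiness is made up for illustration):

```python
def truthiness(values):
    """Map the repr of each value to the result of bool() on it."""
    return {repr(value): bool(value) for value in values}


# An empty string or list is false; a container holding one "false"
# element is not empty, so it is true.
samples = ["", chr(0), [], [0], 0, [[]]]
table = truthiness(samples)
```

Here table["''"] and table["[]"] come out False, while table["'\\x00'"], table["[0]"] and table["[[]]"] come out True.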



choosing a default text-encoding in Python programs (was: To unicode or not to unicode)

2009-02-22 Thread Joshua Judson Rosen
Denis Kasak  writes:
>
> > > Python "assumes" ASCII and if the decoded/encoded text doesn't
> > > fit that encoding it refuses to guess.
> >
> > Which is reasonable given that Python is a programming language where it's
> > better to have more conservative assumption about encodings so errors
> > can be more quickly diagnosed.  A newsreader however is a different
> > beast, where it's better to make a less conservative assumption that's
> > more likely to display messages correctly to the user.  Assuming ISO
> > 8859-1 in the absence of any specified encoding allows the message to be
> > correctly displayed if the character set is either ISO 8859-1 or ASCII.
> > Doing things the "pythonic" way and assuming ASCII only allows such
> > messages to be displayed if ASCII is used.
>
> Reading this paragraph, I've begun thinking that we've misunderstood
> each other. I agree that assuming ISO 8859-1 in the absence of
> specification is a better guess than most (since it's more likely to
> display the message correctly).

So, yeah--back on the subject of programming in Python and supporting
charactersets beyond ASCII:

If you have to make an assumption, I'd really think that it'd be
better to use whatever the host OS's default is, if the host OS has
such a thing--using an assumption of ISO 8859-1 works only in select
regions on unix systems, and may fail even in those select regions on
Windows, Mac OS, and other systems; without the OS considerations,
just the regional constraints are likely to make an ISO-8859-1
assumption result in /incorrect/ results anywhere eastward of central
Europe. Is a user in Russia (or China, or Japan) *really* most likely
to be using ISO 8859-1?
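
As a rough illustration of "use the host's default" in Python, the standard locale module can report the preferred encoding. This is a sketch, assuming Python 3; decode_text is a hypothetical helper, and KOI8-R is used only as an example of a non-Latin-1 encoding a Russian user might have.

```python
import locale


def decode_text(data, encoding=None):
    """Decode bytes, falling back to the host's preferred encoding."""
    if encoding is None:
        # Ask the OS/locale rather than hard-coding ISO 8859-1.
        encoding = locale.getpreferredencoding(False)
    return data.decode(encoding)


russian = "Привет"
koi8_bytes = russian.encode("koi8_r")
wrong_guess = decode_text(koi8_bytes, "iso-8859-1")  # decodes, but to gibberish
right_guess = decode_text(koi8_bytes, "koi8_r")
```

The ISO 8859-1 guess never raises an error, because every byte is a valid Latin-1 character; it simply produces the wrong text, which is arguably worse than a failure.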

As a point of reference, here's what's in the man-pages that I have
installed (note the /complete/ and conspicuous lack of references to
even some notable eastern languages or character-sets, such as Chinese
and Japanese, in the /entire/ ISO-8859 spectrum):

   "ISO 8859 Alphabets
   The full set of ISO 8859 alphabets includes:

   ISO 8859-1    West European languages (Latin-1)
   ISO 8859-2    Central and East European languages (Latin-2)
   ISO 8859-3    Southeast European and miscellaneous languages (Latin-3)
   ISO 8859-4    Scandinavian/Baltic languages (Latin-4)
   ISO 8859-5    Latin/Cyrillic
   ISO 8859-6    Latin/Arabic
   ISO 8859-7    Latin/Greek
   ISO 8859-8    Latin/Hebrew
   ISO 8859-9    Latin-1 modification for Turkish (Latin-5)
   ISO 8859-10   Lappish/Nordic/Eskimo languages (Latin-6)
   ISO 8859-11   Latin/Thai
   ISO 8859-13   Baltic Rim languages (Latin-7)
   ISO 8859-14   Celtic (Latin-8)
   ISO 8859-15   West European languages (Latin-9)
   ISO 8859-16   Romanian (Latin-10)"

   "ISO 8859-1 supports the following languages: Afrikaans, Basque,
   Catalan, Danish, Dutch, English, Faeroese, Finnish, French,
   Galician, German, Icelandic, Irish, Italian, Norwegian,
   Portuguese, Scottish, Spanish, and Swedish."

   "ISO 8859-2 supports the following languages: Albanian, Bosnian,
   Croatian, Czech, English, Finnish, German, Hungarian, Irish, Polish,
   Slovak, Slovenian and Sorbian."

   "ISO 8859-7 encodes the characters used in modern monotonic
   Greek."

   "ISO 8859-9, also known as the "Latin Alphabet No. 5", encodes
   the characters used in Turkish."

   "ISO 8859-15 supports the following languages: Albanian, Basque, Breton,
   Catalan, Danish, Dutch, English, Estonian, Faroese, Finnish, French,
   Frisian, Galician, German, Greenlandic, Icelandic, Irish Gaelic,
   Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic,
   Scottish Gaelic, Spanish, and Swedish."

   "ISO 8859-16 supports the following languages: Albanian, Bosnian,
   Croatian, English, Finnish, German, Hungarian, Irish, Polish, Romanian,
   Slovenian and Serbian."



Re: datetime.time and midnight

2009-02-22 Thread Joshua Judson Rosen
Ethan Furman  writes:
>
> [...]partly because midnight is in fact a time of day, and not a lack of
> a time of day, I do indeed expect it to be True.

While it's not a lack of `time of day', it /is/ a lack of /elapsed/
time in the day ;)

Just as if you were using a plain integer or float to count the time :)
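
The "plain integer" view of a time of day can be sketched like this (seconds_since_midnight is a hypothetical helper, not part of the datetime module):

```python
from datetime import time


def seconds_since_midnight(t):
    """Elapsed whole seconds in the day for a datetime.time value."""
    return t.hour * 3600 + t.minute * 60 + t.second


midnight = time(0, 0)
noon = time(12, 0)
```

Counted this way, midnight is 0 and therefore false, exactly as any other zero integer would be. Note that this describes the integer, not the time object itself: as far as I know, Python 3.5 and later treat every datetime.time value as true.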



Re: Python AppStore / Marketplace

2009-03-26 Thread Joshua Judson Rosen
Marcel Luethi  writes:
>
> Now I'm standing here, having this great idea for a brand new rocking
> app...
> But where do I start? I want it to be multi-platform (Linux, Mac OS X,
> Windows). It should be easy to install and upgrade. It should be self-
> contained, independent of an already installed Python. And of course -
> the world should be able to find it!
[...]
> Using my iPhone I suddenly realize how easy it is to find applications
> in Apple's AppStore. How easy and fast it is to install or de-install
> an app. My iPhone even checks in the background if there is an upgrade
> which could be installed painlessly.
[...]
> Unfortunately there's nothing like this in the Python world...

Sure there is: it's called "The Internet".

It's just that it supports more platforms (not just the iPhone), and
it's been growing for the past 30+ years (whereas the iPhone app-store
has been growing for..., what, a single year?).

What do you think the iPhone app-store will look like 30 years from
now, with 3 decades' worth of apps and 23 different iPhone models to
support? Actually, since we're just talking about a
distribution-system for apps written in Python (as if users care what
tools the app-developers used to develop their apps?), I'd be content
to hear your projection for a mere 18 years out (the amount of time
for which Python apps have been in production), or even 10 years
(which takes us back to Python 1.5).

:)



Re: Hash of None varies per-machine

2009-04-03 Thread Joshua Judson Rosen
Paul Rubin  writes:
>
> ben.tay...@email.com writes:
> > 1. Is it correct that if you hash two things that are not equal they
> > might give you the same hash value?
> 
> Yes, hashes are 32 bit numbers and there are far more than 2**32
> possible Python values (think of long ints), so obviously there must
> be multiple values that hash to the same slot.

This is not true. CPython integers, at least up through the 2.x
series, are implemented as C *long integers*; on some platforms, this
means that they're 32 bits long. But on an increasing number of
platforms, long integers are 64 bits long.
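
Both points (the hash width depends on the build, and unequal values can share a hash) can be checked from Python itself. This sketch assumes CPython 3.2 or later, where sys.hash_info exists; it does not apply to the 2.x series under discussion, and colliding_pair is a made-up name.

```python
import sys


def colliding_pair():
    """Return two unequal ints that CPython hashes to the same value."""
    modulus = sys.hash_info.modulus    # CPython reduces int hashes modulo this
    return 0, modulus


a, b = colliding_pair()
hash_width = sys.hash_info.width      # 32 or 64, depending on the build
```

Since a != b while hash(a) == hash(b), this is one concrete instance of the collision the first question asks about.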

But, more specifically, consider the following:

> > 2. Should the hash of None vary per-machine? 
> 
> If the docs say this shouldn't happen, then it's a bug.  Otherwise,
> it should probably be considered ok.
> 
> > 3. Given that presumably not all things can be hashed (since the
> > documentation description of hash() says it gives you the hash of the
> > object "if it can be hashed"), should None be hashable?
> 
> Yes, anything that can be used as a dict key (basically all immutable
> values with equality comparison) should be hashable.

My recollection is that what you're seeing here is that, when hash()
doesn't have any `proper value' to use other than object-identity, it
just returns the result of id(). And id() is documented as:

 Return the "identity" of an object. This is an integer (or long
 integer) which is guaranteed to be unique and constant for this
 object during its lifetime. Two objects with non-overlapping
 lifetimes may have the same id() value. (Implementation note:
 this is the address of the object.)

So, not only is the return-value from id() (and hash(), if there's not
actually a __hash__ method defined) non-portable between different
machines, it's not even necessarily portable between two *runs* on the
*same* machine.
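
If a value that is stable across machines and runs is what's needed, one option is to hash a canonical byte representation with hashlib instead of relying on hash(). The following is only a sketch under that assumption; Plain and stable_digest are hypothetical names, and it only works for values whose repr() is itself stable (that is, does not embed a memory address).

```python
import hashlib


class Plain(object):
    """No __hash__ or __eq__ defined, so hashing falls back to identity."""


def stable_digest(value):
    """Hash the repr() of a value; the same on every machine and every run."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


first, second = Plain(), Plain()
```

hash(first) is constant for the lifetime of that one object but tells you nothing about any other run, whereas stable_digest(None) depends only on the text "None".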

In practice, your OS will probably start each new process with the
same virtual memory-address range, and a given *build* of Python will
probably initialise the portion of its memory-segment leading up to
the None-object the same way each time, but



Re: Seeking old post on developers who like IDEs vs developers who like simple languages

2009-05-19 Thread Joshua Judson Rosen
Ulrich Eckhardt  writes:
> 
> That said, an IDE that provides auto-completion (e.g. that gives you a list
> of available class members) is a good thing in Java, because you don't have
> to browse the documentation as often.

While I find at least some types of autocompletion to be laudable
features, your rationale (if I'm reading it right)..., frankly,
frightens me.

You may find these interesting reading, if you haven't encountered
them previously:


http://www.cincomsmalltalk.com/userblogs/buck/blogView?showComments=true&entry=3296933922

http://www.charlespetzold.com/etc/DoesVisualStudioRotTheMind.html

Steve may find them interesting, also--I don't think that either of
them is the particular post for which he's searching, but they both
are quite related to it.

Actually: thank you, Steve, for indirectly reminding me of
`Does Visual Studio Rot the Mind'--I'd forgotten all about it.

> With Python, that is impossible because there are no types bound to
> parameters, so any type that fits is allowed (duck typing).

That does not prevent a tool from showing you the argument-list;
untyped as it is, the number, sequence, and *names* of the arguments
*are* available. And it could be argued (especially with Python's duck
typing) that the parameters' *names* are (or at least *should be*) far
more illustrative of their purpose than their type-specifications
would be.
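
As an illustration of what a tool can recover without any type declarations, here is a small sketch using the standard inspect module. It assumes Python 3.3 or later, where inspect.signature is available; parameter_names and resize are hypothetical names made up for the example.

```python
import inspect


def parameter_names(func):
    """List a callable's parameter names in declaration order."""
    return list(inspect.signature(func).parameters)


def resize(image, width, height, keep_aspect=True):
    """Hypothetical function, used only to have something to inspect."""
    return (image, width, height, keep_aspect)
```

parameter_names(resize) gives ['image', 'width', 'height', 'keep_aspect'], which is roughly what an editor would need in order to show an argument list.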



Re: Overriding iadd for dictionary like objects

2009-08-28 Thread Joshua Judson Rosen
Robert Kern  writes:
>
> On 2009-08-28 16:42 PM, Terry Reedy wrote:
> > Carl Banks wrote:
> >
> > > I don't think it needs a syntax for that, but I'm not so sure a
> > > method to modify a value in place with a single key lookup
> > > wouldn't occasionally be useful.
> >
> > Augmented assignment does that.
> 
> No, it uses one __getitem__ and one __setitem__ thus two key lookups.

Apparently you're defining "key lookup" some other way than as
`what __getitem__ does'.

What exactly does "key lookup" mean to you?

I've always understood it as `retrieving the value associated with a
key', which obviously isn't required for assignment--otherwise it
wouldn't be possible to add new keys to a mapping.



Re: Overriding iadd for dictionary like objects

2009-08-28 Thread Joshua Judson Rosen
Carl Banks  writes:
>
> On Aug 28, 2:42 pm, Terry Reedy  wrote:
>
> > Carl Banks wrote:
> > > I don't think it needs a syntax for that, but I'm not so sure a method
> > > to modify a value in place with a single key lookup wouldn't
> > > occasionally be useful.
> >
> > Augmented assignment does that.
> 
> Internally uses two lookups, one for getting, and one for setting.
>
> I think this is unavoidable given Python's semantics.  Look at
> the disassembly:
> 
> 
> >>> def x():
> ...     d['a'] += 1
> ...
> >>> dis.dis(x)
>   2           0 LOAD_GLOBAL              0 (d)
>               3 LOAD_CONST               1 ('a')
>               6 DUP_TOPX                 2
>               9 BINARY_SUBSCR

OK, there's one lookup, but...

>              10 LOAD_CONST               2 (1)
>              13 INPLACE_ADD
>              14 ROT_THREE
>              15 STORE_SUBSCR
>              16 LOAD_CONST               0 (None)
>              19 RETURN_VALUE

... I don't see anything in there that retrieves the value a second time
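
The same thing can be observed from the outside by counting calls. A minimal sketch, assuming Python 3; CountingDict is a hypothetical class written only for this demonstration:

```python
class CountingDict(dict):
    """A dict that records how often it is read and written by subscript."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.gets = 0
        self.sets = 0

    def __getitem__(self, key):
        self.gets += 1
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.sets += 1
        super().__setitem__(key, value)


d = CountingDict(a=1)
d["a"] += 1                            # one __getitem__, then one __setitem__
```

After the augmented assignment, d.gets is 1 and d.sets is 1: the value is retrieved once and stored once. Whether the store also counts as a "lookup" is the terminology question at issue here.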

> > > As a workaround, if lookups are expensive,
> >
> > But they are not. Because (C)Python is heavily based on dict name lookup
> > for builtins and global names and attributes, as well as overt dict
> > lookup, much effort has gone into optimizing dict lookup.
> 
> The actual lookup algorithm Python dicts use is well-optimized, yes,
> but the dict could contain keys that have expensive comparison and
> hash-code calculation, in which case lookup is going to be slow.

I'll let the originator correct me if I've made a mistake, but I read
"lookup" as actually meaning "lookup", not "value-comparison".

At least in part because the question, as it was posed, specifically
related to a wrapper-class (providing a mapping ("dict like") interface)
around a database of some sort other than Python's dict class per se.

How do the details of Python's native dict-type's internal (hashtable)
algorithm matter when they're explicitly /not/ being used?



Non-deterministic computing (was: What python can NOT do?)

2009-08-30 Thread Joshua Judson Rosen
Steven D'Aprano  writes:
>
> On Sat, 29 Aug 2009 05:37:34 +0200, Tomasz Rola wrote:
> 
> > My private list of things that when implemented in Python would be
> > ugly to the point of calling it difficult:
> > 
> > 1. AMB operator - my very favourite. In one sentence, either language
> > allows one to do it easily or one would not want to do it (in an ugly
> > way).
> > 
> > http://www.randomhacks.net/articles/2005/10/11/amb-operator
> 
> 
> Fascinating, but I don't exactly see how that's actually *useful*. It 
> strikes me of combining all the benefits of COME FROM with the potential 
> performance of Bogosort, but maybe I'm being harsh. 

There's a chapter on this (non-deterministic computing in general,
and `amb' in particular) in Abelson's & Sussman's book,
`Structure and Interpretation of Computer Programs':

http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-28.html#%_sec_4.3

It's an interesting read (the chapter, as well as the rest of the book).

> On the other hand, it sounds rather like Prolog-like declarative 
> programming. I fear that, like Prolog, it risks massive performance 
> degradation if you don't apply the constraints in the right order.

One of the classic arguments in the other direction is that
imperative programming (as is common in Python ;)) risks
massive *incorrect results* if you don't apply the side-effects
in the right order :)
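
For readers who haven't met `amb' before, its effect (though not its mechanism) can be approximated in Python by exhaustive search. This is only a brute-force sketch, assuming Python 3: a real amb, as presented in SICP, backtracks through the running program, whereas this version simply tries every combination in order. The function signature is made up for the example.

```python
from itertools import product


def amb(*choices, require):
    """Return the first combination of choices that satisfies `require`."""
    for candidate in product(*choices):
        if require(*candidate):
            return candidate
    raise ValueError("no combination satisfies the constraint")


# "Pick a from the first list and b from the second such that a * b == 8."
pair = amb([1, 2, 3], [4, 5, 6], require=lambda a, b: a * b == 8)
```

Because every combination is generated before the constraint is tested, the performance worry raised above applies in full to this sketch.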



Re: match braces?

2009-09-03 Thread Joshua Judson Rosen
Grant Edwards  writes:
>
> On 2009-09-03, Ben Finney  wrote:
> > Tim Chase  writes:
> > >
> > > Any editor worth its salt will offer indentation-based folding (I know
> > > vim does, and I would be astonished if emacs didn't).
> >
> > Emacs calls that ‘hide/show’, and the ‘hs-minor-mode’ can
> > be enabled for any buffer (and can thus of course be automatically
> > enabled on defined conditions, e.g. whenever a Python buffer is
> > detected).
> >
> > Learn more at <http://www.emacswiki.org/cgi-bin/wiki/HideShow>.
> 
> There's only one problem: it doesn't work out-of-the-box.  At
> least it never has for me.  The only thing it knows how to hide
> is the entire body of a function definition.  I never want to
> do that.  What I want to do is hide/unhide the blocks within
> if/then/else or loops so that the control flow is clearer.
> Emacs hs-minor-mode won't do that (at least not for me).

Hm. I wasn't aware of hs-minor-mode. But I've often used
set-selective-display (C-x $), which hides all lines that are indented
more than ARG columns (and un-hides them if you don't give an argument).

But to fulfill the specific request of moving up to the top of a given
block, there's also a `python-beginning-of-block' command in
python-mode (bound to C-c C-u). If you set the mark (C-SPC) before you
do python-beginning-of-block, then you can use `C-x C-x' or `C-u SPC'
to jump back where you were.

