date:20131206



Hi Chris (and Michael),

On 06/12/13 15:51, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 4:16 PM, Michael Torrie wrote:
>> On 12/05/2013 07:34 PM, Garthy wrote:
>>> - My fallback if I can't do this is to implement each instance in a
>>> dedicated *process* rather than per-thread. However, there is a
>>> significant cost to doing this that I would rather not incur.
>>
>> What cost is this? Are you speaking of cost in terms of what you the
>> programmer would have to do, cost in terms of setting things up and
>> communicating with the process, or the cost of creating a process vs a
>> thread?  If it's the last, on most modern OS's (particularly Linux),
>> it's really not that expensive.  On Linux the cost of threads and
>> processes are nearly the same.
>
> If you want my guess, the cost of going to multiple processes would be
> to do with passing data back and forth between them. Why is Python
> being embedded in another application? Sounds like there's data moving
> from C to Python to C, ergo breaking that into separate processes
> means lots of IPC.


An excellent guess. :)

One characteristic of the application I am looking to embed Python in is
that there are a fairly large number of calls from the app into Python,
and for each, generally many back to the app. There is a healthy amount
of data flowing back and forth each time. An implementation with an
inter-process roundtrip each time (with a different scripting language)
proved to be too limiting, and needlessly complicated the design of the
app. As a result, more development effort has gone into components that
work well across thread boundaries than across process boundaries.

I am confident at this point I could pull things off with a Python
one-interpreter-per-process design, but I'd then need to revisit the IPC
side of things and put up with the limitations that arise. Additionally,
the IPC code has had less attention and isn't as capable. I know roughly
how I'd proceed if I went with this approach, but it is the less
desirable outcome of the two.

However, if I could manage to get a thread-based solution going, I can
put the effort where it is most productive, namely into making sure that
the thread-based solution works best. This is my preferred outcome and
current goal. :)


Cheers,
Garth


--
https://mail.python.org/mailman/listinfo/python-list

Re: Embedding multiple interpreters

Hi all,

A small update here:

On 06/12/13 13:04, Garthy wrote:
> [1] It presently crashes in Py_EndInterpreter() after running through a
> series of tests during the shutdown of the 32nd interpreter I create. I
> don't know if this is significant, but the tests pass for the first 31
> interpreters.

This turned out to be a red herring, so please ignore this bit. I had a 
code path that failed to call Py_INCREF on Py_None which was held in a 
PyObject that was later Py_DECREF'd. This had some interesting 
consequences, and not surprisingly led to some double-frees. ;)

I was able to get much further with this fix, although I'm still having 
some trouble getting multiple interpreters running together 
simultaneously. Advice and thoughts still very much welcomed on the rest 
of the email. :)

Cheers,
Garth

Re: Embedding multiple interpreters

Hi Gregory,

On 06/12/13 17:28, Gregory Ewing wrote:
> Garthy wrote:
>> I am running into problems when using multiple interpreters [1] and I
>> am presently trying to track down these issues. Can anyone familiar
>> with the process of embedding multiple interpreters have a skim of the
>> details below and let me know of any obvious problems?
>
> As far as I know, multiple interpreters in one process is
> not really supported. There *seems* to be partial support for
> it in the code, but there is no way to fully isolate them
> from each other.

That's not good to hear.

Is there anything confirming that it's an incomplete API insofar as 
multiple interpreters are concerned? Wouldn't this carry consequences 
for say mod_wsgi, which also does this?

> Why do you think you need multiple interpreters, as opposed
> to one interpreter with multiple threads? If you're trying
> to sandbox the threads from each other and/or from the rest
> of the system, be aware that it's extremely difficult to
> securely sandbox Python code. You'd be much safer to run
> each one in its own process and rely on OS-level protections.

To allow each script to run in its own environment, with minimal chance 
of inadvertent interaction between the environments, whilst allowing 
each script the ability to stall on conditions that will be later met by 
another thread supplying the information, and to fit in with existing 
infrastructure.

>> - I don't need to share objects between interpreters (if it is even
>> possible- I don't know).
>
> The hard part is *not* sharing objects between interpreters.
> If nothing else, all the builtin type objects, constants, etc.
> will be shared.

I understand. To clarify: I do not need to pass any Python objects I 
create or receive back and forth between different interpreters. I can 
imagine some environments would not react well to this.

Cheers,
Garth

PS. Apologies if any of these messages come through more than once. Most 
lists that I've posted to set reply-to meaning a normal reply can be 
used, but python-list does not seem to. The replies I have sent manually 
to python-list@python.org instead don't seem to have appeared. I'm not 
quite sure what is happening- apologies for any blundering around on my 
part trying to figure it out.


Re: Embedding multiple interpreters

On Fri, Dec 6, 2013 at 6:59 PM, Garthy wrote:
> Hi Chris (and Michael),

Hehe. People often say that to me IRL, addressing me and my brother.
But he isn't on python-list, so you clearly mean Michael Torrie, yet
my brain still automatically thought you were addressing Michael
Angelico :)

> To allow each script to run in its own environment, with minimal chance of
> inadvertent interaction between the environments, whilst allowing each
> script the ability to stall on conditions that will be later met by another
> thread supplying the information, and to fit in with existing
> infrastructure.

Are the scripts written cooperatively, or must you isolate one from
another? If you need to isolate them for trust reasons, then there's
only one solution, and that's separate processes with completely
separate interpreters. But if you're prepared to accept that one
thread of execution is capable of mangling another's state, things are
a lot easier. You can protect against *inadvertent* interaction much
more easily than malicious interference. It may be that you can get
away with simply running multiple threads in one interpreter;
obviously that would have problems if you need more than one CPU core
between them all (hello GIL), but that would really be your first
limit. One thread could fiddle with __builtins__ or a standard module
and thus harass another thread, but you would know if that's what's
going on.
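
To illustrate the point about shared state, here is a minimal sketch of two "scripts" running as threads in one interpreter. It assumes Python 3, where the module is called builtins (it is __builtin__ in Python 2); the names script_a, script_b and shared_flag are made up for the example.

```python
import builtins
import threading

def script_a():
    # Hypothetical "script" that leaves a name behind in the shared builtins.
    builtins.shared_flag = "set by script A"

def script_b(seen):
    # A second "script" in another thread sees what script A left behind.
    seen.append(getattr(builtins, "shared_flag", None))

seen = []
for target, args in ((script_a, ()), (script_b, (seen,))):
    worker = threading.Thread(target=target, args=args)
    worker.start()
    worker.join()   # run the two scripts one after the other

del builtins.shared_flag   # tidy up the shared state again
```

Nothing stops script_b from reading or overwriting the name, which is why this setup only guards against inadvertent interaction when scripts agree not to touch shared modules.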

ChrisA

squeeze out some performance

2013-12-06 Thread Robert Voigtländer

Hi,

I'm trying to squeeze some more performance out of the code pasted at the link below.
http://pastebin.com/gMnqprST

The code will be used to continuously analyze sonar sensor data. I set this up 
to calculate all coordinates in a sonar cone without heavy use of trigonometry 
(assuming that this way is faster in the end).

I optimized as much as I could. Maybe one of you has another bright idea to 
squeeze out a bit more?

Thanks
Robert

Re: using ffmpeg command line with python's subprocess module


On 06/12/2013 06:23, iMath wrote:

Dearest iMath, wouldst thou be kind enough to partake of obtaining some 
type of email client that dost not sendeth double spaced data into this 
most illustrious of mailing lists/newsgroups.  Thanking thee for thine 
participation in my most humble of requests.  I do remain your most 
obedient servant.


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: Embedding multiple interpreters

On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
> PS. Apologies if any of these messages come through more than once. Most
> lists that I've posted to set reply-to meaning a normal reply can be used,
> but python-list does not seem to. The replies I have sent manually to
> python-list@python.org instead don't seem to have appeared. I'm not quite
> sure what is happening- apologies for any blundering around on my part
> trying to figure it out.

They are coming through more than once. If you're subscribed to the
list, sending to python-list@python.org should be all you need to do -
where else are they going?

ChrisA

Re: squeeze out some performance

2013-12-06 Thread Jeremy Sanders

Robert Voigtländer wrote:

> I try to squeeze out some performance of the code pasted on the link
> below. http://pastebin.com/gMnqprST
> 
> The code will be used to continuously analyze sonar sensor data. I set
> this up to calculate all coordinates in a sonar cone without heavy use of
> trigonometry (assuming that this way is faster in the end).
> 
> I optimized as much as I could. Maybe one of you has another bright idea
> to squeeze out a bit more?

This sort of code is probably hard to make much faster in pure Python. You
could try profiling it to see where the hot spots are. Perhaps the choice of
arrays or sets has some impact on speed.

One idea would be to use something like Cython to compile your Python code
to an extension module, with some hints about the types of the various values.

I would go down the geometry route. If you can restate your problem in terms
of geometry, it might be possible to replace all that code with a few NumPy
array operations.

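e.g. for finding pixels in a circle of radius 50:

import numpy as np
# squared distance of every pixel in a 100x100 grid from the centre (50, 50)
radiussqd = np.fromfunction(lambda y, x: (y-50)**2 + (x-50)**2, (100, 100))
all_y, all_x = np.indices((100, 100))
# the boolean mask selects the pixels strictly inside the circle
inside = radiussqd < 50**2
yvals, xvals = all_y[inside], all_x[inside]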
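
The same masking idea might extend to a cone. The following is only a sketch under assumptions, not the pastebin code: it treats the sonar cone as a flat sector on a grid, assumes a half-angle below 90 degrees, and uses illustrative names (cone_cells, apex, heading_deg, half_angle_deg, max_range). The only trigonometry is done once for the cone axis; each cell is then tested with a dot product.

```python
import numpy as np

def cone_cells(shape, apex, heading_deg, half_angle_deg, max_range):
    """Return (ys, xs) index arrays for grid cells inside a 2-D cone (sector).

    Assumes half_angle_deg < 90. All parameter names are illustrative.
    """
    all_y, all_x = np.indices(shape)
    dy = all_y - apex[0]
    dx = all_x - apex[1]
    dist = np.hypot(dx, dy)
    # Unit vector along the cone axis: the only trigonometry, done once.
    heading = np.radians(heading_deg)
    hx, hy = np.cos(heading), np.sin(heading)
    # A cell lies within the half-angle when the projection of its offset
    # onto the axis is at least dist * cos(half_angle).
    cos_half = np.cos(np.radians(half_angle_deg))
    within_angle = dx * hx + dy * hy >= dist * cos_half
    mask = (dist <= max_range) & within_angle
    return all_y[mask], all_x[mask]

ys, xs = cone_cells((100, 100), apex=(50, 50), heading_deg=0.0,
                    half_angle_deg=15.0, max_range=30.0)
```

Whether this beats the hand-optimised loop would need measuring on the real data.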

Jeremy


ANN: eGenix PyRun - One file Python Runtime 1.3.1

2013-12-06 Thread eGenix Team: M.-A. Lemburg



ANNOUNCING

 eGenix PyRun - One file Python Runtime

Version 1.3.1


 An easy-to-use single file relocatable Python run-time -
   available for Linux, Mac OS X and Unix platforms,
  with support for Python 2.5, 2.6 and 2.7


This announcement is also available on our web-site for online reading:
http://www.egenix.com/company/news/eGenix-PyRun-1.3.0-GA.html



INTRODUCTION

Our new eGenix PyRun combines a Python interpreter with an almost
complete Python standard library into a single easy-to-use executable,
that does not require a system wide installation and is fully
relocatable.

eGenix PyRun's executable only needs 11MB, but still supports most
Python applications and scripts - and it can be further compressed to
just 3-4MB using upx.

Compared to a regular Python installation of typically 100MB on disk,
this makes eGenix PyRun ideal for applications and scripts that need
to be distributed to many target machines, client installations or
customers.

It makes "installing" Python on a Unix based system as simple as
copying a single file.

We have been using the product internally in our mxODBC Connect Server
since 2008 with great success and have now extracted it into a
stand-alone open-source product.

We provide both the source archive to build your own eGenix PyRun, as
well as pre-compiled binaries for Linux, FreeBSD and Mac OS X, as 32-
and 64-bit versions. The binaries can be downloaded manually, or you
can let our automatic install script install-pyrun take care of the
installation: ./install-pyrun dir and you're done.

Please see the product page for more details:

http://www.egenix.com/products/python/PyRun/



NEWS

This is a new minor release of eGenix PyRun, which comes with updates
to the latest Python releases and includes a number of compatibility
enhancements.

New Features


 * Upgraded eGenix PyRun to work with and use Python 2.7.6 by
   default.

 * Upgraded eGenix PyRun to use Python 2.6.9 as default
   Python 2.6 version.

install-pyrun Quick Installation Enhancements
---------------------------------------------

Since version 1.1.0, eGenix PyRun includes a shell script called
install-pyrun, which greatly simplifies installation of eGenix
PyRun. It works much like the virtualenv shell script used for
creating new virtual environments (except that there's nothing virtual
about PyRun environments).

https://downloads.egenix.com/python/install-pyrun

With the script, an eGenix PyRun installation is as simple as running:

./install-pyrun targetdir

We have updated this script since the last release:

 * install-pyrun now defaults to installing setuptools 1.4.2
   and pip 1.4.1 when looking for local downloads of these tools.

For a complete list of changes, please see the eGenix PyRun Changelog:

http://www.egenix.com/products/python/PyRun/changelog.html

For a list of changes in the 1.3.0 minor release, please read the
eGenix PyRun 1.3.0 announcement:

http://www.egenix.com/company/news/eGenix-PyRun-1.3.0-GA.html


Presentation at EuroPython 2012
-------------------------------

Marc-André Lemburg, CEO of eGenix, gave a presentation about eGenix
PyRun at EuroPython 2012 last year. The talk video as well as the
slides are available on our website:

http://www.egenix.com/library/presentations/EuroPython2012-eGenix-PyRun/




LICENSE

eGenix PyRun is distributed under the eGenix.com Public License 1.1.0
which is an Open Source license similar to the Python license. You can
use eGenix PyRun in both commercial and non-commercial settings
without fee or charge.

Please see our license page for more details:

http://www.egenix.com/products/python/PyRun/license.html

The package comes with full source code.



DOWNLOADS

The download archives and instructions for installing eGenix PyRun can
be found at:

http://www.egenix.com/products/python/PyRun/

As always, we are providing pre-built binaries for all common
platforms: Linux 32/64-bit, FreeBSD 32/64-bit, Mac OS X 32/64-bit.
Source code archives are available for installation on
other platforms, such as Solaris, AIX, HP-UX, etc.

___

SUPPORT

Commercial support for this product is available from eGenix.com.
Please see

http://www.egenix.com/services/support/

for details about our support offerings.



MORE INFORMATION

For more information about eGenix PyRun, licensing and download
instructions, please visit our w

Re: Embedding multiple interpreters

Hi Chris,

On 06/12/13 19:03, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 6:59 PM, Garthy wrote:
>> Hi Chris (and Michael),
>
> Hehe. People often say that to me IRL, addressing me and my brother.
> But he isn't on python-list, so you clearly mean Michael Torrie, yet
> my brain still automatically thought you were addressing Michael
> Angelico :)

These strange coincidences happen from time to time- it's entertaining 
when they do. :)

>> To allow each script to run in its own environment, with minimal chance of
>> inadvertent interaction between the environments, whilst allowing each
>> script the ability to stall on conditions that will be later met by another
>> thread supplying the information, and to fit in with existing
>> infrastructure.
>
> Are the scripts written cooperatively, or must you isolate one from
> another? If you need to isolate them for trust reasons, then there's
> only one solution, and that's separate processes with completely
> separate interpreters. But if you're prepared to accept that one
> thread of execution is capable of mangling another's state, things are
> a lot easier. You can protect against *inadvertent* interaction much
> more easily than malicious interference. It may be that you can get
> away with simply running multiple threads in one interpreter;
> obviously that would have problems if you need more than one CPU core
> between them all (hello GIL), but that would really be your first
> limit. One thread could fiddle with __builtins__ or a standard module
> and thus harass another thread, but you would know if that's what's
> going on.

I think the ideal is completely sandboxed, but it's something that I 
understand I may need to make compromises on. The bare minimum would be 
protection against inadvertent interaction. Better yet would be a setup 
that made such interaction annoyingly difficult, and the ideal would be 
where it was impossible to interfere. My approaching this problem with 
interpreters was based on an assumption that it might provide a 
reasonable level of isolation- perhaps not ideal, but hopefully good enough.

The closest analogy for understanding would be browser plugins: Scripts 
from multiple authors who for the most part aren't looking to create 
deliberate incompatibilities or interference between plugins. The 
isolation is basic, and some effort is made to make sure that one plugin 
can't cripple another trivially, but the protection is not exhaustive.

Strangely enough, the GIL restriction isn't a big one in this case. For 
the application, the common case is actually one script running at a 
time, with other scripts waiting or not running at that time. They do 
sometimes overlap, but this isn't the common case. If it turned out that 
only one script could be progressing at a time, it's an annoyance but 
not a deal-breaker. If it's suboptimal (as seems to be the case), then 
it's actually not a major issue.

With the single interpreter and multiple thread approach suggested, do 
you know if this will work with threads created externally to Python, 
ie. if I can create a thread in my application as normal, and then call 
something like PyGILState_Ensure() to make sure that Python has the 
internals it needs to work with it, and then use the GIL (or similar) to 
ensure that accesses to it remain thread-safe? If the answer is yes I 
can integrate such a thing more easily as an experiment. If it requires 
calling a dedicated "control" script that feeds out threads then it 
would need a fair bit more mucking about to integrate- I'd like to avoid 
this if possible.

Cheers,
Garth

Re: Embedding multiple interpreters

Hi Chris,

On 06/12/13 19:57, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
>> PS. Apologies if any of these messages come through more than once. Most
>> lists that I've posted to set reply-to meaning a normal reply can be used,
>> but python-list does not seem to. The replies I have sent manually to
>> python-list@python.org instead don't seem to have appeared. I'm not quite
>> sure what is happening- apologies for any blundering around on my part
>> trying to figure it out.
>
> They are coming through more than once. If you're subscribed to the
> list, sending to python-list@python.org should be all you need to do -
> where else are they going?

I think I've got myself sorted out now. The mailing list settings are a 
bit different from what I am used to and I just need to reply to 
messages differently than I normally do.

My first attempt at each of three emails went to the wrong place. The
second attempt at each appeared to have disappeared into the ether, so I
assumed non-delivery, but I was incorrect: they all actually arrived,
along with my third attempt at each.

Apologies to all for the inadvertent noise.

Cheers,
Garth

Re: Embedding multiple interpreters

2013-12-06 Thread Tim Golden

On 06/12/2013 09:27, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
>> PS. Apologies if any of these messages come through more than once. Most
>> lists that I've posted to set reply-to meaning a normal reply can be used,
>> but python-list does not seem to. The replies I have sent manually to
>> python-list@python.org instead don't seem to have appeared. I'm not quite
>> sure what is happening- apologies for any blundering around on my part
>> trying to figure it out.
> 
> They are coming through more than once. If you're subscribed to the
> list, sending to python-list@python.org should be all you need to do -
> where else are they going?


I released a batch from the moderation queue from Garthy first thing
this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check
first as to whether they'd already got through to the list some other way.

TJG



Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Ned Batchelder


On 12/6/13 4:23 AM, Mark Lawrence wrote:
> On 06/12/2013 06:23, iMath wrote:
>
> Dearest iMath, wouldst thou be kind enough to partake of obtaining some
> type of email client that dost not sendeth double spaced data into this
> most illustrious of mailing lists/newsgroups.  Thanking thee for thine
> participation in my most humble of requests.  I do remain your most
> obedient servant.



iMath seems to be a native Chinese speaker.  I think this message, 
though amusing, will be baffling and won't have any effect...


--Ned.


Re: Embedding multiple interpreters

On Fri, Dec 6, 2013 at 8:35 PM, Garthy wrote:
> I think the ideal is completely sandboxed, but it's something that I
> understand I may need to make compromises on. The bare minimum would be
> protection against inadvertent interaction. Better yet would be a setup that
> made such interaction annoyingly difficult, and the ideal would be where it
> was impossible to interfere.

In Python, "impossible to interfere" is a pipe dream. There's no way
to stop Python from fiddling around with the file system, and if
ctypes is available, with memory in the running program. The only way
to engineer that kind of protection is to prevent _the whole process_
from doing those things (using OS features, not Python features),
hence the need to split the code out into another process (which might
be chrooted, might be running as a user with no privileges, etc).
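
As a rough sketch of the "another process" idea, the snippet below runs a piece of code in a separate interpreter process and shows that tampering done there does not reach the parent. It only demonstrates process separation; it does not set up chroot, an unprivileged user, or any other OS-level restriction. The names CHILD_CODE, run_isolated and tampered are made up for the example.

```python
import builtins
import subprocess
import sys

# Hypothetical untrusted snippet: it tampers with builtins in its own process.
CHILD_CODE = "import builtins; builtins.tampered = True; print('child done')"

def run_isolated(code):
    """Run code in a fresh Python process and return (exit status, stdout)."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return completed.returncode, completed.stdout

returncode, output = run_isolated(CHILD_CODE)
# The child's change to builtins never reaches this (parent) interpreter.
parent_untouched = not hasattr(builtins, "tampered")
```

The price is that every exchange with the child now goes through pipes or some other IPC channel.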

A setup that makes such interaction "annoyingly difficult" is possible
as long as your users don't think Ruby. For instance:

# script1.py
import sys
sys.stdout = open("logfile", "w")
while True: print("Blah blah")

# script2.py
import sys
sys.stdout = open("otherlogfile", "w")
while True: print("Bleh bleh")

These two scripts won't play nicely together, because each has
modified global state in a different module. So you'd have to set that
as a rule. (For this specific example, you probably want to capture
stdout/stderr to some sort of global log file anyway, and/or use the
logging module, but it makes a simple example.) Most Python scripts
aren't going to do this sort of thing, or if they do, will do very
little of it. Monkey-patching other people's code is a VERY rare thing
in Python.
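
One way to follow the logging suggestion, sketched under assumptions: each script gets its own named logger and handler instead of reassigning sys.stdout, so two scripts sharing an interpreter do not fight over one global. The helper names make_script_logger and run_script are invented, and in-memory buffers stand in for the log files.

```python
import io
import logging
import threading

def make_script_logger(name, stream):
    """Give one script its own logger instead of reassigning sys.stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False          # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger

def run_script(logger, text, count):
    # Stand-in for a script body; a real one would do useful work here.
    for _ in range(count):
        logger.info(text)

# In-memory buffers stand in for "logfile" and "otherlogfile".
log1, log2 = io.StringIO(), io.StringIO()
threads = [
    threading.Thread(target=run_script,
                     args=(make_script_logger("script1", log1), "Blah blah", 3)),
    threading.Thread(target=run_script,
                     args=(make_script_logger("script2", log2), "Bleh bleh", 3)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Each buffer ends up holding only its own script's lines, even though both threads ran in the same interpreter.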

> The closest analogy for understanding would be browser plugins: Scripts from
> multiple authors who for the most part aren't looking to create deliberate
> incompatibilities or interference between plugins. The isolation is basic,
> and some effort is made to make sure that one plugin can't cripple another
> trivially, but the protection is not exhaustive.

Browser plugins probably need a lot more protection - maybe it's not
exhaustive, but any time someone finds a way for one plugin to affect
another, the plugin / browser authors are going to treat it as a bug.
If I understand you, though, this is more akin to having two forms on
one page and having JS validation code for each. It's trivially easy
for one to check the other's form objects, but quite simple to avoid
too, so for the sake of encapsulation you simply stay safe.

> With the single interpreter and multiple thread approach suggested, do you
> know if this will work with threads created externally to Python, ie. if I
> can create a thread in my application as normal, and then call something
> like PyGILState_Ensure() to make sure that Python has the internals it needs
> to work with it, and then use the GIL (or similar) to ensure that accesses
> to it remain thread-safe?

Now that's something I can't help with. The only time I embedded
Python seriously was a one-Python-per-process system (arbitrary number
of processes fork()ed from one master, but each process had exactly
one Python environment and exactly one database connection, etc), and
I ended up being unable to make it secure, so I had to switch to
embedding ECMAScript (V8, specifically, as it happens... I'm morbidly
curious what my boss plans to do, now that he's fired me; he hinted at
rewriting the C++ engine in PHP, and I'd love to be a fly on the wall
as he tries to test a PHP extension for V8 and figure out whether or
not he can trust arbitrary third-party compiled code). But there'll be
someone on this list who's done threads and embedded Python.

ChrisA

Re: squeeze out some performance

On Fri, Dec 6, 2013 at 8:46 PM, Jeremy Sanders wrote:
> This sort of code is probably harder to make faster in pure python. You
> could try profiling it to see where the hot spots are. Perhaps the choice of
> arrays or sets might have some speed impact.

I'd make this recommendation MUCH stronger.

Rule 1 of optimization: Don't.
Rule 2 (for experts only): Don't yet.

Once you find that your program actually is running too slowly, then
AND ONLY THEN do you start looking at tightening something up. You'll
be amazed how little you need to change; start with good clean
idiomatic code, and then if it takes too long, you tweak just a couple
of things and it's fast enough. And when you do come to the
tweaking...

Rule 3: Measure twice, cut once.
Rule 4: Actually, measure twenty times, cut once.

Profile your code to find out what's actually slow. This is very
important. Here's an example from a real application (not in Python,
it's in a semantically-similar language called Pike):

https://github.com/Rosuav/Gypsum/blob/d9907e1507c52189c83ae25f5d7be85235b616fa/window.pike

I noticed that I could saturate one CPU core by typing commands very
quickly. Okay. That gets us past the first two rules (it's a MUD
client, it should not be able to saturate one core of an i5). The code
looks roughly like this:

paint():
    for line in lines:
        if line_is_visible:
            paint_line(line)

paint_line():
    for piece_of_text in text:
        if highlighted: draw_highlighted()
        else: draw_not_highlighted()

My first guess was that the actual drawing was taking the time, since
that's a whole lot of GTK calls. But no; the actual problem was the
iteration across all lines and then finding out if they're visible or
not (possibly because it obliterates the CPU caches). Once the
scrollback got to a million lines or so, that was prohibitively
expensive. I didn't realize that until I actually profiled the code
and _measured_ where the time was being spent.
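
To make the "measure first" advice concrete, here is a minimal Python sketch of the same situation. It is not the Pike code from the link; the function names, the 200,000-line scrollback and the slicing alternative are all assumptions made for illustration. It uses the standard cProfile and pstats modules to show where the time goes when every line is checked for visibility.

```python
import cProfile
import io
import pstats


def is_visible(index, first, last):
    """Hypothetical visibility test: is this line inside the window?"""
    return first <= index < last


def paint_line(line):
    """Stand-in for the real drawing calls; just returns the line length."""
    return len(line)


def paint_scan_all(lines, first, last):
    """Check every line for visibility, as in the pseudocode above."""
    return sum(paint_line(line)
               for index, line in enumerate(lines)
               if is_visible(index, first, last))


def paint_slice(lines, first, last):
    """Assumed alternative: only touch the lines known to be visible."""
    return sum(paint_line(line) for line in lines[first:last])


def profile(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


lines = ["line %d" % i for i in range(200000)]
first, last = len(lines) - 50, len(lines)  # only the last 50 lines are shown

result_all, report_all = profile(paint_scan_all, lines, first, last)
result_slice, report_slice = profile(paint_slice, lines, first, last)
print(report_all)
print(report_slice)
```

Both functions paint the same 50 lines, but the first report should be dominated by the calls to is_visible rather than by paint_line, which is the kind of surprise that only shows up once you measure.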

How fast does your code run? How fast do you need it to run? Lots of
optimization questions are answered by "Yaknow what, it don't even
matter", unless you're running in a tight loop, or on a
microcontroller, or something. Halving the time taken sounds great
until you see that it's currently taking 0.0001 seconds and happens in
response to user action.

ChrisA

[newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean Dubois

I'm trying out Tkinter with the (non object oriented) code fragment below:
It works partially as I expected, but I thought that pressing "1" would
cause the program to quit, however I get this message:
TypeError: quit() takes no arguments (1 given), I tried changing quit to quit()
but that makes things even worse. So my question: can anyone here help me
debug this?

#!/usr/bin/env python
import Tkinter as tk
def quit():
    sys.exit()
root = tk.Tk()
label = tk.Label(root, text="Hello, world")
label.pack()
label.bind("<1>", quit)
root.mainloop()

p.s. I like the code not object orientated

Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean-Michel Pichavant

- Original Message -
> I'm trying out Tkinter with the (non object oriented) code fragment
> below:
> It works partially as I expected, but I thought that pressing "1"
> would
> cause the program to quit, however I get this message:
> TypeError: quit() takes no arguments (1 given), I tried changing quit
> to quit()
> but that makes things even worse. So my question: can anyone here
> help me
> debug this?
> 
> #!/usr/bin/env python
> import Tkinter as tk
> def quit():
>     sys.exit()
> root = tk.Tk()
> label = tk.Label(root, text="Hello, world")
> label.pack()
> label.bind("<1>", quit)
> root.mainloop()
> 
> p.s. I like the code not object orientated
> --
> https://mail.python.org/mailman/listinfo/python-list
> 

The engine is probably passing an argument to your quit callback.

Try:

def quit(param):
  sys.exit(str(param))

You probably don't even care about the parameter:

def quit(param):
  sys.exit()

JM


-- IMPORTANT NOTICE: 

The contents of this email and any attachments are confidential and may also be 
privileged. If you are not the intended recipient, please notify the sender 
immediately and do not disclose the contents to any other person, use it for 
any purpose, or store or copy the information in any medium. Thank you.

Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Daniel Watkins

Hi Jean,

On Fri, Dec 06, 2013 at 04:24:59AM -0800, Jean Dubois wrote:
> I'm trying out Tkinter with the (non object oriented) code fragment below:
> It works partially as I expected, but I thought that pressing "1" would
> cause the program to quit, however I get this message:
> TypeError: quit() takes no arguments (1 given), I tried changing quit to 
> quit()
> but that makes things even worse. So my question: can anyone here help me
> debug this?

I don't know the details of the Tkinter library, but you could find out
what quit is being passed by modifying it to take a single parameter and
printing it out (or using pdb):

def quit(param):
    print(param)
    sys.exit()

Having taken a quick look at the documentation, it looks like event
handlers (like your quit function) are passed the event that triggered
them.  So you can probably just ignore the parameter:

def quit(_):
    sys.exit()


Cheers,

Dan

Re: Managing Google Groups headaches

On Friday, December 6, 2013 1:06:30 PM UTC+5:30, Roy Smith wrote:
>  Rusi  wrote:

> > On Thursday, December 5, 2013 6:28:54 AM UTC+5:30, Roy Smith wrote:

> > > The real problem with web forums is they conflate transport and 
> > > presentation into a single opaque blob, and are pretty much universally 
> > > designed to be a closed system.  Mail and usenet were both engineered to 
> > > make a sharp division between transport and presentation, which meant it 
> > > was possible to evolve each at their own pace.
> > > Mostly that meant people could go off and develop new client 
> > > applications which interoperated with the existing system.  But, it also 
> > > meant that transport layers could be switched out (as when NNTP 
> > > gradually, but inexorably, replaced UUCP as the primary usenet transport 
> > > layer).
> > There is a deep assumption hovering round-about the above -- what I
> > will call the 'Unix assumption(s)'.

> It has nothing to do with Unix.  The separation of transport from 
> presentation is just as valid on Windows, Mac, etc.

> > But before that, just a check on
> > terminology. By 'presentation' you mean what people normally call
> > 'mail-clients': thunderbird, mutt etc. And by 'transport' you mean
> > sendmail, exim, qmail etc etc -- what normally are called
> > 'mail-servers.'  Right??

> Yes.

> > Assuming this is the intended meaning of the terminology (yeah its
> > clearer terminology than the usual and yeah Im also a 'Unix-guy'),
> > here's the 'Unix-assumption':
> >   - human communication�
> > (is not very different from)
> >   - machine communication�
> > (can be done by)
> >   - text�
> > (for which)
> >   - ASCII is fine�
> > (which is just)
> >   - bytes�
> > (inside/between byte-memory-organized)
> >   - von Neumann computers
> > To the extent that these assumptions are invalid, the 'opaque-blob'
> > may well be preferable.

> I think you're off on the wrong track here.  This has nothing to do with 
> plain text (ascii or otherwise).  It has to do with divorcing how you 
> store and transport messages (be they plain text, HTML, or whatever) 
> from how a user interacts with them.

Evidently (and completely inadvertently) this exchange has just
illustrated one of the inadmissible assumptions:

"unicode as a medium is universal in the same way that ASCII used to be"

I wrote a number of ellipsis characters, ie code point U+2026, as in:

  - human communication…
(is not very different from)
  - machine communication… 

Somewhere between my sending and your quoting those ellipses became
the replacement character U+FFFD:

> >   - human communication�
> > (is not very different from)
> >   - machine communication�

Leaving aside whose fault this is (very likely buggy google groups),
this mojibaking cannot happen if the assumption "All text is ASCII"
were to uniformly hold.
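
A minimal sketch of one way an ellipsis can turn into U+FFFD. This is an assumption for illustration, not a diagnosis of what Google Groups actually did: the character is encoded with one codec (here Windows-1252) and decoded with another (here UTF-8) using the "replace" error handler.

```python
ellipsis = "\u2026"  # HORIZONTAL ELLIPSIS

# Sender side (assumed): encode as Windows-1252, a single byte.
raw = ellipsis.encode("cp1252")
print(raw)

# Receiver side (assumed): decode the same byte as UTF-8.
# A lone 0x85 is not valid UTF-8, so it is replaced with U+FFFD.
garbled = raw.decode("utf-8", errors="replace")
print(ascii(garbled))

# When both sides agree on the encoding, the character survives.
intact = ellipsis.encode("utf-8").decode("utf-8")
print(ascii(intact))
```

The text itself never changed; only the agreement about how to read the bytes did.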

Of course with unicode also this can be made to not happen, but that
is fragile and error-prone.  And that is because ASCII (not extended)
is ONE thing in a way that unicode is hopelessly a motley inconsistent
variety.

With unicode there are in-memory formats, transportation formats eg
UTF-8, strange beasties like FSR (which then hopelessly and
inveterately tickle our resident trolls!), multi-layer encodings (in
html), BOMs and unnecessary/inconsistent BOMs (in microsoft-notepad).
With ASCII, ASCII is ASCII; ie "ABC" is 65,66,67 whether it's in-core,
in-file, in-pipe or whatever.  Ok there are a few wrinkles to this
eg. the null-terminator in C-strings. I think this is the exception to
the rule that in classic Unix, ASCII is completely inter-operable and
therefore a universal data-structure for inter-process or inter-machine
communication.
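
The "ABC is 65,66,67" point can be checked directly. The sketch below is only an illustration; the particular list of encodings is my own choice. It prints the bytes that the same three characters become under ASCII and under a few Unicode transport formats, with and without a BOM.

```python
text = "ABC"

# ASCII: one fixed byte per character, the same everywhere.
ascii_bytes = text.encode("ascii")
print(list(ascii_bytes))

# The same text under several Unicode encodings.
encoded = {}
for name in ["utf-8", "utf-8-sig", "utf-16-le", "utf-16-be", "utf-32-le"]:
    encoded[name] = text.encode(name)
    print("%-10s %r" % (name, encoded[name]))
```

For pure ASCII text, plain UTF-8 produces exactly the ASCII bytes; the differences appear with the BOM variant and the wider encodings.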

It is this universal data structure that makes classic unix pipes and
filters possible and easy (of which your separation of presentation
and transportation is just one case).

Give it up and the composability goes with it.

Go up from the ASCII -> Unicode level to the plain-text -> hypertext
(aka html) level and these composability problems hit with redoubled
force.

> Take something like Wikipedia (by which, I really mean, MediaWiki, which 
> is the underlying software package).  Most people think of Wikipedia as 
> a web site.  But, there's another layer below that which lets you get 
> access to the contents of articles, navigate all the rich connections 
> like category trees, and all sorts of metadata like edit histories.  
> Which means, if I wanted to (and many examples of this exist), I can 
> write my own client which presents the same information in different 
> ways.

Not sure what your point is.
Html is a universal data-structuring format -- ok for presentation, bad for
data-structuring.
SQL databases (assuming that's the mediawiki backend) are another -- ok for
data-structuring, bad for presentation.

Mediawiki mediates between the two formats.

Beyond that I lost you... what are you trying to say??

Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean Dubois

On Friday, 6 December 2013 13:30:53 UTC+1, Daniel Watkins wrote:
> Hi Jean,
>
> On Fri, Dec 06, 2013 at 04:24:59AM -0800, Jean Dubois wrote:
> > I'm trying out Tkinter with the (non object oriented) code fragment below:
> > It works partially as I expected, but I thought that pressing "1" would
> > cause the program to quit, however I get this message:
> > TypeError: quit() takes no arguments (1 given), I tried changing quit to
> > quit()
> > but that makes things even worse. So my question: can anyone here help me
> > debug this?
>
> I don't know the details of the Tkinter library, but you could find out
> what quit is being passed by modifying it to take a single parameter and
> printing it out (or using pdb):
>
> def quit(param):
>     print(param)
>     sys.exit()
>
> Having taken a quick look at the documentation, it looks like event
> handlers (like your quit function) are passed the event that triggered
> them.  So you can probably just ignore the parameter:
>
> def quit(_):
>     sys.exit()
>
> Cheers,
>
> Dan

I tried out your suggestions and discovered that I had to add the line
import sys to the program. You can see below what I came up with.
It works, but it's not entirely clear to me. Can you tell me what
"label.bind("<1>", quit)" stands for? What does the <1> mean?



#!/usr/bin/env python
import Tkinter as tk
import sys
#underscore is necessary in the following line
def quit(_):
    sys.exit()
root = tk.Tk()
label = tk.Label(root, text="Click mouse here to quit")
label.pack()
label.bind("<1>", quit)
root.mainloop()

thanks
jean



Re: Managing Google Groups headaches

On Sat, Dec 7, 2013 at 12:03 AM, rusi  wrote:
> SQL databases (assuming thats the mediawiki backend) is another -- ok for
> data-structuring bad for presentation.

No, SQL databases don't store structured text. MediaWiki just stores a
single blob (not in the database sense of that word) of text.

ChrisA

Re: Managing Google Groups headaches

On Friday, December 6, 2013 6:49:04 PM UTC+5:30, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 12:03 AM, rusi wrote:
> > SQL databases (assuming thats the mediawiki backend) is another -- ok for
> > data-structuring bad for presentation.

> No, SQL databases don't store structured text. MediaWiki just stores a
> single blob (not in the database sense of that word) of text.

I guess we are using 'structured' in different ways.  All I am saying
is that mediawiki which seems to present as html, actually stores its
stuff as SQL -- nothing more or less structured than the schemas here:
http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

Re: Managing Google Groups headaches

On Sat, Dec 7, 2013 at 12:32 AM, rusi  wrote:
> I guess we are using 'structured' in different ways.  All I am saying
> is that mediawiki which seems to present as html, actually stores its
> stuff as SQL -- nothing more or less structured than the schemas here:
> http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

Yeah, but the structure is all about the metadata. Ultimately, there's
one single text field containing the entire content as you would see
it in the page editor: wiki markup in straight text. MediaWiki uses an
SQL database to store that lump of text, but ultimately the
relationship is between wikitext and HTML, no SQL involvement.

Wiki markup is reasonable for text structuring. (Not for generic data
structuring, but it's decent for text.) Same with reStructuredText,
used for PEPs. An SQL database is a good way to store mappings of
"this key, this tuple of data" and retrieve them conveniently,
including (and this is the bit that's more complicated in a straight
Python dictionary) using any value out of the tuple as the key, and
(and this is where a dict *really* can't hack it) storing/retrieving
more data than fits in memory. The two are orthogonal. Your point is
better supported by wikitext than by SQL, here, except that there
aren't fifty other systems that parse and display wikitext. In fact,
what you're suggesting is a good argument for deprecating HTML email
in favour of RST email, and using docutils to render the result either
as HTML (for webmail users) or as some other format. And I wouldn't be
against that :) But good luck convincing the world that Microsoft
Outlook is doing the wrong thing.
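
A small sketch of the "this key, this tuple of data" idea, using the standard sqlite3 module. The table and its columns are hypothetical and are not MediaWiki's actual schema; the point is only that the database retrieves rows by the key or by any other value in the tuple, while the wikitext column stays an opaque lump of text as far as SQL is concerned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: key -> tuple of (author, wikitext).
conn.execute(
    "CREATE TABLE pages (title TEXT PRIMARY KEY, author TEXT, wikitext TEXT)"
)
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("Python", "alice", "'''Python''' is a [[programming language]]."),
        ("Pike", "bob", "'''Pike''' is a [[programming language]]."),
        ("GTK", "alice", "'''GTK''' is a [[widget toolkit]]."),
    ],
)

# Retrieve by the key, much like a dictionary lookup.
by_title = conn.execute(
    "SELECT author FROM pages WHERE title = ?", ("Pike",)
).fetchone()

# Retrieve by another value out of the tuple, which a plain dict
# cannot do without building a second index by hand.
by_author = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM pages WHERE author = ? ORDER BY title", ("alice",)
    )
]

print(by_title, by_author)
conn.close()
```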

ChrisA

Re: Sharing Python installation between architectures

2013-12-06 Thread Albert van der Horst

In article ,
Paul Smith   wrote:
>One thing I always liked about Perl was the way you can create a single
>installation directory which can be shared between archictures.  Say
>what you will about the language: the Porters have an enormous amount of
>experience and expertise producing portable and flexible interpreter
>installations.
>
>By this I mean, basically, multiple architectures (Linux, Solaris,
>MacOSX, even Windows) sharing the same $prefix/lib/python2.7 directory.
>The large majority of the contents there are completely portable across
>architectures (aren't they?) so why should I have to duplicate many
>megabytes worth of files?

The solution is of course to replace all duplicates by hard links.
A tool for this is useful in a lot of other circumstances too.
In a re-installation of the whole or parts, the hard links
will be removed, and the actual files are only removed if they aren't needed
for any of the installations, so this is transparent for reinstallation.
After a lot of reinstallation you want to run the tool again.

This is of course only possible on real file systems (probably not on FAT),
but your files reside on a server, so chances are they are on a real file
system.
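
A minimal sketch of such a hard-link tool in Python, under stated assumptions: hardlink_duplicates and file_digest are hypothetical names, files are compared by SHA-256 of their contents only (ownership, permissions and timestamps are ignored), and everything lives on one file system that supports hard links.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hardlink_duplicates(root):
    """Replace byte-identical regular files under root with hard links.

    Returns the number of files that were relinked.
    """
    first_seen = {}  # content digest -> path of the first file with it
    relinked = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # leave symlinks and special files alone
            original = first_seen.setdefault(file_digest(path), path)
            if original == path or os.path.samefile(original, path):
                continue  # first copy, or already linked
            # Link under a temporary name, then rename over the duplicate,
            # so the path never goes missing in between.
            temporary = path + ".hardlink-tmp"
            os.link(original, temporary)
            os.replace(temporary, path)
            relinked += 1
    return relinked
```

Calling hardlink_duplicates("/path/to/prefix") on a shared install tree would leave one copy of each identical file on disk. Running it again after a reinstallation picks up any links that were broken, which matches the "run the tool again" step above.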

(The above is partly in jest. It is a real solution to storage problems,
but storage problems are unheard of in these days of terabyte disks.
It doesn't help with the clutter, which was probably the main motivation.)

Symbolic links are not as transparent, but they may work very well too.
Have the common part set apart and replace everything else by symbolic links.

There is always one more way to skin a cat.

Groetjes Albert
-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


Re: Embedding multiple interpreters



Hi Chris,

On 06/12/13 22:27, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 8:35 PM, Garthy
>   wrote:
>> I think the ideal is completely sandboxed, but it's something that I
>> understand I may need to make compromises on. The bare minimum would be
>> protection against inadvertent interaction. Better yet would be a setup that
>> made such interaction annoyingly difficult, and the ideal would be where it
>> was impossible to interfere.
>
> In Python, "impossible to interfere" is a pipe dream. There's no way
> to stop Python from fiddling around with the file system, and if
> ctypes is available, with memory in the running program. The only way
> to engineer that kind of protection is to prevent _the whole process_
> from doing those things (using OS features, not Python features),
> hence the need to split the code out into another process (which might
> be chrooted, might be running as a user with no privileges, etc).

Absolutely- it would be an impractical ideal. If it was my highest and 
only priority, CPython might not be the best place to start. But there 
are plenty of other factors that make Python very desirable to use 
regardless. :) Re file and ctype-style functionality, that is something 
I'm going to have to find a way to limit somewhat. But first things 
first: I need to see what I can accomplish re initial embedding with a 
reasonable amount of work.


> A setup that makes such interaction "annoyingly difficult" is possible
> as long as your users don't think Ruby. For instance:
>
> # script1.py
> import sys
> sys.stdout = open("logfile", "w")
> while True: print("Blah blah")
>
> # script2.py
> import sys
> sys.stdout = open("otherlogfile", "w")
> while True: print("Bleh bleh")
>
>
> These two scripts won't play nicely together, because each has
> modified global state in a different module. So you'd have to set that
> as a rule. (For this specific example, you probably want to capture
> stdout/stderr to some sort of global log file anyway, and/or use the
> logging module, but it makes a simple example.)
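
The interference in the two scripts above can be reproduced in one interpreter without real files. This sketch is an illustration only: it runs two tiny stand-in scripts with exec in separate namespaces and uses in-memory buffers in place of the log files. Separate namespaces do not help, because sys.stdout lives in the shared sys module.

```python
import io
import sys

log1, log2 = io.StringIO(), io.StringIO()
namespace1 = {"log": log1}  # globals for the first stand-in script
namespace2 = {"log": log2}  # globals for the second stand-in script

saved_stdout = sys.stdout
try:
    # "script1" redirects stdout to its own log and prints.
    exec("import sys\nsys.stdout = log\nprint('Blah blah')", namespace1)
    # "script2" does the same, replacing the redirection made by script1.
    exec("import sys\nsys.stdout = log\nprint('Bleh bleh')", namespace2)
    # script1 carries on, unaware that its output now goes elsewhere.
    exec("print('Blah again')", namespace1)
finally:
    sys.stdout = saved_stdout

print(repr(log1.getvalue()))
print(repr(log2.getvalue()))
```

The second line from script1 ends up in the log belonging to script2, which is exactly the kind of inadvertent interaction being discussed.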

Thanks for the example. Hopefully I can minimise the cases where this 
would potentially be a problem. Modifying the basic environment and the 
source is something I can do readily if needed.


Re stdout/stderr, on that subject I actually wrote a replacement log 
catcher for embedded Python a few years back. I can't remember how on 
earth I did it now, but I've still got the code that did it somewhere.


> Most Python scripts
> aren't going to do this sort of thing, or if they do, will do very
> little of it. Monkey-patching other people's code is a VERY rare thing
> in Python.

That's good to hear. :)

>> The closest analogy for understanding would be browser plugins: Scripts from
>> multiple authors who for the most part aren't looking to create deliberate
>> incompatibilities or interference between plugins. The isolation is basic,
>> and some effort is made to make sure that one plugin can't cripple another
>> trivially, but the protection is not exhaustive.
>
> Browser plugins probably need a lot more protection - maybe it's not
> exhaustive, but any time someone finds a way for one plugin to affect
> another, the plugin / browser authors are going to treat it as a bug.
> If I understand you, though, this is more akin to having two forms on
> one page and having JS validation code for each. It's trivially easy
> for one to check the other's form objects, but quite simple to avoid
> too, so for the sake of encapsulation you simply stay safe.

There have been cases where browser plugins have played funny games to 
mess with the behaviour of other plugins (eg. one plugin removing 
entries from the configuration of another). It's certainly not ideal, 
but it comes from the environment being not entirely locked down, and 
one plugin author being inclined enough to make destructive changes that 
impact another. I think the right effort/reward ratio will mean I end up 
in a similar place.


I know it's not the best analogy, but it was one that readily came to 
mind. :)


>> With the single interpreter and multiple thread approach suggested, do you
>> know if this will work with threads created externally to Python, ie. if I
>> can create a thread in my application as normal, and then call something
>> like PyGILState_Ensure() to make sure that Python has the internals it needs
>> to work with it, and then use the GIL (or similar) to ensure that accesses
>> to it remain thread-safe?
>
> Now that's something I can't help with. The only time I embedded
> Python seriously was a one-Python-per-process system (arbitrary number
> of processes fork()ed from one master, but each process had exactly
> one Python environment and exactly one database connection, etc), and
> I ended up being unable to make it secure, so I had to switch to
> embedding ECMAScript (V8, specifically, as it happens... I'm morbidly
> curious what my boss plans to do, now that he's fired me; he hinted at
> rewri

Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean-Michel Pichavant

> I tried out your suggestions and discovered that I had the line
> import sys to the program. So you can see below what I came up with.
> It works but it's not all clear to me. Can you tell me what
> "label.bind("<1>", quit)" is standing for? What's the <1> meaning?
> 
> 
> 
> #!/usr/bin/env python
> import Tkinter as tk
> import sys
> #underscore is necessary in the following line
> def quit(_):
>     sys.exit()
> root = tk.Tk()
> label = tk.Label(root, text="Click mouse here to quit")
> label.pack()
> label.bind("<1>", quit)
> root.mainloop()
> 
> thanks
> jean

The best thing to do would be to read
http://effbot.org/tkinterbook/tkinter-events-and-bindings.htm


"<1>" is the identifier for your mouse button 1.
quit is the callback the label calls when it receives a mouse button 1 click event.

Note that the parameter given to your quit callback is the event.
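
Put together, a version of the program that uses the event could look like the sketch below. It is written for Python 3, where the module is named tkinter rather than Tkinter, and the names on_click and main are my own. "<Button-1>" is the long form of "<1>".

```python
import sys


def on_click(event):
    """Bound callbacks receive one argument: the event object."""
    # For mouse button events, event.num holds the button number.
    print("mouse button", getattr(event, "num", None), "clicked")
    sys.exit()


def main():
    # Imported here so the callback can be tried without a display.
    import tkinter as tk  # Python 3 name; Python 2 uses "Tkinter"

    root = tk.Tk()
    label = tk.Label(root, text="Click mouse here to quit")
    label.pack()
    label.bind("<Button-1>", on_click)  # same binding as "<1>"
    root.mainloop()
```

Calling main() opens the window; clicking the label with the left mouse button prints the button number and exits.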

JM





Re: Embedding multiple interpreters



Hi Tim,

On 06/12/13 20:47, Tim Golden wrote:
> On 06/12/2013 09:27, Chris Angelico wrote:
>> On Fri, Dec 6, 2013 at 7:21 PM, Garthy
>>   wrote:
>>> PS. Apologies if any of these messages come through more than once. Most
>>> lists that I've posted to set reply-to meaning a normal reply can be used,
>>> but python-list does not seem to. The replies I have sent manually to
>>> python-list@python.org instead don't seem to have appeared. I'm not quite
>>> sure what is happening- apologies for any blundering around on my part
>>> trying to figure it out.
>>
>> They are coming through more than once. If you're subscribed to the
>> list, sending to python-list@python.org should be all you need to do -
>> where else are they going?
>
> I released a batch from the moderation queue from Garthy first thing
> this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check
> first as to whether they'd already got through to the list some other way.


I had to make a call between re-sending posts that might have gone 
missing, or seemingly not responding promptly when people had taken the 
time to answer my complex query. I made a call to re-send, and it was 
the wrong one. The fault for the double-posting is entirely mine.


Cheers,
Garth

Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?

2013-12-06 Thread Neil Cerutti

On 2013-12-04, Piotr Dobrogost
 wrote:
> On Wednesday, December 4, 2013 10:41:49 PM UTC+1, Neil Cerutti
> wrote:
>> not something to do commonly. Your proposed syntax leaves the
>> distinction between valid and invalid identifiers a problem
>> the programmer has to deal with. It doesn't unify access to
>> attributes the way the getattr and setattr do.
>
> Taking into account that obj.'x' would be equivalent to obj.x
> any attribute can be accessed with the new syntax. I don't see
> how this is not unified access compared to using getattr
> instead dot...

I thought of that argument later the next day. Your proposal does
unify access if the old obj.x syntax is removed.
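
For reference, this is the existing getattr/setattr route the thread is comparing against, as a short sketch. Record is a hypothetical class and "not-an-identifier" is just an example name.

```python
class Record:
    """Hypothetical plain class used only for this illustration."""


obj = Record()

# Ordinary attribute: dot syntax and getattr agree.
obj.x = 1
same = getattr(obj, "x") == obj.x

# A name that is not a valid identifier can only be reached through
# setattr/getattr (or the instance dictionary), never through obj.name.
setattr(obj, "not-an-identifier", 2)
value = getattr(obj, "not-an-identifier")

print(same, value, sorted(vars(obj)))
```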

-- 
Neil Cerutti


Re: Managing Google Groups headaches

On Friday, December 6, 2013 7:18:19 PM UTC+5:30, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 12:32 AM, rusi  wrote:
> > I guess we are using 'structured' in different ways.  All I am saying
> > is that mediawiki which seems to present as html, actually stores its
> > stuff as SQL -- nothing more or less structured than the schemas here:
> > http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

> Yeah, but the structure is all about the metadata.

Ok (I'd drop the 'all')

> Ultimately, there's one single text field containing the entire content

Right

> as you would see it in the page editor: wiki markup in straight text.

Aha! There you are! It's the 'page editor' text here, and not the html that
'display source' (control-u) would show in a browser. And mediawiki
is the software that mediates.

The usual direction (seen by users of wikipedia) is that mediawiki
takes this text, along with the other unrelated stuff (metadata?) seen
around it -- sidebar, tabs etc, css settings -- and munges it all into html.

The other direction (seen by editors of wikipedia) is that you edit a
page, and that page and its history etc will show the changes,
reflecting the fact that the SQL content has changed.

> MediaWiki uses an SQL database to store that lump of text, but
> ultimately the relationship is between wikitext and HTML, no SQL
> involvement.


Dunno what you mean. Every time someone browses wikipedia, things are
getting pulled out of the SQL and munged into the html (s)he sees.

Re: Managing Google Groups headaches

On Sat, Dec 7, 2013 at 1:11 AM, rusi  wrote:
> Aha! There you are! Its 'page editor' here and not the html which
> 'display source' (control-u) which a browser would show. And wikimedia
> is the software that mediates.
>
> The usual direction (seen by users of wikipedia) is that wikimedia
> takes this text, along with the other unrelated (metadata?) seen
> around -- sidebar, tabs etc, css settings and munges it all into html
>
> The other direction (seen by editors of wikipedia) is that you edit a
> page and that page and history etc will show the changes,
> reflecting the fact that the SQL content has changed.

MediaWiki is fundamentally very similar to a structure that I'm trying
to deploy for a community web site that I host, approximately thus:

* A git repository stores a bunch of RST files
* A script auto-generates index files based on the presence of certain
file names, and renders via rst2html
* The HTML pages are served as static content

MediaWiki is like this:

* Each page has a history, represented by a series of state snapshots
of wikitext
* On display, the wikitext is converted to HTML and served.

The main difference is that MediaWiki is optimized for rapid and
constant editing, where what I'm pushing for is optimized for less
common edits that might span multiple files. (MW has no facility for
atomically changing multiple pages, and atomically reverting those
changes, and so on. Each page stands alone.) They're still broadly
doing the same thing: storing marked-up text and rendering HTML. The
fact that one uses an SQL database and the other uses a git repository
is actually quite insignificant - it's as significant as the choice of
whether to store your data on a hard disk or an SSD. The system is no
different.

>> MediaWiki uses an SQL database to store that lump of text, but
>> ultimately the relationship is between wikitext and HTML, no SQL
>> involvement.
>
> Dunno what you mean. Every time someone browses wikipedia, things are
> getting pulled out of the SQL and munged into the html (s)he sees.

Yes, but that's just mechanics. The fact that the PHP scripts to
operate Wikipedia are being pulled off a file system doesn't mean that
MediaWiki is an ext3-to-HTML renderer. It's a wikitext-to-HTML
renderer.

Anyway. As I said, your point is still mostly there, as long as you
use wikitext rather than SQL.

ChrisA

Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread iMath

On Friday, 6 December 2013 at 5:23:59 PM UTC+8, Mark Lawrence wrote:
> On 06/12/2013 06:23, iMath wrote:
> 
> 
> 
> Dearest iMath, wouldst thou be kind enough to partake of obtaining some 
> 
> type of email client that dost not sendeth double spaced data into this 
> 
> most illustrious of mailing lists/newsgroups.  Thanking thee for thine 
> 
> participation in my most humble of requests.  I do remain your most 
> 
> obedient servant.
> 
> 
> 
> -- 
> 
> My fellow Pythonistas, ask not what our language can do for you, ask 
> 
> what you can do for our language.
> 
> 
> 
> Mark Lawrence

Yes, I am a native Chinese speaker. I always post questions through Google
Groups, not through email. Is there something wrong with that?
Your English is a little strange to me.

Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread iMath

On Wednesday, 4 December 2013 at 6:51:49 PM UTC+8, Chris Angelico wrote:
> On Wed, Dec 4, 2013 at 8:38 PM, Andreas Perstinger  
> wrote:
> 
> > "fp" is a file object, but subprocess expects a list of strings as
> 
> > its first argument.
> 
> 
> 
> More fundamentally: The subprocess's arguments must include the *name*
> 
> of the file. This means you can't use TemporaryFile at all, as it's
> 
> not guaranteed to return an object that actually has a file name.
> 
> 
> 
> There's another problem, too, and that's that you're not closing the
> 
> file before expecting the subprocess to open it. And once you do that,
> 
> you'll find that the file no longer exists once it's been closed. In
> 
> fact, you'll need to research the tempfile module a bit to be able to
> 
> do what you want here; rather than spoon-feed you an exact solution,
> 
> I'll just say that there is one, and it can be found here:
> 
> 
> 
> http://docs.python.org/3.3/library/tempfile.html
> 
> 
> 
> ChrisA

I think you mean I should create the temporary file with NamedTemporaryFile().
After trying it many times, I found that a temporary file is hardly more
convenient than a persistent one here: the temporary file can't be used by the
command until it has been closed, so we can't rely on it deleting itself
automatically when it is closed, and we have to delete it afterwards with
os.remove() once the command line has used it.

Here is the code without the with statement. It is wrong, though; it shows this line:

c:\docume~1\admini~1\locals~1\temp\tmp0d8959: Invalid data found when 
processing input


fp=tempfile.NamedTemporaryFile(delete=False)
fp.write(("file '"+fileName1+"'\n").encode('utf-8')) 
fp.write(("file '"+fileName2+"'\n").encode('utf-8')) 


subprocess.call(['ffmpeg', '-f', 'concat','-i',fp.name, '-c',  'copy', 
fileName])
fp.close()

Re: using ffmpeg command line with python's subprocess module

On Sat, Dec 7, 2013 at 1:54 AM, iMath  wrote:
> fp=tempfile.NamedTemporaryFile(delete=False)
> fp.write(("file '"+fileName1+"'\n").encode('utf-8'))
> fp.write(("file '"+fileName2+"'\n").encode('utf-8'))
>
>
> subprocess.call(['ffmpeg', '-f', 'concat','-i',fp.name, '-c',  'copy', 
> fileName])
> fp.close()

You need to close the file before getting the other process to use it.
Otherwise, it may not be able to open the file at all, and even if it
can, you might find that not all the data has been written.

But congrats! You have successfully found the points I was directing
you to. Yes, I was hinting that you need NamedTemporaryFile, the .name
attribute, and delete=False. Good job!
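
Putting those three pieces together with the "close it first" advice gives something like the sketch below. It is written for Python 3; write_concat_list and concat are hypothetical helper names, the ffmpeg arguments are the ones from the quoted code, and file names containing a single quote would need escaping that is not handled here.

```python
import os
import subprocess
import tempfile


def write_concat_list(paths):
    """Write an ffmpeg concat list to a temporary file and return its name.

    The file is closed (and therefore flushed) when the with block ends,
    but delete=False keeps it on disk for the other process to open.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as fp:
        for path in paths:
            fp.write("file '{}'\n".format(path))
    return fp.name


def concat(paths, output):
    """Concatenate the input files into output; return ffmpeg's exit code."""
    list_name = write_concat_list(paths)
    try:
        return subprocess.call(
            ["ffmpeg", "-f", "concat", "-i", list_name, "-c", "copy", output]
        )
    finally:
        os.remove(list_name)  # clean up by hand, since delete=False
```

With this shape, concat([fileName1, fileName2], fileName) would replace the four lines in the quoted code, and the list file is removed whether or not ffmpeg succeeds.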

ChrisA

Re: using ffmpeg command line with python's subprocess module


On 06/12/2013 14:52, iMath wrote:

On Friday, December 6, 2013 at 5:23:59 PM UTC+8, Mark Lawrence wrote:

On 06/12/2013 06:23, iMath wrote:



Dearest iMath, wouldst thou be kind enough to partake of obtaining some

type of email client that dost not sendeth double spaced data into this

most illustrious of mailing lists/newsgroups.  Thanking thee for thine

participation in my most humble of requests.  I do remain your most

obedient servant.



--

My fellow Pythonistas, ask not what our language can do for you, ask

what you can do for our language.



Mark Lawrence


yes ,I am a native Chinese speaker.I always post question by Google Group not 
through  email ,is there something wrong with it ?
your english is a little strange to me .



You can see the extra lines inserted by google groups above.  It's not 
too bad in one and only one message, but when a message has been 
backwards and forwards several times it's extremely irritating, or worse 
still effectively unreadable.  Workarounds have been posted on this 
list, but I'd recommend using any decent email client.


The English I used was archaic, please ignore it :)

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: using ffmpeg command line with python's subprocess module

On Friday, December 6, 2013 8:22:48 PM UTC+5:30, iMath wrote:
> On Friday, December 6, 2013 at 5:23:59 PM UTC+8, Mark Lawrence wrote:
> > On 06/12/2013 06:23, iMath wrote:
> > Dearest iMath, wouldst thou be kind enough to partake of obtaining some 
> > type of email client that dost not sendeth double spaced data into this 
> > most illustrious of mailing lists/newsgroups.  Thanking thee for thine 
> > participation in my most humble of requests.  I do remain your most 
> > obedient servant.

> yes ,I am a native Chinese speaker.I always post question by Google Group not 
> through  email ,is there something wrong with it ?

Yes, but it's easily correctable.

I recently answered this question to another poster here

https://groups.google.com/forum/#!searchin/comp.lang.python/rusi$20google$20groups|sort:date/comp.lang.python/C51hEvi-KbY/KSeaMFoHtcIJ

Re: using ffmpeg command line with python's subprocess module

On Friday, December 6, 2013 8:42:02 PM UTC+5:30, Mark Lawrence wrote:
> The English I used was archaic, please ignore it :)

"Archaic" is almost archaic
"Old" is ever-young

:D

Re: using ffmpeg command line with python's subprocess module

On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote:

> yes ,I am a native Chinese speaker.I always post question by Google
> Group not through  email ,is there something wrong with it ? your
> english is a little strange to me .

Mark is writing in fake old-English style, the way people think English 
was spoken a thousand years ago. I don't know why he did that. Perhaps he 
thought it was amusing.

There are many problems with Google Groups. If you pay attention to this 
forum, you will see dozens of posts about "Managing Google Groups 
headaches" and other complaints:

- Google Groups double-spaces replies, so text which should appear like:

line one
line two
line three
line four

  turns into:

line one
blank line
line two
blank line
line three
blank line
line four

- Google Groups often starts sending HTML code instead of plain text

- it often mangles indentation, which is terrible for Python code

- sometimes it automatically sets the reply address for posts to go
  to Google Groups, instead of the mailing list it should go to

- almost all of the spam on this forum comes from Google Groups, so many 
  people automatically filter everything from Google Groups straight to
  the trash.

There are alternatives to Google Groups:

- the mailing list, python-list@python.org

- Usenet, comp.lang.python

- the Gmane mirror:

  http://gmane.org/find.php?list=python-list%40python.org

and possibly others. You will maximise the number of people reading your 
posts if you avoid Google Groups. If for some reason you cannot use any 
of the alternatives, please take the time to fix some of the problems 
with Google Groups. If you search the archives, you should find some 
posts by Rusi defending Google Groups and explaining what he does to make 
it more presentable, and (if I remember correctly) I think Mark also 
sometimes posts a link to managing Google Groups.

-- 
Steven

Re: Packaging a proprietary Python library for multiple OSs

2013-12-06 Thread Kevin Walzer


On 12/5/13, 10:50 AM, Michael Herrmann wrote:

On Thursday, December 5, 2013 4:26:40 PM UTC+1, Kevin Walzer wrote:

On 12/5/13, 5:14 AM, Michael Herrmann wrote:
If your library and their dependencies are simply .pyc files, then I
don't see why a zip collated via py2exe wouldn't work on other
platforms. Obviously this point is moot if your library includes true
compiled (C-based) extensions.


As I said, I need to make my *build* platform-independent.


Giving this further thought, I'm wondering how hard it would be to roll 
your own using modulefinder, Python's zip tools, and some custom code. 
Just sayin'.


--Kevin


--
Kevin Walzer
Code by Kevin/Mobile Code by Kevin
http://www.codebykevin.com
http://www.wtmobilesoftware.com

Re: using ffmpeg command line with python's subprocess module


On 06/12/2013 15:34, Steven D'Aprano wrote:
(if I remember correctly) I think Mark also

sometimes posts a link to managing Google Groups.



You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: using ffmpeg command line with python's subprocess module

On Friday, December 6, 2013 9:23:47 PM UTC+5:30, Mark Lawrence wrote:
> On 06/12/2013 15:34, Steven D'Aprano wrote:
> 
> (if I remember correctly) I think Mark also
> 
> > sometimes posts a link to managing Google Groups.
> 
> >
> 
> You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython

That link needs updating.

Even if my almost-automatic correction methods are not considered
kosher for some reason or other, the thing that needs to go in there
is that GG has TWO problems

1. Blank lines
2. Long lines

That link only describes 1.

Roy's post yesterday in "Packaging a proprietary python library"
says:

> I, and Rusi, know enough, and take the effort, to overcome its
> shortcomings doesn't change that.

But in fact his post takes care of 1 not 2.

In all fairness I did not know that 2 is a problem until rurpy pointed
it out recently and was not correcting it. In fact, I'd take the
trouble to make the lines long assuming that clients were intelligent
enough to fit it properly into whatever was the current window!!!

So someone please update that page!

Re: using ffmpeg command line with python's subprocess module


On 06/12/2013 16:19, rusi wrote:

On Friday, December 6, 2013 9:23:47 PM UTC+5:30, Mark Lawrence wrote:

On 06/12/2013 15:34, Steven D'Aprano wrote:

(if I remember correctly) I think Mark also


sometimes posts a link to managing Google Groups.






You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython


That link needs updating.

Even if my almost-automatic correction methods are not considered
kosher for some reason or other, the thing that needs to go in there
is that GG has TWO problems

1. Blank lines
2. Long lines

That link only describes 1.

Roy's yesterday's post in "Packaging a proprietary python library"
says:


I, and Rusi, know enough, and take the effort, to overcome its
shortcomings doesn't change that.


But in fact his post takes care of 1 not 2.

In all fairness I did not know that 2 is a problem until rurpy pointed
it out recently and was not correcting it. In fact, I'd take the
trouble to make the lines long assuming that clients were intelligent
enough to fit it properly into whatever was the current window!!!

So someone please update that page!



This is a community so why don't you?

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: squeeze out some performance

2013-12-06 Thread Robert Voigtländer

Thanks for your replies.

I already did some basic profiling and optimized a lot. Especially with help of 
a goof python performance tips list I found.

I think I'll follow the cython path.
The geometry approach also sounds good, but it's way above my math/geometry 
knowledge.

Thanks for your input!

Re: squeeze out some performance


On 06/12/2013 16:29, Robert Voigtländer wrote:

Thanks for your replies.

I already did some basic profiling and optimized a lot. Especially  > with help 
of a goof python performance tips list I found.



Wonderful typo -^ :)


I think I'll follow the cython path.
The geometry approach also sound good. But it's way above my math/geometry 
knowledge.

Thanks for your input!




--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread MRAB


On 06/12/2013 15:34, Steven D'Aprano wrote:

On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote:


yes ,I am a native Chinese speaker.I always post question by Google
Group not through  email ,is there something wrong with it ? your
english is a little strange to me .


Mark is writing in fake old-English style, the way people think English
was spoken a thousand years ago. I don't know why he did that. Perhaps he
thought it was amusing.


[snip]
You're exaggerating. It's more like 500 years ago. :-)


Re: squeeze out some performance

2013-12-06 Thread Robert Voigtländer

Am Freitag, 6. Dezember 2013 17:36:03 UTC+1 schrieb Mark Lawrence:

> > I already did some basic profiling and optimized a lot. Especially  > with 
> > help of a goof python performance tips list I found.
> 
> Wonderful typo -^ :)
> 

Oh well :-) ... it was a good one. Just had a quick look at Cython. Looks 
great. Thanks for the tip.

Re: using ffmpeg command line with python's subprocess module

On Friday, December 6, 2013 9:55:54 PM UTC+5:30, Mark Lawrence wrote:
> On 06/12/2013 16:19, rusi wrote:

> > So someone please update that page!

> This is a community so why don't you?

Ok done (at least a first draft)
I was under the impression that not just anyone could edit it.

Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?

2013-12-06 Thread Piotr Dobrogost

On Friday, December 6, 2013 3:07:51 PM UTC+1, Neil Cerutti wrote:
> On 2013-12-04, Piotr Dobrogost
> 
>  wrote:
> 
> > On Wednesday, December 4, 2013 10:41:49 PM UTC+1, Neil Cerutti
> > wrote:
> 
> >> not something to do commonly. Your proposed syntax leaves the
> >> distinction between valid and invalid identifiers a problem
> >> the programmer has to deal with. It doesn't unify access to
> 
> >> attributes the way the getattr and setattr do.
> 
> >
> 
> > Taking into account that obj.'x' would be equivalent to obj.x
> > any attribute can be accessed with the new syntax. I don't see
> > how this is not unified access compared to using getattr
> > instead dot...
> 
> I thought of that argument later the next day. Your proposal does
> unify access if the old obj.x syntax is removed.

Given that obj.x is a very concise way to get the attribute named 'x' from 
object obj, it is somewhat odd that the identifier x is treated not as an 
identifier but as the string literal 'x'. If it were treated as an identifier, 
we would get the attribute whose name is the value of x instead of the 
attribute named 'x'. Allowing string literals in the proposed form obj.'x' 
would then make getattr basically unnecessary, as long as only a variable, not 
an expression, is used to denote the attribute's name.
This is just a casual remark.

Regards,
Piotr

Re: squeeze out some performance

2013-12-06 Thread John Ladasky

On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:

> I try to squeeze out some performance of the code pasted on the link below.
> http://pastebin.com/gMnqprST

Several comments:

1) I find this program to be very difficult to read, largely because there's a 
whole LOT of duplicated code.  Look at lines 53-80, and lines 108-287, and 
lines 294-311.  It makes it harder to see what this algorithm actually does.  
Is there a way to refactor some of this code to use some shared function calls?

2) I looked up the "Bresenham algorithm", and found two references which may be 
relevant.  The original algorithm was one which computed good raster 
approximations to straight lines.  The second algorithm described may be more 
pertinent to you, because it draws arcs of circles.

http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
http://en.wikipedia.org/wiki/Midpoint_circle_algorithm

Both of these algorithms are old, from the 1960's, and can be implemented using 
very simple CPU register operations and minimal memory.  Both of the web pages 
I referenced have extensive example code and pseudocode, and discuss 
optimization.  If you need speed, is this really a job for Python?

3) I THINK that I see some code -- those duplicated parts -- which might 
benefit from the use of multiprocessing (assuming that you have a multi-core 
CPU).  But I would have to read more deeply to be sure.  I need to understand 
the algorithm more completely, and exactly how you have modified it for your 
needs.
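
As a rough illustration of point 2, the line algorithm can be written using 
only integer additions and comparisons. This is a generic textbook-style 
sketch, not the code from the pastebin link; the function name bresenham_line 
is an assumption made for the example.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the grid points on a line from (x0, y0) to (x1, y1).

    Integer-only variant that handles all directions by tracking a
    single error term.
    """
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    err = dx + dy  # combined error term for both axes
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * err
        if doubled >= dy:  # error allows a step along x
            err += dy
            x0 += step_x
        if doubled <= dx:  # error allows a step along y
            err += dx
            y0 += step_y
    return points
```

A loop like this is also the kind of code that tends to benefit from Cython, 
since it is all integer arithmetic in a tight loop.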

Re: using ffmpeg command line with python's subprocess module

On Friday, December 6, 2013 10:11:04 PM UTC+5:30, MRAB wrote:
> On 06/12/2013 15:34, Steven D'Aprano wrote:
> > On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote:
> >> yes ,I am a native Chinese speaker.I always post question by Google
> >> Group not through  email ,is there something wrong with it ? your
> >> english is a little strange to me .

> > Mark is writing in fake old-English style, the way people think English
> > was spoken a thousand years ago. I don't know why he did that. Perhaps he
> > thought it was amusing.
> [snip]

> You're exaggerating. It's more like 500 years ago. :-)

I was going to say the same until I noticed the "the way people think English
was spoken..."

That makes it unarguable -- surely there are some people who (wrongly) think so?

interactive help on the base object


Is it just me, or is this basically useless?

>>> help(object)
Help on class object in module builtins:

class object
 |  The most base type

>>>

Surely a few more words, or a pointer to this 
http://docs.python.org/3/library/functions.html#object, would be better?


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Christian Gollwitzer


Am 06.12.13 14:12, schrieb Jean Dubois:

It works but it's not all clear to me. Can you tell me what "label.bind("<1>", quit)" 
is standing for? What's the <1> meaning?


"bind" connects events sent to the label with a handler. The <1> is the 
event description; in this case, it means a click with the left mouse 
button. The mouse buttons are numbered 1,2,3 for left,middle,right, 
respectively (with right and middle switched on OSX, confusingly). It is 
actually short for

    <ButtonPress-1>

Binding to the key "1" would look like this

    <Key-1>

The event syntax is rather complex, for example it is possible to add 
modifiers to bind to a Shift-key + right click like this

    <Shift-ButtonPress-3>



It is described in detail at the bind man page of Tk.

http://www.tcl.tk/man/tcl8.6/TkCmd/bind.htm

The event object passed to the handler contains additional information, 
for instance the position of the mouse pointer on the screen.
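
A small sketch of a handler that reads the event object, under some 
assumptions: it uses the Python 3 module name tkinter (the program in this 
thread uses the Python 2 name Tkinter), and the names describe_click and main 
are invented for the example.

```python
def describe_click(event):
    """Summarise a Tk mouse event: button number and screen position."""
    return "button {} at screen position ({}, {})".format(
        event.num, event.x_root, event.y_root
    )


def main():
    # Imported here so the helper above can be used without a display.
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, text="Left click to report, Shift + right click to quit")
    label.pack()
    # <ButtonPress-1> is the long form of <1>: a left mouse button press.
    label.bind("<ButtonPress-1>", lambda event: print(describe_click(event)))
    # A modifier in front of the button number narrows the binding.
    label.bind("<Shift-ButtonPress-3>", lambda event: root.destroy())
    root.mainloop()
```

Calling main() opens the window; event.x_root and event.y_root are the pointer 
coordinates relative to the screen, while event.x and event.y would be 
relative to the widget.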


In practice, for large parts of the interface you do not mess with the 
keyboard and mouse events directly, but use the corresponding widgets.
In your program, the label works as a simple pushbutton, and therefore a 
button should be used.


#!/usr/bin/env python
import Tkinter as tk
import ttk # for modern widgets
import sys

# no underscore - nothing gets passed
def quit():
    sys.exit()

root = tk.Tk()
button = ttk.Button(root, text="Click mouse here to quit", command=quit)
button.pack()
root.mainloop()


Note that

1) nothing gets passed, so we could have left out changing quit(). This 
is because a button command usually does not care about details of the 
mouse click. It just reacts as the user expects.


2) I use ttk widgets, which provide native look&feel. If possible, use 
those. Good examples on ttk usage are shown at 
http://www.tkdocs.com/tutorial/index.html


HTH,
Christian

Does Python optimize low-power functions?

2013-12-06 Thread John Ladasky

The following two functions return the same result:

x**2
x*x

But they may be computed in different ways.  The first choice can accommodate 
non-integer powers and so it would logically proceed by taking a logarithm, 
multiplying by the power (in this case, 2), and then taking the anti-logarithm. 
 But for a trivial value for the power like 2, this is clearly a wasteful 
choice.  Just multiply x by itself, and skip the expensive log and anti-log 
steps.

My question is, what do Python interpreters do with power operators where the 
power is a small constant, like 2?  Do they know to take the shortcut?

Re: Does Python optimize low-power functions?

2013-12-06 Thread Jean-Michel Pichavant

- Original Message -
> The following two functions return the same result:
> 
> x**2
> x*x
> 
> But they may be computed in different ways.  The first choice can
> accommodate non-integer powers and so it would logically proceed by
> taking a logarithm, multiplying by the power (in this case, 2), and
> then taking the anti-logarithm.  But for a trivial value for the
> power like 2, this is clearly a wasteful choice.  Just multiply x by
> itself, and skip the expensive log and anti-log steps.
> 
> My question is, what do Python interpreters do with power operators
> where the power is a small constant, like 2?  Do they know to take
> the shortcut?

It is probably specific to the interpreter implementation (CPython, Jython, 
IronPython, etc.). You'd better optimize it yourself if you really care about 
this.
An alternative is to use numpy functions such as numpy.power; numpy provides 
optimized versions of most mathematical functions.

JM



Re: Does Python optimize low-power functions?

2013-12-06 Thread Neil Cerutti

On 2013-12-06, John Ladasky  wrote:
> The following two functions return the same result:
>
> x**2
> x*x
>
> But they may be computed in different ways.  The first choice
> can accommodate non-integer powers and so it would logically
> proceed by taking a logarithm, multiplying by the power (in
> this case, 2), and then taking the anti-logarithm.  But for a
> trivial value for the power like 2, this is clearly a wasteful
> choice.  Just multiply x by itself, and skip the expensive log
> and anti-log steps.
> 
> My question is, what do Python interpreters do with power
> operators where the power is a small constant, like 2?  Do they
> know to take the shortcut?

It uses a couple of fast algorithms for computing powers. Here's
the excerpt with the comments identifying the algorithms used.
From longobject.c:

if (Py_SIZE(b) <= FIVEARY_CUTOFF) {
    /* Left-to-right binary exponentiation (HAC Algorithm 14.79) */
    /* http://www.cacr.math.uwaterloo.ca/hac/about/chap14.pdf */
...
else {
    /* Left-to-right 5-ary exponentiation (HAC Algorithm 14.82) */

The only outright optimization of the style I think you're
describing that I can see is that it quickly returns zero when the
modulus is one.
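
To give a feel for the first of those algorithms, here is a plain-Python
sketch of left-to-right binary exponentiation. It only illustrates the
general technique; it is not a translation of the C code in longobject.c, and
the function name is made up.

```python
def pow_left_to_right(base, exponent):
    """Compute base**exponent for a non-negative integer exponent.

    Scans the exponent's bits from most to least significant: square
    for every bit, and multiply by the base when the bit is 1.
    """
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    for bit in bin(exponent)[2:]:  # bin(13) == '0b1101' -> '1101'
        result *= result
        if bit == "1":
            result *= base
    return result
```

The number of multiplications grows with the number of bits in the exponent
rather than with the exponent itself.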

I'm not a skilled or experienced CPython source reader, though.

-- 
Neil Cerutti


ASCII and Unicode [was Re: Managing Google Groups headaches]

On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote:

> Evidently (and completely inadvertently) this exchange has just
> illustrated one of the inadmissable assumptions:
> 
> "unicode as a medium is universal in the same way that ASCII used to be"

Ironically, your post was not Unicode.

Seriously. I am 100% serious.

Your post was sent using a legacy encoding, Windows-1252, also known as 
CP-1252, which is most certainly *not* Unicode. Whatever software you 
used to send the message correctly flagged it with a charset header:

Content-Type: text/plain; charset=windows-1252

Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle 
encodings correctly (or at all!), it screws up the encoding then sends a 
reply with no charset line at all. This is one bug that cannot be blamed 
on Google Groups -- or on Unicode.

> I wrote a number of ellipsis characters ie codepoint 2026 as in:

Actually you didn't. You wrote a number of ellipsis characters, hex byte 
\x85 (decimal 133), in the CP1252 charset. That happens to be mapped to 
code point U+2026 in Unicode, but the two are as distinct as ASCII and 
EBCDIC.
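
The difference between the byte and the code point can be checked from 
Python. This is a small sketch using only the standard codecs; the variable 
names are arbitrary, and the UTF-8 misreading at the end is just one plausible 
way such a byte can end up as a replacement character.

```python
raw = b"\x85"  # the single byte that CP1252 uses for the ellipsis

# Decoded with the declared charset, the byte maps to code point U+2026.
as_cp1252 = raw.decode("cp1252")

# The same character encoded as UTF-8 is three quite different bytes.
utf8_bytes = as_cp1252.encode("utf-8")

# Misreading the original byte as UTF-8 cannot succeed; with
# errors="replace" the decoder substitutes U+FFFD instead.
as_utf8_guess = raw.decode("utf-8", errors="replace")

print(hex(ord(as_cp1252)), utf8_bytes, hex(ord(as_utf8_guess)))
```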

> Somewhere between my sending and your quoting those ellipses became the
> replacement character FFFD

Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about 
encodings and character sets. It doesn't just assume things are ASCII, 
but makes a half-hearted attempt to be charset-aware, but badly. I can 
only imagine that it was written back in the Dark Ages where there were a 
lot of different charsets in use but no conventions for specifying which 
charset was in use. Or perhaps the author was smoking crack while coding.

> Leaving aside whose fault this is (very likely buggy google groups),
> this mojibaking cannot happen if the assumption "All text is ASCII" were
> to uniformly hold.

This is incorrect. People forget that ASCII has evolved since the first 
version of the standard in 1963. There have actually been five versions 
of the ASCII standard, plus one unpublished version. (And that's not 
including the things which are frequently called ASCII but aren't.)

ASCII-1963 didn't even include lowercase letters. It is also missing some 
graphic characters like braces, and included at least two characters no 
longer used, the up-arrow and left-arrow. The control characters were 
also significantly different from today.

ASCII-1965 was unpublished and unused. I don't know the details of what 
it changed.

ASCII-1967 is a lot closer to the ASCII in use today. It made 
considerable changes to the control characters, moving, adding, removing, 
or renaming at least half a dozen control characters. It officially added 
lowercase letters, braces, and some others. It replaced the up-arrow 
character with the caret and the left-arrow with the underscore. It was 
ambiguous, allowing variations and substitutions, e.g.:

- character 33 was permitted to be either the exclamation 
  mark ! or the logical OR symbol |

- consequently character 124 (vertical bar) was always 
  displayed as a broken bar ¦, which explains why even today
  many keyboards show it that way

- character 35 was permitted to be either the number sign # or 
  the pound sign £

- character 94 could be either a caret ^ or a logical NOT ¬

Even the humble comma could be pressed into service as a cedilla.

ASCII-1968 didn't change any characters, but allowed the use of LF on its 
own. Previously, you had to use either LF/CR or CR/LF as newline.

ASCII-1977 removed the ambiguities from the 1967 standard.

The most recent version is ASCII-1986 (also known as ANSI X3.4-1986). 
Unfortunately I haven't been able to find out what changes were made -- I 
presume they were minor, and didn't affect the character set.

So as you can see, even with actual ASCII, you can have mojibake. It's 
just not normally called that. But if you are given an arbitrary ASCII 
file of unknown age, containing code 94, how can you be sure it was 
intended as a caret rather than a logical NOT symbol? You can't.

Then there are at least 30 official variations of ASCII, strictly 
speaking part of ISO-646. These 7-bit codes were commonly called "ASCII" 
by their users, despite the differences, e.g. replacing the dollar sign $ 
with the international currency sign ¤, or replacing the left brace 
{ with the letter s with caron š.

One consequence of this is that the MIME type for ASCII text is called 
"US ASCII", despite the redundancy, because many people expect "ASCII" 
alone to mean whatever national variation they are used to.

But it gets worse: there are proprietary variations on ASCII which are 
commonly called "ASCII" but aren't, including dozens of 8-bit so-called 
"extended ASCII" character sets, which is where the problems *really* 
pile up. Invariably back in the 1980s and early 1990s people used to call 
these "ASCII" no matter that they used 8-bits and contained anything up 
to 256 characters.

Just because somebody

Re: Does Python optimize low-power functions?

2013-12-06 Thread Robert Kern


On 2013-12-06 19:01, Neil Cerutti wrote:

On 2013-12-06, John Ladasky  wrote:

The following two functions return the same result:

 x**2
 x*x

But they may be computed in different ways.  The first choice
can accommodate non-integer powers and so it would logically
proceed by taking a logarithm, multiplying by the power (in
this case, 2), and then taking the anti-logarithm.  But for a
trivial value for the power like 2, this is clearly a wasteful
choice.  Just multiply x by itself, and skip the expensive log
and anti-log steps.

My question is, what do Python interpreters do with power
operators where the power is a small constant, like 2?  Do they
know to take the shortcut?


It uses a couple of fast algorithms for computing powers. Here's
the excerpt with the comments identifying the algorithms used.
 From longobject.c:

if (Py_SIZE(b) <= FIVEARY_CUTOFF) {
    /* Left-to-right binary exponentiation (HAC Algorithm 14.79) */
    /* http://www.cacr.math.uwaterloo.ca/hac/about/chap14.pdf */
...
else {
    /* Left-to-right 5-ary exponentiation (HAC Algorithm 14.82) */


It's worth noting that the *interpreter* per se is not doing this. The 
implementation of the `long` object does this in its implementation of the 
`__pow__` method, which the interpreter invokes. Other objects may implement 
this differently and use whatever optimizations they like. They may even (ab)use 
the syntax for things other than numerical exponentiation where `x**2` is not 
equivalent to `x*x`. Since objects are free to do so, the interpreter itself 
cannot choose to optimize that exponentiation down to multiplication.
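
A toy example of that point, assuming nothing beyond the standard data model: 
the class below (Repeater, a made-up name) defines `__pow__` but not 
`__mul__`, so `x**2` works while `x*x` is not even defined.

```python
class Repeater:
    """Toy class that (ab)uses ** to mean string repetition."""

    def __init__(self, text):
        self.text = text

    def __pow__(self, count):
        # Invoked by the interpreter for `self ** count`.
        return Repeater(self.text * count)


squared = Repeater("ab") ** 2

try:
    Repeater("ab") * Repeater("ab")  # no __mul__ defined
    mul_supported = True
except TypeError:
    mul_supported = False

print(squared.text, mul_supported)
```

Rewriting `x**2` as `x*x` at compile time would break a class like this, which 
is why the choice has to be left to the object.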


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


RE: Does Python optimize low-power functions?

2013-12-06 Thread Nick Cash

>My question is, what do Python interpreters do with power operators where the 
>power is a small constant, like 2?  Do they know to take the shortcut?

Nope:

Python 3.3.0 (default, Sep 25 2013, 19:28:08) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis(lambda x: x*x)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE
>>> dis.dis(lambda x: x**2)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (2)
              6 BINARY_POWER
              7 RETURN_VALUE


The reasons why have already been answered, I just wanted to point out that 
Python makes it extremely easy to check these sorts of things for yourself.
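
The same check can be scripted. Opcode names and offsets differ between 
Python versions, so this sketch (helper names are invented for the example) 
only compares the two instruction lists and looks for a power operation 
instead of expecting the exact listing shown above.

```python
import dis


def instruction_summary(func):
    """Return (opname, argrepr) pairs for a function's bytecode."""
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(func)]


def uses_power(ops):
    """True if any instruction is a power operation.

    Older versions have a dedicated BINARY_POWER opcode; newer ones use
    a generic binary-op instruction whose argument is shown as '**'.
    """
    return any("POWER" in name or arg == "**" for name, arg in ops)


mul_ops = instruction_summary(lambda x: x * x)
pow_ops = instruction_summary(lambda x: x ** 2)

print(uses_power(mul_ops), uses_power(pow_ops))
```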

Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Gene Heskett

On Friday 06 December 2013 14:30:06 Steven D'Aprano did opine:

> On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote:
> > Evidently (and completely inadvertently) this exchange has just
> > illustrated one of the inadmissable assumptions:
> > 
> > "unicode as a medium is universal in the same way that ASCII used to
> > be"
> 
> Ironically, your post was not Unicode.
> 
> Seriously. I am 100% serious.
> 
> Your post was sent using a legacy encoding, Windows-1252, also known as
> CP-1252, which is most certainly *not* Unicode. Whatever software you
> used to send the message correctly flagged it with a charset header:
> 
> Content-Type: text/plain; charset=windows-1252
> 
> Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle
> encodings correctly (or at all!), it screws up the encoding then sends a
> reply with no charset line at all. This is one bug that cannot be blamed
> on Google Groups -- or on Unicode.
> 
> > I wrote a number of ellipsis characters ie codepoint 2026 as in:
> Actually you didn't. You wrote a number of ellipsis characters, hex byte
> \x85 (decimal 133), in the CP1252 charset. That happens to be mapped to
> code point U+2026 in Unicode, but the two are as distinct as ASCII and
> EBCDIC.
> 
> > Somewhere between my sending and your quoting those ellipses became
> > the replacement character FFFD
> 
> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about
> encodings and character sets. It doesn't just assume things are ASCII,
> but makes a half-hearted attempt to be charset-aware, but badly. I can
> only imagine that it was written back in the Dark Ages where there were
> a lot of different charsets in use but no conventions for specifying
> which charset was in use. Or perhaps the author was smoking crack while
> coding.
> 
> > Leaving aside whose fault this is (very likely buggy google groups),
> > this mojibaking cannot happen if the assumption "All text is ASCII"
> > were to uniformly hold.
> 
> This is incorrect. People forget that ASCII has evolved since the first
> version of the standard in 1963. There have actually been five versions
> of the ASCII standard, plus one unpublished version. (And that's not
> including the things which are frequently called ASCII but aren't.)
> 
> ASCII-1963 didn't even include lowercase letters. It is also missing
> some graphic characters like braces, and included at least two
> characters no longer used, the up-arrow and left-arrow. The control
> characters were also significantly different from today.
> 
> ASCII-1965 was unpublished and unused. I don't know the details of what
> it changed.
> 
> ASCII-1967 is a lot closer to the ASCII in use today. It made
> considerable changes to the control characters, moving, adding,
> removing, or renaming at least half a dozen control characters. It
> officially added lowercase letters, braces, and some others. It
> replaced the up-arrow character with the caret and the left-arrow with
> the underscore. It was ambiguous, allowing variations and
> substitutions, e.g.:
> 
> - character 33 was permitted to be either the exclamation
>   mark ! or the logical OR symbol |
> 
> - consequently character 124 (vertical bar) was always
>   displayed as a broken bar ¦, which explains why even today
>   many keyboards show it that way
> 
> - character 35 was permitted to be either the number sign # or
>   the pound sign £
> 
> - character 94 could be either a caret ^ or a logical NOT ¬
> 
> Even the humble comma could be pressed into service as a cedilla.
> 
> ASCII-1968 didn't change any characters, but allowed the use of LF on
> its own. Previously, you had to use either LF/CR or CR/LF as newline.
> 
> ASCII-1977 removed the ambiguities from the 1967 standard.
> 
> The most recent version is ASCII-1986 (also known as ANSI X3.4-1986).
> Unfortunately I haven't been able to find out what changes were made --
> I presume they were minor, and didn't affect the character set.
> 
> So as you can see, even with actual ASCII, you can have mojibake. It's
> just not normally called that. But if you are given an arbitrary ASCII
> file of unknown age, containing code 94, how can you be sure it was
> intended as a caret rather than a logical NOT symbol? You can't.
> 
> Then there are at least 30 official variations of ASCII, strictly
> speaking part of ISO-646. These 7-bit codes were commonly called "ASCII"
> by their users, despite the differences, e.g. replacing the dollar sign
> $ with the international currency sign ¤, or replacing the left brace
> { with the letter s with caron š.
> 
> One consequence of this is that the MIME type for ASCII text is called
> "US ASCII", despite the redundancy, because many people expect "ASCII"
> alone to mean whatever national variation they are used to.
> 
> But it gets worse: there are proprietary variations on ASCII which are
> commonly called "ASCII" but aren't, including dozens of 8-bit so-called
> "extended ASCII" character

Re: Does Python optimize low-power functions?

2013-12-06 Thread John Ladasky

On Friday, December 6, 2013 11:32:00 AM UTC-8, Nick Cash wrote:

> The reasons why have already been answered, I just wanted to point out that 
> Python makes it extremely easy to check these sorts of things for yourself.

Thanks for the heads-up on the dis module, Nick.  I haven't played with that 
one yet.
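
For anyone else who hasn't tried it, here is a small sketch of the kind of check the dis module allows. The two function names are hypothetical, and the opcode names printed differ between CPython versions, so no particular output is assumed:

```python
import dis

def square_pow(x):
    return x ** 2

def square_mul(x):
    return x * x

# Collect the opcode names for each function; the exact names
# depend on the CPython version in use.
ops_pow = [ins.opname for ins in dis.get_instructions(square_pow)]
ops_mul = [ins.opname for ins in dis.get_instructions(square_mul)]

print(ops_pow)
print(ops_mul)
```

Comparing the two listings shows whether the compiler treats the two spellings differently. What happens inside the power operation itself is decided at run time by the type of x.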
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Does Python optimize low-power functions?

2013-12-06 Thread Oscar Benjamin

On 6 December 2013 18:16, John Ladasky  wrote:
> The following two functions return the same result:
>
> x**2
> x*x
>
> But they may be computed in different ways.  The first choice can accommodate 
> non-integer powers and so it would logically proceed by taking a logarithm, 
> multiplying by the power (in this case, 2), and then taking the 
> anti-logarithm.  But for a trivial value for the power like 2, this is 
> clearly a wasteful choice.  Just multiply x by itself, and skip the expensive 
> log and anti-log steps.
>
> My question is, what do Python interpreters do with power operators where the 
> power is a small constant, like 2?  Do they know to take the shortcut?

As mentioned this will depend on the interpreter and on the type of x.
Python's integer arithmetic is exact and unbounded so switching to
floating point and using approximate logarithms is a no go if x is an
int object.
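
A quick illustration of why a floating-point shortcut is not acceptable for ints (a sketch only, with a value chosen so that the difference is visible):

```python
x = 10**20 + 1

exact = x ** 2            # integer power: exact, arbitrary precision
approx = float(x) ** 2    # float power: rounded to a C double

print(exact == x * x)
print(exact == approx)
print(exact, approx)
```

The integer result keeps every digit, while the float result can only hold about 16 significant decimal digits.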

For CPython specifically, you can see here:
http://hg.python.org/cpython/file/07ef52e751f3/Objects/floatobject.c#l741
that for floats x**2 will be equivalent to x**2.0 and will be handled
by the pow function from the underlying C math library. If you read
the comments around that line you'll see that different inconsistent
math libraries can do things very differently leading to all kinds of
different problems.

For CPython if x is an int (long) then as mentioned before it is
handled by the HAC algorithm:
http://hg.python.org/cpython/file/07ef52e751f3/Objects/longobject.c#l3934

For CPython if x is a complex then it is handled roughly as you say:
for x**n, if n is an integer between -100 and 100, then multiplication is performed
using the "bit-mask exponentiation" algorithm. Otherwise it is
computed by converting to polar exponential form and using logs (see
also the two functions above this one):
http://hg.python.org/cpython/file/07ef52e751f3/Objects/complexobject.c#l151
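
To give an idea of what exponentiation by a bit mask looks like, here is a rough Python sketch of the general square-and-multiply technique. It is not a transcription of the CPython C code, and the function name is made up:

```python
def pow_by_squaring(x, n):
    """Compute x**n for a non-negative integer n by repeated squaring."""
    if n < 0:
        raise ValueError("n must be non-negative in this sketch")
    result = 1
    base = x
    mask = 1
    while mask <= n:
        if n & mask:          # this bit of the exponent is set
            result = result * base
        base = base * base    # base now holds x**(2*mask)
        mask <<= 1
    return result

print(pow_by_squaring(3, 13) == 3 ** 13)
print(pow_by_squaring(1j, 2))
```

The loop runs once per bit of n, so a small exponent such as 2 costs only a couple of multiplications and no logarithms.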

Oscar
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Roy Smith

Steven D'Aprano writes:

> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about 
> encodings and character sets. It doesn't just assume things are ASCII, 
> but makes a half-hearted attempt to be charset-aware, but badly. I can 
> only imagine that it was written back in the Dark Ages

Indeed.  The basic codebase probably goes back 20 years.  I'm posting this
from gmane, just so people don't think I'm a total luddite.

> When transmitting ASCII characters, the networking protocol could include 
> various start and stop bits and parity codes. A single 7-bit ASCII 
> character might be anything up to 12 bits in length on the wire.

Not to mention that some really old hardware used 1.5 stop bits!
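
As a rough illustration of the framing overhead, here is a sketch that assumes one start bit, seven data bits sent least significant bit first, an even parity bit, and two stop bits. These settings and the function name are my assumptions, not something from the quoted post, and 1.5 stop bits cannot be shown in a list of whole bits:

```python
def frame_bits(ch, stop_bits=2):
    """Return the bits sent for one 7-bit character: start, data, parity, stop."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit character")
    data = [(code >> i) & 1 for i in range(7)]   # least significant bit first
    parity = sum(data) % 2                        # even parity: data plus parity has an even number of 1s
    return [0] + data + [parity] + [1] * stop_bits

bits = frame_bits("A")
print(len(bits), bits)
```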


-- 
https://mail.python.org/mailman/listinfo/python-list

Python 2.8 release schedule

My apologies if you've seen this before but here is the official 
schedule http://www.python.org/dev/peps/pep-0404/


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list

Re: Embedding multiple interpreters


Garthy wrote:
To allow each script to run in its own environment, with minimal chance 
of inadvertent interaction between the environments, whilst allowing 
each script the ability to stall on conditions that will be later met by 
another thread supplying the information, and to fit in with existing 
infrastructure.


The last time I remember this being discussed was in the context
of allowing free threading. Multiple interpreters don't solve
that problem, because there's still only one GIL and some
objects are shared.

But if all you want is for each plugin to have its own version
of sys.modules, etc., and you're not concerned about malicious
code, then it may be good enough.

It seems to be good enough for mod_wsgi, because presumably
all the people with the ability to install code on a given
web server trust each other.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list

Re: squeeze out some performance

On Fri, Dec 6, 2013 at 11:52 AM, John Ladasky wrote:

> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:
>
> > I try to squeeze out some performance of the code pasted on the link
> below.
> > http://pastebin.com/gMnqprST
>

Not that this will speed up your code but you have this:

if not clockwise:
    s = start
    start = end
    end = s

Python people would write:

    end, start = start, end


You have quite a few if statements that involve multiple comparisons of the
same variable.  Did you know you can do things like this in python:

>>> x = 4
>>> 2 < x < 7
True
>>> x = 55
>>> 2 < x < 7
False


> Several comments:
>
> 1) I find this program to be very difficult to read, largely because
> there's a whole LOT of duplicated code.  Look at lines 53-80, and lines
> 108-287, and lines 294-311.  It makes it harder to see what this algorithm
> actually does.  Is there a way to refactor some of this code to use some
> shared function calls?
>
> 2) I looked up the "Bresenham algorithm", and found two references which
> may be relevant.  The original algorithm was one which computed good raster
> approximations to straight lines.  The second algorithm described may be
> more pertinent to you, because it draws arcs of circles.
>
> http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
> http://en.wikipedia.org/wiki/Midpoint_circle_algorithm
>
> Both of these algorithms are old, from the 1960's, and can be implemented
> using very simple CPU register operations and minimal memory.  Both of the
> web pages I referenced have extensive example code and pseudocode, and
> discuss optimization.  If you need speed, is this really a job for Python?
>
> 3) I THINK that I see some code -- those duplicated parts -- which might
> benefit from the use of multiprocessing (assuming that you have a
> multi-core CPU).  But I would have to read more deeply to be sure.  I need
> to understand the algorithm more completely, and exactly how you have
> modified it for your needs.



-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Embedding multiple interpreters


Garthy wrote:

The bare minimum would be 
protection against inadvertent interaction. Better yet would be a setup 
that made such interaction annoyingly difficult, and the ideal would be 
where it was impossible to interfere.


To give you an idea of the kind of interference that's
possible, consider:

1) You can find all the subclasses of a given class
object using its __subclasses__() method.

2) Every class ultimately derives from class object.

3) All built-in class objects are shared between
interpreters.

So, starting from object.__subclasses__(), code in any
interpreter could find any class defined by any other
interpreter and mutate it.

This is not something that is likely to happen by
accident. Whether it's "annoyingly difficult" enough
is something you'll have to decide.
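
A sketch of steps 1 and 2, run inside a single interpreter. The class name PluginState is invented for the example, and the code does not itself demonstrate sharing between sub-interpreters; it only shows how a class can be found and changed without importing it:

```python
class PluginState(object):
    """Stand-in for a class defined by some other script."""
    limit = 10

# Find the class purely by walking from object, without referring to it by name binding.
matches = [cls for cls in object.__subclasses__()
           if cls.__name__ == "PluginState"]

for cls in matches:
    cls.limit = 0   # mutate the class that was found

print(PluginState.limit)
```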

Also keep in mind that it's fairly easy for Python
code to chew up large amounts of memory and/or CPU
time in an uninterruptible way, e.g. by
evaluating 5**1. So even a thread that's
keeping its hands entirely to itself can still
cause trouble.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list

Re: squeeze out some performance


On 06/12/2013 16:52, John Ladasky wrote:

On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:


I try to squeeze out some performance of the code pasted on the link below.
http://pastebin.com/gMnqprST


Several comments:

1) I find this program to be very difficult to read, largely because there's a 
whole LOT of duplicated code.  Look at lines 53-80, and lines 108-287, and 
lines 294-311.  It makes it harder to see what this algorithm actually does.  
Is there a way to refactor some of this code to use some shared function calls?



A handy tool for detecting duplicated code is available at 
http://clonedigger.sourceforge.net/ for anyone who's interested.


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list

Re: squeeze out some performance

2013-12-06 Thread Dan Stromberg

On Fri, Dec 6, 2013 at 2:38 PM, Mark Lawrence wrote:

> On 06/12/2013 16:52, John Ladasky wrote:
>
>> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:
>>
>>  I try to squeeze out some performance of the code pasted on the link
>>> below.
>>> http://pastebin.com/gMnqprST
>>>
>>
>> Several comments:
>>
>> 1) I find this program to be very difficult to read, largely because
>> there's a whole LOT of duplicated code.  Look at lines 53-80, and lines
>> 108-287, and lines 294-311.  It makes it harder to see what this algorithm
>> actually does.  Is there a way to refactor some of this code to use some
>> shared function calls?
>>
>>
> A handy tool for detecting duplicated code here
> http://clonedigger.sourceforge.net/ for anyone who's interested.
>

Pylint does this too...
-- 
https://mail.python.org/mailman/listinfo/python-list

Eliminate "extra" variable

2013-12-06 Thread Igor Korot

Hi, ALL,
I have the following code:

def MyFunc(self, originalData):
    data = {}
    dateStrs = []
    for i in xrange(0, len(originalData)):
        dateStr, freq, source = originalData[i]
        data[str(dateStr)] = {source: freq}
        dateStrs.append(dateStr)
    for i in xrange(0, len(dateStrs) - 1):
        currDateStr = str(dateStrs[i])
        nextDateStrs = str(dateStrs[i + 1])


It seems very strange that I need the dateStrs list just for the
purpose of looping thru the dictionary keys.
Can I get rid of the "dateStrs" variable?

Thank you.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Managing Google Groups headaches


rusi wrote:

On Friday, December 6, 2013 1:06:30 PM UTC+5:30, Roy Smith wrote:

Which means, if I wanted to (and many examples of this exist), I can 
write my own client which presents the same information in different 
ways.


Not sure what's your point.


The point is the existence of an alternative interface that's
designed for use by other programs rather than humans.

This is what web forums are missing. If it existed, one could
easily create an alternative client with a newsreader-like
interface. Without it, such a client would have to be a
monstrosity that worked by screen-scraping the html.

It's not about the format of the messages themselves -- that
could be text, or html, or reST, or bbcode or whatever. It's
about the *framing* of the messages, and being able to
query them by their metadata.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list

Re: Eliminate "extra" variable

2013-12-06 Thread Gary Herron


On 12/06/2013 11:37 AM, Igor Korot wrote:

Hi, ALL,
I have following code:

def MyFunc(self, originalData):
    data = {}
    dateStrs = []
    for i in xrange(0, len(originalData)):
        dateStr, freq, source = originalData[i]
        data[str(dateStr)]  = {source: freq}
        dateStrs.append(dateStr)
    for i in xrange(0, len(dateStrs) - 1):
        currDateStr = str(dateStrs[i])
        nextDateStrs = str(dateStrs[i + 1])


It seems very strange that I need the dateStrs list just for the
purpose of looping thru the dictionary keys.
Can I get rid of the "dateStrs" variable?

Thank you.


You want to build a list, but you don't want to give that list a name?  
Why not?  And how would you refer to that list in the second loop if it 
didn't have a name?


And concerning that second loop:  What are you trying to do there? It 
looks like a complete waste of time.  In fact, with what you've shown 
us, you can eliminate the variable dateStrs and both loops, and be no 
worse off.


Perhaps there is more to your code than you've shown to us ...

Gary Herron

--
https://mail.python.org/mailman/listinfo/python-list

Re: Eliminate "extra" variable

On Fri, Dec 6, 2013 at 2:37 PM, Igor Korot  wrote:

> Hi, ALL,
> I have following code:
>
> def MyFunc(self, originalData):
>     data = {}
>     dateStrs = []
>     for i in xrange(0, len(originalData)):
>         dateStr, freq, source = originalData[i]
>         data[str(dateStr)]  = {source: freq}
>
   # above line confuses me!


>         dateStrs.append(dateStr)
>     for i in xrange(0, len(dateStrs) - 1):
>         currDateStr = str(dateStrs[i])
>         nextDateStrs = str(dateStrs[i + 1])
>
>
Python lets you iterate over a list directly, so :

for d in originalData:
    dateStr, freq, source = d
    data[source] = freq

Your code looks like you come from a C background.  Python idioms are
different.

I'm not sure what you are trying to do in the second for loop, but I think
you are trying to iterate thru a dictionary in a certain order, and you
can't depend on the order
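
If the order of the keys does matter, one option is collections.OrderedDict, which remembers insertion order (plain dicts only guarantee this from Python 3.7 onwards). A minimal sketch with invented sample data and Python 3 syntax:

```python
from collections import OrderedDict

# Hypothetical sample data in (date string, frequency, source) form.
original_data = [
    ("2013-12-01", 3, "web"),
    ("2013-12-02", 5, "mail"),
    ("2013-12-03", 1, "web"),
]

data = OrderedDict()
for date_str, freq, source in original_data:
    data[date_str] = {source: freq}

# Keys come back in insertion order, so neighbours can be paired up.
keys = list(data)
pairs = list(zip(keys, keys[1:]))
print(pairs)
```

With this approach the keys of the dictionary can stand in for a separate list of dates, provided each date appears only once in the input.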

>
> It seems very strange that I need the dateStrs list just for the
> purpose of looping thru the dictionary keys.
> Can I get rid of the "dateStrs" variable?
>
> Thank you.



-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

On Sat, Dec 7, 2013 at 6:00 AM, Steven D'Aprano
 wrote:
> - character 33 was permitted to be either the exclamation
>   mark ! or the logical OR symbol |
>
> - consequently character 124 (vertical bar) was always
>   displayed as a broken bar ¦, which explains why even today
>   many keyboards show it that way
>
> - character 35 was permitted to be either the number sign # or
>   the pound sign £
>
> - character 94 could be either a caret ^ or a logical NOT ¬

Yeah, good fun stuff. I first met several of these ambiguities in the
OS/2 REXX documentation, which detailed the language's operators by
specifying their byte values as well as their characters - for
instance, this quote from the docs (yeah, I still have it all here):

"""
Note:   Depending upon your Personal System keyboard and the code page
you are using, you may not have the solid vertical bar to select. For
this reason, REXX also recognizes the use of the split vertical bar as
a logical OR symbol. Some keyboards may have both characters. If so,
they are not interchangeable; only the character that is equal to the
ASCII value of 124 works as the logical OR. This type of mismatch can
also cause the character on your screen to be different from the
character on your keyboard.
"""
(The front material on the docs says "(C) Copyright IBM Corp. 1987,
1994. All Rights Reserved.")

It says "ASCII value" where on this list we would be more likely to
call it "byte value", and I'd prefer to say "represented by" rather
than "equal to", but nonetheless, this is still clearly distinguishing
characters and bytes. The language spec is on characters, but
ultimately the interpreter is going to be looking at bytes, so when
there's a problem, it's byte 124 that's the one defined as logical OR.
Oh, and note the copyright date. The byte/char distinction isn't new.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

One liners

2013-12-06 Thread Dan Stromberg

Does anyone else feel like Python is being dragged too far in the direction
of long, complex, multiline one-liners?  Or avoiding temporary variables
with descriptive names?  Or using regex's for everything under the sun?

What happened to using classes?  What happened to the beautiful emphasis on
readability?  What happened to debuggability (which is always harder than
writing things in the first place)?  And what happened to string methods?

I'm pleased to see Python getting more popular, but it feels like a lot of
newcomers are trying their best to turn Python into Perl or something,
culturally speaking.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: using ffmpeg command line with python's subprocess module


rusi wrote:

On Friday, December 6, 2013 10:11:04 PM UTC+5:30, MRAB wrote:


You're exaggerating. It's more like 500 years ago. :-)


I was going to say the same until I noticed the "the way people think English
was spoken..."

That makes it unarguable -- surely there are some people who (wrongly) think so?


Probably. They're surprisingly far off, though. Here's
a sample of actual 1000-year-old English:

http://answers.yahoo.com/question/index?qid=20100314001840AAygUaq

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list

Re: One liners

2013-12-06 Thread Ned Batchelder


On 12/6/13 6:54 PM, Dan Stromberg wrote:


Does anyone else feel like Python is being dragged too far in the
direction of long, complex, multiline one-liners?  Or avoiding temporary
variables with descriptive names?  Or using regex's for everything under
the sun?

What happened to using classes?  What happened to the beautiful emphasis
on readability?  What happened to debuggability (which is always harder
than writing things in the first place)?  And what happened to string
methods?

I'm pleased to see Python getting more popular, but it feels like a lot
of newcomers are trying their best to turn Python into Perl or
something, culturally speaking.


I agree with you that those trends would be bad.  But I'm not sure how 
you are judging that "Python" is being dragged in that direction?  It's 
a huge community.  Sure some people are obsessed with fewer lines, and 
micro-optimizations, and other newb mistakes, but there are good people too!


--Ned, ever the optimist.


--
https://mail.python.org/mailman/listinfo/python-list

Re: One liners

2013-12-06 Thread Michael Torrie

On 12/06/2013 04:54 PM, Dan Stromberg wrote:
> Does anyone else feel like Python is being dragged too far in the direction
> of long, complex, multiline one-liners?  Or avoiding temporary variables
> with descriptive names?  Or using regex's for everything under the sun?
> 
> What happened to using classes?  What happened to the beautiful emphasis on
> readability?  What happened to debuggability (which is always harder than
> writing things in the first place)?  And what happened to string methods?
> 
> I'm pleased to see Python getting more popular, but it feels like a lot of
> newcomers are trying their best to turn Python into Perl or something,
> culturally speaking.

I have not seen any evidence that this trend of yours is widespread.
The Python code I come across seems pretty normal to me.  Expressive and
readable.  Haven't seen any attempt to turn Python into Perl or that
sort of thing.  And I don't see that culture expressed on the list.
Maybe I'm just blind...


-- 
https://mail.python.org/mailman/listinfo/python-list

Re: One liners

2013-12-06 Thread Dan Stromberg

On Fri, Dec 6, 2013 at 4:10 PM, Michael Torrie  wrote:

> On 12/06/2013 04:54 PM, Dan Stromberg wrote:
> > Does anyone else feel like Python is being dragged too far in the
> direction
> > of long, complex, multiline one-liners?  Or avoiding temporary variables
> > with descriptive names?  Or using regex's for everything under the sun?
> >
> > What happened to using classes?  What happened to the beautiful emphasis
> on
> > readability?  What happened to debuggability (which is always harder than
> > writing things in the first place)?  And what happened to string methods?
> >
> > I'm pleased to see Python getting more popular, but it feels like a lot
> of
> > newcomers are trying their best to turn Python into Perl or something,
> > culturally speaking.
>
> I have not seen any evidence that this trend of yours is widespread.
> The Python code I come across seems pretty normal to me.  Expressive and
> readable.  Haven't seen any attempt to turn Python into Perl or that
> sort of thing.  And I don't see that culture expressed on the list.
> Maybe I'm just blind...


I'm thinking mostly of stackoverflow, but here's an example I ran into (a
lot of) on a job:

somevar = some_complicated_thing(somevar) if
some_other_complicated_thing(somevar) else somevar

Would it really be so bad to just use an if statement?  Why are we
assigning somevar to itself?  This sort of thing was strewn across 3 or 4
physical lines at a time.
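
For comparison, here is the same logic written both ways. The two helper functions are hypothetical stand-ins for the "complicated things":

```python
def is_padded(text):
    return text != text.strip()

def strip_padding(text):
    return text.strip()

somevar = "  hello  "

# Conditional-expression form, as in the example above.
one_liner = strip_padding(somevar) if is_padded(somevar) else somevar

# Plain if statement: nothing is assigned when the test is false.
if is_padded(somevar):
    somevar = strip_padding(somevar)

print(one_liner == somevar)
```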
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Eliminate "extra" variable

2013-12-06 Thread Roy Smith

In article ,
 Joel Goldstick  wrote:

> Python lets you iterate over a list directly, so :
> 
> for d in originalData:
>     dateStr, freq, source = d
>     data[source] = freq

I would make it even simpler:

> for dateStr, freq, source in originalData:
>     data[source] = freq
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: One liners

2013-12-06 Thread Michael Torrie

On 12/06/2013 05:14 PM, Dan Stromberg wrote:
> I'm thinking mostly of stackoverflow, but here's an example I ran into (a
> lot of) on a job:
> 
> somevar = some_complicated_thing(somevar) if
> some_other_complicated_thing(somevar) else somevar
> 
> Would it really be so bad to just use an if statement?  Why are we
> assigning somevar to itself?  This sort of thing was strewn across 3 or 4
> physical lines at a time.

You're right that a conventional "if" block is not only more readable,
but also faster and more efficient code.  Sorry you have to deal with
code written like that!  That'd frustrate any sane programmer.  It might
bother me enough to write code to reformat the program to convert that
style to something sane!  There are times when the ternary (did I get
that right?) operator is useful and clear.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Embedding multiple interpreters

Hi Gregory,

On 07/12/13 08:53, Gregory Ewing wrote:
> Garthy wrote:
>> The bare minimum would be protection against inadvertent interaction.
>> Better yet would be a setup that made such interaction annoyingly
>> difficult, and the ideal would be where it was impossible to interfere.
>
> To give you an idea of the kind of interference that's
> possible, consider:
>
> 1) You can find all the subclasses of a given class
> object using its __subclasses__() method.
>
> 2) Every class ultimately derives from class object.
>
> 3) All built-in class objects are shared between
> interpreters.
>
> So, starting from object.__subclasses__(), code in any
> interpreter could find any class defined by any other
> interpreter and mutate it.

Many thanks for the excellent example. It was not clear to me how 
readily such a small and critical bit of shared state could potentially 
be abused across interpreter boundaries. I am guessing this would be the 
first in a chain of potential problems I may run into.

> This is not something that is likely to happen by
> accident. Whether it's "annoyingly difficult" enough
> is something you'll have to decide.

I think it'd fall under "protection against inadvertent modification"- 
down the scale somewhat. It doesn't sound like it would be too difficult 
to achieve if the author was so inclined.

> Also keep in mind that it's fairly easy for Python
> code to chew up large amounts of memory and/or CPU
> time in an uninterruptible way, e.g. by
> evaluating 5**1. So even a thread that's
> keeping its hands entirely to itself can still
> cause trouble.

Thanks for the tip. The potential for deliberate resource exhaustion is 
unfortunately something that I am likely going to have to put up with in 
order to keep things in the same process.

Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list

Re: Eliminate "extra" variable

On Fri, Dec 6, 2013 at 7:16 PM, Roy Smith  wrote:

> In article ,
>  Joel Goldstick  wrote:
>
> > Python lets you iterate over a list directly, so :
> >
> > for d in originalData:
> >     dateStr, freq, source = d
> >     data[source] = freq
>
> I would make it even simpler:
>
> > for dateStr, freq, source in originalData:
> >     data[source] = freq
>

+1 --- I agree

To the OP:

Could you add a docstring to your function to explain what is supposed to
happen, describe the input and output?  If you do that I'm sure you could
get some more complete help with your code.


-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Eliminate "extra" variable

2013-12-06 Thread Tim Chase

On 2013-12-06 11:37, Igor Korot wrote:
> def MyFunc(self, originalData):
>     data = {}
>     for i in xrange(0, len(originalData)):
>         dateStr, freq, source = originalData[i]
>         data[str(dateStr)]  = {source: freq}

this can be more cleanly/pythonically written as

  def my_func(self, original_data):
    data = {}
    for date, freq, source in original_data:
      data[str(date)] = {source: freq}

or even just

data = dict(
  (str(date), {source: freq})
  for date, freq, source in original_data
  )

You're calling it a "dateStr", which suggests that it's already a
string, so I'm not sure why you're str()'ing it.  So I'd either just
call it "date", or skip the str(date) bit if it's already a string.
That said, do you even need to convert it to a string (as
datetime.date objects can be used as keys in dictionaries)?
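
For instance (a minimal sketch with a made-up value):

```python
import datetime

data = {}
day = datetime.date(2013, 12, 6)
data[day] = {"web": 3}

# An equal date object finds the same entry; str() is only needed for display.
print(data[datetime.date(2013, 12, 6)])
print(str(day))
```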

> for i in xrange(0, len(dateStrs) - 1):
>   currDateStr = str(dateStrs[i])
>   nextDateStrs = str(dateStrs[i + 1])
> 
> It seems very strange that I need the dateStrs list just for the
> purpose of looping thru the dictionary keys.
> Can I get rid of the "dateStrs" variable?

Your code isn't actually using the data-dict at this point.  If you
were doing something with it, it might help to know what you want to
do.

Well, you can iterate over the original data, zipping them together:

  for (cur, _, _), (next, _, _) in zip(
      original_data[:-1],
      original_data[1:]
      ):
    do_something(cur, next)

If your purpose for the "data" dict is to merely look up stats from
the next one, the whole batch of your original code can be replaced
with:

  for (
      (cur_dt, cur_freq, cur_source),
      (next_dt, next_freq, next_source)
      ) in zip(original_data[:-1], original_data[1:]):
    # might need to do str(cur_dt) and str(next_dt) instead?
    do_things_with(cur_dt, cur_freq, cur_source,
      next_dt, next_freq, next_source)

That eliminates the dict *and* the extra variable name. :-)
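
Here is a self-contained version of the same pairing idea that can be run as-is. The sample data and the "change in frequency" calculation are invented, since the original post doesn't say what the loop body should do:

```python
# Hypothetical sample data in (date, frequency, source) form.
original_data = [
    ("2013-12-01", 3, "web"),
    ("2013-12-02", 5, "mail"),
    ("2013-12-03", 1, "web"),
]

changes = []
for (cur_dt, cur_freq, _), (next_dt, next_freq, _) in zip(
        original_data[:-1], original_data[1:]):
    # Each iteration sees one record and the record that follows it.
    changes.append((cur_dt, next_dt, next_freq - cur_freq))

print(changes)
```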

-tkc

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: One liners

On Fri, Dec 6, 2013 at 7:20 PM, Michael Torrie  wrote:

> On 12/06/2013 05:14 PM, Dan Stromberg wrote:
> > I'm thinking mostly of stackoverflow, but here's an example I ran into (a
> > lot of) on a job:
> >
> > somevar = some_complicated_thing(somevar) if
> > some_other_complicated_thing(somevar) else somevar
> >
> > Would it really be so bad to just use an if statement?  Why are we
> > assigning somevar to itself?  This sort of thing was strewn across 3 or 4
> > physical lines at a time.
>
> You're right that a conventional "if" block is not only more readable,
> but also faster and more efficient code.  Sorry you have to deal with
> code written like that!  That'd frustrate any sane programmer.  It might
> bother me enough to write code to reformat the program to convert that
> style to something sane!  There are times when the ternary (did I get
> that right?) operator is useful and clear.

While writing new code seems to carry higher status on a team than
fixing old code, so much can be learned by having to plough through old
code: you learn others' coding styles, pick up new understanding, and,
most importantly, totally disabuse yourself of trying to be cute with
code.  Code is read by the machine and by the programmer.  The programmer
is the one who should be deferred to, imo.  You buy the machine, you rent
the programmer by the hour!

Aside from django urls, I am not sure I ever wrote regexes in python.  For
some reason they must seem awfully sexy to quite a few people.  Back to my
point above -- ever try to figure out a complicated regex written by
someone else?

-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?

2013-12-06 Thread Rotwang


On 06/12/2013 16:51, Piotr Dobrogost wrote:

[...]

I thought of that argument later the next day. Your proposal does
unify access if the old obj.x syntax is removed.


As long as obj.x is a very concise way to get the attribute named 'x'
from object obj, it's somewhat odd that the identifier x is treated not
like an identifier but like the string literal 'x'. If it were treated
like an identifier, we would get the attribute whose name is the value
of x instead of the attribute named 'x'. Allowing string literals in
the form obj.'x', as proposed, would make getattr basically needless as
long as we use only a variable, not an expression, to denote the
attribute's name.


But then every time you wanted to get an attribute with a name known at 
compile time you'd need to write obj.'x' instead of obj.x, thereby 
requiring two additional keystrokes. Given that the large majority of 
attribute access in Python code uses dot syntax rather than getattr, this 
seems like it would massively outweigh the eleven keystrokes one saves 
by writing obj.'x' instead of getattr(obj,'x').
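
For reference, this is how the existing spellings behave today. It is a small sketch with a made-up class, and the proposed obj.'x' syntax is deliberately not shown because it does not exist:

```python
class Record(object):
    pass

obj = Record()
obj.x = 1

# Names that are not valid identifiers already work through setattr/getattr.
setattr(obj, "not-an-identifier", 2)

print(obj.x == getattr(obj, "x"))
print(getattr(obj, "not-an-identifier"))

# getattr is also what you need when the name is only known at run time.
name = "x"
print(getattr(obj, name))
```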


--
https://mail.python.org/mailman/listinfo/python-list

Re: Eliminate "extra" variable

2013-12-06 Thread Ethan Furman


On 12/06/2013 03:38 PM, Joel Goldstick wrote:

On Fri, Dec 6, 2013 at 2:37 PM, Igor Korot wrote:


def MyFunc(self, originalData):
    data = {}
    dateStrs = []
    for i in xrange(0, len(originalData)):
        dateStr, freq, source = originalData[i]
        data[str(dateStr)] = {source: freq}

# above line confuses me!

        dateStrs.append(dateStr)
    for i in xrange(0, len(dateStrs) - 1):
        currDateStr = str(dateStrs[i])
        nextDateStrs = str(dateStrs[i + 1])


Python lets you iterate over a list directly, so:

    for d in originalData:
        dateStr, freq, source = d
        data[source] = freq


You could shorten that to

   for dateStr, freq, source in originalData:

and if dateStr is already a string:

   data[dateStr] = {source: freq}


Your code looks like you come from a C background.  Python idioms are different.


Agreed.



I'm not sure what you are trying to do in the second for loop, but I think you
are trying to iterate through a dictionary in a certain order, and you can't
depend on the order.


The second loop is iterating over the list dateStrs.
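Putting the suggestions in this thread together, a sketch along the following lines may help. It assumes originalData is a list of (dateStr, freq, source) tuples and is written for Python 3, so there is no xrange. The function name build_index and the sample rows are made up for illustration.

```python
def build_index(original_data):
    """Build the per-date dict and the list of consecutive date pairs."""
    data = {}
    date_strs = []
    # Unpack each row directly in the for statement; no index variable needed.
    for date_str, freq, source in original_data:
        data[str(date_str)] = {source: freq}
        date_strs.append(str(date_str))
    # Pair each date with its successor instead of indexing i and i + 1.
    pairs = list(zip(date_strs, date_strs[1:]))
    return data, pairs


rows = [("2013-12-01", 3, "a"), ("2013-12-02", 5, "b"), ("2013-12-03", 1, "a")]
data, pairs = build_index(rows)
```

The zip over the list and its own tail visits the same (current, next) pairs as the second loop in the original code.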

--
~Ethan~

Re: Embedding multiple interpreters

Hi Gregory,

On 07/12/13 08:39, Gregory Ewing wrote:
> Garthy wrote:
>> To allow each script to run in its own environment, with minimal
>> chance of inadvertent interaction between the environments, whilst
>> allowing each script the ability to stall on conditions that will be
>> later met by another thread supplying the information, and to fit in
>> with existing infrastructure.
>
> The last time I remember this being discussed was in the context
> of allowing free threading. Multiple interpreters don't solve
> that problem, because there's still only one GIL and some
> objects are shared.

I am fortunate in my case as the normal impact of the GIL would be much 
reduced. The common case is only one script actively progressing at a 
time, with the others either not running or waiting for external input 
to continue.

But as you point out in your other reply, there are still potential 
concerns that arise from the smaller set of shared objects even across 
interpreters.

> But if all you want is for each plugin to have its own version
> of sys.modules, etc., and you're not concerned about malicious
> code, then it may be good enough.

I wouldn't say that I wasn't concerned about it entirely, but on the 
other hand it is not a hard requirement to which all other concerns are 
secondary.

Cheers,
Garth

Re: One liners

2013-12-06 Thread Roy Smith

Joel Goldstick wrote:

> Aside from django urls, I am not sure I ever wrote regexes in python.  For
> some reason they must seem awfully sexy to quite a few people.  Back to my
> point above -- ever try to figure out a complicated regex written by
> someone else?

Regex has a bad rap in the Python community.  To be sure, you can abuse 
them, and write horrible monstrosities.  On the other hand, stuff like 
this (slightly reformatted for posting):

pattern = re.compile(
    r'haproxy\[(?P\d+)]: '
    r'(?P(\d{1,3}\.){3}\d{1,3}):'
    r'(?P\d{1,5}) '
    r'\[(?P\d{2}/\w{3}/\d{4}(:\d{2}){3}\.\d{3})] '
    r'(?P\S+) '
    r'(?P\S+)/'
    r'(?P\S+) '
    r'(?P(-1|\d+))/'
    r'(?P(-1|\d+))/'
    r'(?P(-1|\d+))/'
    r'(?P(-1|\d+))/'
    r'(?P\+?\d+) '
    r'(?P\d{3}) '
    r'(?P\d+) '
    r'(?P\S+) '
    r'(?P\S+) '
    r'(?P[\w-]{4}) '
    r'(?P\d+)/'
    r'(?P\d+)/'
    r'(?P\d+)/'
    r'(?P\d+)/'
    r'(?P\d+) '
    r'(?P\d+)/'
    r'(?P\d+) '
    r'(\{(?P.*?)\} )?'   # Comment out for stock haproxy
    r'(\{(?P.*?)\} )?'
    r'(\{(?P.*?)\} )?'
    r'"(?P.+)"'
)

while intimidating at first glance, really isn't that hard to 
understand.  Python's raw string literals, adjacent string literal 
catenation, and automatic line continuation team up to eliminate a lot 
of extra fluff.
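The same three features can be seen in a much smaller sketch. The log format below is a deliberately simplified, hypothetical one, and the group names (client_ip, client_port, status, bytes_read) are invented for this illustration; they are not haproxy's real format or the names used in the pattern above.

```python
import re

# One raw string literal per field. Adjacent literals are joined by the
# compiler, and the open parenthesis lets the expression span several lines.
pattern = re.compile(
    r'(?P<client_ip>(\d{1,3}\.){3}\d{1,3}):'
    r'(?P<client_port>\d{1,5}) '
    r'(?P<status>\d{3}) '
    r'(?P<bytes_read>\d+)'
)

match = pattern.match('10.0.0.1:51234 200 1532')
fields = match.groupdict()   # maps each named group to the text it matched
```

Because every field sits on its own line with its own name, adding, removing or reordering a field is a one-line change.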

Re: interactive help on the base object

2013-12-06 Thread Terry Reedy

On 12/6/2013 12:03 PM, Mark Lawrence wrote:

Is it just me, or is this basically useless?

 >>> help(object)
Help on class object in module builtins:

class object
  |  The most base type

Given that this can be interpreted as 'least desirable', it could 
definitely be improved.

Surely a few more words,

How about something like:

'''The default top superclass for all Python classes.

Its methods are inherited by all classes unless overridden.
'''

When you have 1 or more concrete suggestions for the docstring, open a 
tracker issue.
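The behaviour that the suggested docstring describes can be checked directly. This is only an illustrative sketch for Python 3; the class names Plain and Custom are made up.

```python
class Plain:
    """Defines nothing, so everything comes from object."""


class Custom:
    """Overrides one of the methods that object provides."""

    def __repr__(self):
        return "Custom()"


# Plain did not define __repr__, so lookup falls through to object's version.
plain_inherits = Plain.__repr__ is object.__repr__

# Custom replaced it, so lookup stops at the class itself.
custom_overrides = Custom.__repr__ is not object.__repr__

# object sits at the end of every class's method resolution order.
base_of_plain = Plain.__mro__[-1]
```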

> or a pointer to this

http://docs.python.org/3/library/functions.html#object, would be better?

URLs don't belong in docstrings. People should know how to find things 
in the manual index.

--
Terry Jan Reedy


Re: Python 2.8 release schedule

2013-12-06 Thread Terry Reedy


On 12/6/2013 4:26 PM, Mark Lawrence wrote:

My apologies if you've seen this before but here is the official
schedule http://www.python.org/dev/peps/pep-0404/


The PEP number is not an accident ;-).
--
Terry Jan Reedy


Re: Python 2.8 release schedule


On 07/12/2013 01:39, Terry Reedy wrote:

On 12/6/2013 4:26 PM, Mark Lawrence wrote:

My apologies if you've seen this before but here is the official
schedule http://www.python.org/dev/peps/pep-0404/


The PEP number is not an accident ;-).


Sorry but I don't get it :)

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: Python 2.8 release schedule

On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence  wrote:
> On 07/12/2013 01:39, Terry Reedy wrote:
>>
>> On 12/6/2013 4:26 PM, Mark Lawrence wrote:
>>>
>>> My apologies if you've seen this before but here is the official
>>> schedule http://www.python.org/dev/peps/pep-0404/
>>
>>
>> The PEP number is not an accident ;-).
>
>
> Sorry but I don't get it :)

HTTP error 404 "Not Found", probably the most famous (though not the
most common) HTTP return code.

You asked for Python 2.8? Sorry, not found... it's 404.

ChrisA

Re: Python 2.8 release schedule


On 07/12/2013 01:54, Chris Angelico wrote:

On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence  wrote:

On 07/12/2013 01:39, Terry Reedy wrote:


On 12/6/2013 4:26 PM, Mark Lawrence wrote:


My apologies if you've seen this before but here is the official
schedule http://www.python.org/dev/peps/pep-0404/



The PEP number is not an accident ;-).



Sorry but I don't get it :)


HTTP error 404 "Not Found", probably the most famous (though not the
most common) HTTP return code.

You asked for Python 2.8? Sorry, not found... it's 404.

ChrisA



Clearly that went straight over your head.

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: Python 2.8 release schedule

On Sat, Dec 7, 2013 at 1:00 PM, Mark Lawrence  wrote:
> On 07/12/2013 01:54, Chris Angelico wrote:
>>
>> On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence 
>> wrote:
>>> Sorry but I don't get it :)
>>
>> [explained the joke]
>
> Clearly that went straight over your head.

*facepalm* Yep, it did. Completely missed what you said there.

Doh. I see what you did there... now.

ChrisA

Re: One liners

On Fri, 06 Dec 2013 15:54:22 -0800, Dan Stromberg wrote:

> Does anyone else feel like Python is being dragged too far in the
> direction of long, complex, multiline one-liners?  Or avoiding temporary
> variables with descriptive names?  Or using regex's for everything under
> the sun?

All those things are stylistic issues, not language issues. Yes, I see 
far too many people trying to squeeze three lines of code into one, but 
that's their choice, not the language leading them that way.

On the other hand, Python code style is influenced strongly by functional 
languages like Lisp, Scheme and Haskell (despite the radically different 
syntax). Python has even been described approvingly as "Lisp without the 
brackets". To somebody coming from a C or Pascal procedural background, 
or a Java OOP background, such functional-style code might seem too 
concise and/or weird. But frankly, I think that such programmers would 
write better code with a more functional approach. I refuse to apologise 
for writing the one-liner:

result = [func(item) for item in sequence]

instead of four:

result = []
for i in range(len(sequence)):
    item = sequence[i]
    result.append(func(item))
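As a quick check that the two forms really are interchangeable, here is a small sketch that wraps each in a function and compares the results. The function names and the sample data are invented for illustration.

```python
def transform_loop(func, sequence):
    """The four-line, index-based version."""
    result = []
    for i in range(len(sequence)):
        item = sequence[i]
        result.append(func(item))
    return result


def transform_comprehension(func, sequence):
    """The one-line list comprehension."""
    return [func(item) for item in sequence]


words = ["spam", "eggs", "toast"]
loop_result = transform_loop(str.upper, words)
comprehension_result = transform_comprehension(str.upper, words)
```

The comprehension also works on any iterable, whereas the index-based loop needs something that supports len() and indexing.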

> What happened to using classes?  What happened to the beautiful emphasis
> on readability?  What happened to debuggability (which is always harder
> than writing things in the first place)?  And what happened to string
> methods?

What about string methods?

As far as classes go, I find that they're nearly always overkill. Most of 
the time, a handful of pre-written standard classes, like dict, list, 
namedtuple and the like, get me 90% of the way to where I need to go.
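As one example of a pre-written class doing the job, a namedtuple can stand in for a small hand-written class. This is only a sketch; the Point type and its fields are made up for illustration.

```python
from collections import namedtuple

# One line replaces a class with __init__, __repr__ and __eq__ written by hand.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=3, y=4)
distance = (p.x ** 2 + p.y ** 2) ** 0.5   # fields are readable by name

x, y = p                   # still unpacks like an ordinary tuple
moved = p._replace(x=0)    # instances are immutable, so this returns a copy
```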

The beauty of Python is that it is a multi-paradigm language. You can 
write imperative, procedural, functional, OOP, or pipelining style (and 
probably more). The bad thing about Python is that if you're reading 
other people's code you *need* to be familiar with all those styles.

> I'm pleased to see Python getting more popular, but it feels like a lot
> of newcomers are trying their best to turn Python into Perl or
> something, culturally speaking.

They're probably writing code using the idioms they are used to from 
whatever language they have come from. Newcomers nearly always do this. 
The more newcomers you get, the less Pythonic the code you're going to 
see from them.

-- 
Steven

Re: Managing Google Groups headaches

2013-12-06 Thread Ned Batchelder

On 12/6/13 8:03 AM, rusi wrote:

> I think you're off on the wrong track here.  This has nothing to do with
> plain text (ascii or otherwise).  It has to do with divorcing how you
> store and transport messages (be they plain text, HTML, or whatever)
> from how a user interacts with them.

Evidently (and completely inadvertently) this exchange has just
illustrated one of the inadmissible assumptions:

"unicode as a medium is universal in the same way that ASCII used to be"

I wrote a number of ellipsis characters ie codepoint 2026 as in:

   - human communication…
(is not very different from)
   - machine communication…

Somewhere between my sending and your quoting those ellipses became
the replacement character FFFD

> >   - human communication�
> >(is not very different from)
> >   - machine communication�

Leaving aside whose fault this is (very likely buggy google groups),
this mojibaking cannot happen if the assumption "All text is ASCII"
were to uniformly hold.

Of course, with unicode this can also be made not to happen, but that
is fragile and error-prone.  And that is because ASCII (not extended)
is ONE thing, in a way that unicode, hopelessly a motley and inconsistent
variety, is not.
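One way an ellipsis can turn into U+FFFD is an encode/decode mismatch somewhere along the path. The sketch below shows that mechanism only; it is an assumption for illustration, and nothing here establishes what Google Groups (or any other hop) actually did to the message.

```python
text = "human communication\u2026"    # ends with U+2026 HORIZONTAL ELLIPSIS

# Suppose one hop writes the text as Windows-1252, where the ellipsis is
# the single byte 0x85 ...
raw = text.encode("cp1252")

# ... and a later hop assumes the bytes are UTF-8. The lone 0x85 is not
# valid UTF-8, so a lenient decoder substitutes U+FFFD for it.
garbled = raw.decode("utf-8", errors="replace")

# Pure ASCII text survives the same mismatch, because both encodings agree
# on every byte below 0x80.
ascii_text = "human communication..."
survived = ascii_text.encode("cp1252").decode("utf-8")
```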

You seem to be suggesting that we should stick to ASCII.  There are of 
course languages that need more than just the Latin alphabet.  How would 
you suggest we support them?  Or maybe I don't understand?

--Ned.


Re: One liners

On Fri, 06 Dec 2013 17:20:27 -0700, Michael Torrie wrote:

> On 12/06/2013 05:14 PM, Dan Stromberg wrote:
>> I'm thinking mostly of stackoverflow, but here's an example I ran into
>> (a lot of) on a job:
>> 
>> somevar = some_complicated_thing(somevar) if
>> some_other_complicated_thing(somevar) else somevar
>> 
>> Would it really be so bad to just use an if statement?  Why are we
>> assigning somevar to itself?  This sort of thing was strewn across 3 or
>> 4 physical lines at a time.

Unless you're embedding it in another statement, there's no advantage to 
using the ternary if operator if the clauses are so large you have to 
split the line over two or more lines in the first place. I agree that:

result = (spam(x) + eggs(x) + toast(x)
          if x and condition(x) or another_condition(x)
          else foo(x) + bar(x) + foobar(x))

is probably better written as:

if x and condition(x) or another_condition(x):
    result = spam(x) + eggs(x) + toast(x)
else:
    result = foo(x) + bar(x) + foobar(x)

The ternary if is slightly unusual and unfamiliar, and is best left for 
when you need an expression:

ingredients = [spam, eggs, cheese, toast if flag else bread, tomato]

As for your second complaint, "why are we assigning somevar to itself", I 
see nothing wrong with that. Better that than a plethora of variables 
used only once:

# Screw this for a game of soldiers.
def function(arg, param_as_list_or_string):
    if isinstance(param_as_list_or_string, str):
        param = param_as_list_or_string.split()
    else:
        param = param_as_list_or_string

# Better.
def function(arg, param):
    if isinstance(param, str):
        param = param.split()

"Replace x with a transformed version of x" is a perfectly legitimate 
technique, and not one which ought to be too hard to follow.
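A runnable version of the "Better" pattern might look like the sketch below. The function name count_words and what it returns are invented for illustration; the point is only the rebinding of param.

```python
def count_words(param):
    """Accept either a whitespace-separated string or a list of words."""
    # Replace param with a normalised version of itself. From here on the
    # rest of the function only has to deal with the list form.
    if isinstance(param, str):
        param = param.split()
    return len(param)


from_string = count_words("spam eggs toast")
from_list = count_words(["spam", "eggs"])
```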

> You're right that a conventional "if" block is not only more readable,
> but also faster and more efficient code.

Really? I don't think so. This is using Python 2.7:

[steve@ando ~]$ python -m timeit --setup="flag = 0" \
> "if flag: y=1
> else: y=2"
1000 loops, best of 3: 0.0836 usec per loop

[steve@ando ~]$ python -m timeit --setup="flag = 0" "y = 1 if flag else 2"
1000 loops, best of 3: 0.0813 usec per loop

There's practically nothing between the two, but the ternary if operator 
is marginally faster.
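The same comparison can be repeated from within Python using the timeit module, which is handy when the shell quoting gets awkward. This is a sketch only: the loop count of 100000 is an arbitrary choice, and the numbers will differ from machine to machine and between Python versions.

```python
import timeit

# Both snippets run with the same setup as the command-line version above.
if_statement = timeit.timeit(
    "if flag: y = 1\nelse: y = 2",
    setup="flag = 0",
    number=100000,
)
ternary = timeit.timeit(
    "y = 1 if flag else 2",
    setup="flag = 0",
    number=100000,
)

# timeit.timeit returns the total time in seconds for all loops.
print("if statement: %.4f s" % if_statement)
print("ternary if:   %.4f s" % ternary)
```

With a difference this small, run-to-run noise can easily exceed the gap, so it is worth repeating the measurement a few times before drawing conclusions.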

As for readability, I accept that ternary if is unusual compared to other 
languages, but it's still quite readable in small doses. If you start 
chaining them:

result = a if condition else b if flag else c if predicate else d 

you probably shouldn't.
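For anyone who does meet a chain like that, it reads right to left as nested else branches. The sketch below spells out the equivalent if/elif form; the two function names are made up for illustration.

```python
def pick_chained(a, b, c, d, condition, flag, predicate):
    """The chained ternary, exactly as written above."""
    return a if condition else b if flag else c if predicate else d


def pick_explicit(a, b, c, d, condition, flag, predicate):
    """The same decision written as an if/elif ladder."""
    if condition:
        return a
    elif flag:
        return b
    elif predicate:
        return c
    else:
        return d


example = pick_chained("a", "b", "c", "d", False, False, True)
```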

-- 
Steven

Re: ASCII and Unicode [was Re: Managing Google Groups headaches]