Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-25 Thread M.-A. Lemburg
Antoine Pitrou wrote:
> Le samedi 23 janvier 2010 à 20:43 +0100, M.-A. Lemburg a écrit :
>>
>> Now, we cannot easily remove this guessing since we're in stable
>> mode again with 3.1. Perhaps we should add a way to at least be
>> able to switch off this guessing, so that applications can be
>> tested in a predictable way, rather than depending on the test
>> runner's locale settings ?!
> 
> The simple way to switch off the guessing is to specify an encoding in
> open(). I don't know what other means of switching it off could be
> added.

I was thinking of a way to disable the automatic guessing, so that
bugs related to missing encoding specifications can more easily be
found.

One way of doing this would be to have a global
text_file_default_encoding which is set to "guess-encoding"
per default.

This global could then be set via a PYTHONTEXTFILEENCODING
OS variable or programmatically at runtime to define the
text file default encoding to use in open() if no explicit
encoding is specified.

To disable guessing, the variable would be set to "unknown"
(like you can do for the default encoding in Python 2.x to
disable automatic coercion of strings to Unicode).

Perhaps we could also add a warning to the open() API which warns
in case a file is opened in text mode without specifying an
encoding ?!

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 25 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/


::: Try our new mxODBC.Connect Python Database Interface for free!


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Rich Comparison recipe wrong?

2010-01-25 Thread Lennart Regebro
If you look at the description of the rich comparison methods in the
documentation:
http://docs.python.org/reference/datamodel.html#object.__lt__

It refers to a recipe: http://code.activestate.com/recipes/576529/

However, that recipe will convert __ge__(self, other) into other < self.
So when you call self <= other, you'll end up calling other < self.

So far, so good. But the problem is another piece of text: "A rich
comparison method may return the singleton NotImplemented if it does
not implement the operation for a given pair of arguments." If you do
this, Python will try the reflected operation with the operands
swapped, so it will convert a self < other into an other > self. And
here we see the problem.

If class A returns NotImplemented when compared to class B, and class
B implements the recipe above, then we get infinite recursion, because

1. A() < B() will call A.__lt__(B), which will return NotImplemented.
2. Python will then call the reflected B.__gt__(A),
3. which B (via the recipe) implements by doing A < B.
4. Start over at 1.


Have I missed something, or is this recipe incomplete by not handling
the NotImplemented case? If it is, I think the recipe should be
changed to something that handles it.
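The recursion can be demonstrated with a minimal sketch. All names here
are illustrative; the mixin derives one comparison from __lt__ by
swapping operands, in the same spirit as the recipe.

```python
class RecipeMixin:
    # Recipe-style derivation: implement > by swapping the operands of <.
    def __gt__(self, other):
        return other < self

class A(RecipeMixin):
    def __lt__(self, other):
        return NotImplemented  # claims not to know how to compare

class B(RecipeMixin):
    def __lt__(self, other):
        return NotImplemented

def comparison_recurses():
    # A() < B(): A.__lt__ returns NotImplemented, Python tries the
    # reflected B.__gt__(A), which evaluates A < B, and so on forever.
    try:
        A() < B()
    except RecursionError:
        return True
    return False
```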

-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python-incompatibility.googlecode.com/
+33 661 58 14 64


[Python-Dev] Using code with a public domain like license

2010-01-25 Thread Stefan Krah
Hi,

I would like to use this code from Hacker's Delight in cdecimal:

http://www.hackersdelight.org/HDcode/divlu.c


The divlu function is Knuth's algorithm D optimized for 64bit/32bit -> 32bit
divrem. It is going to be used for (hypothetical?) platforms without uint64_t.
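As an executable specification of what divlu computes (not the actual
32-bit algorithm; the function name and argument order below are
assumptions based on the description above):

```python
def divlu_reference(u1, u0, v):
    # Divide the 64-bit value with high half u1 and low half u0 by the
    # 32-bit divisor v, returning (quotient, remainder). The real divlu
    # does this using only 32-bit operations; here Python's big ints
    # serve as a reference model.
    u = (u1 << 32) | u0
    q, r = divmod(u, v)
    if q >> 32:
        raise OverflowError("quotient does not fit in 32 bits")
    return q, r
```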


The license is public domain like:

http://www.hackersdelight.org/permissions.htm


Is this license good enough for inclusion in Python?


Stefan Krah




Re: [Python-Dev] Rich Comparison recipe wrong?

2010-01-25 Thread Nick Coghlan
Lennart Regebro wrote:
> 1. A() < B() will call A.__lt__(B), which will return NotImplemented.
> 2. Python will then call the reflected B.__gt__(A),
> 3. which B (via the recipe) implements by doing A < B.
> 4. Start over at 1.
> 
> 
> Have I missed something, or is this recipe incomplete by not handling
> the NotImplemented case? If it is, I think the recipe should be
> changed to something that handles it.

I tested both the recipe linked from the docs and Raymond's shorter
recipe at http://code.activestate.com/recipes/576685/ and sure enough
both suffer from infinite recursion if the root method returns
NotImplemented when two of those items are being compared.

However, returning NotImplemented generally implies that A and B are
*different* classes, so I think this is more of a theoretical problem
than a practical one.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---



Re: [Python-Dev] Rich Comparison recipe wrong?

2010-01-25 Thread Lennart Regebro
On Mon, Jan 25, 2010 at 14:34, Nick Coghlan  wrote:
> However, returning NotImplemented generally implies that A and B are
> *different* classes

Which is exactly the case here.

> so I think this is more of a theoretical problem
> than a practical one.

How so? The whole point of returning NotImplemented is to give the
other class a go. But if that other class implements this recipe, you
get infinite recursion. It seems to me that this recipe is broken, as
it doesn't handle the other class returning NotImplemented.

-- 
Lennart Regebro: Python, Zope, Plone, Grok
http://regebro.wordpress.com/
+33 661 58 14 64


Re: [Python-Dev] Rich Comparison recipe wrong?

2010-01-25 Thread Nick Coghlan
Lennart Regebro wrote:
> On Mon, Jan 25, 2010 at 14:34, Nick Coghlan  wrote:
>> However, returning NotImplemented generally implies that A and B are
>> *different* classes
> 
> Which is exactly the case here.

It wasn't in my tests though - I used the same class on both sides of
the comparison.

>> so I think this is more of a theoretical problem
>> than a practical one.
> 
> How so? The whole point of returning NotImplemented is to give the
> other class a go. But if that other class implements this recipe, you
> get infinite recursion. It seems to me that it means that this recipe
> is broken, as it doesn't handle the other class returning
> NotImplemented.

Ah, you mean the case where both classes implement the recipe, but know
nothing about each other and hence both return NotImplemented from their
root comparison?

OK, that sounds like a plausible real-world problem with the recipe, but
I'm not sure how difficult it would be to fix. For homogeneous
collections of objects with nary a NotImplemented in sight, it's still a
decent recipe.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] Rich Comparison recipe wrong?

2010-01-25 Thread Lennart Regebro
On Mon, Jan 25, 2010 at 15:30, Nick Coghlan  wrote:
> Ah, you mean the case where both classes implement the recipe, but know
> nothing about each other and hence both return NotImplemented from their
> root comparison?

Well, only one needs to return NotImplemented, actually.

> OK, that sounds like a plausible real world problem with the recipe, but
> I'm not sure how difficult it would be to fix.

I've failed. :-) I currently use something like this instead:

class ComparableMixin(object):
    def __lt__(self, other):
        try:
            return self.__cmp__(other) < 0
        except TypeError:
            return NotImplemented

    def __le__(self, other):
        try:
            return self.__cmp__(other) <= 0
        except TypeError:
            return NotImplemented

    def __gt__(self, other):
        try:
            return self.__cmp__(other) > 0
        except TypeError:
            return NotImplemented

    def __ge__(self, other):
        try:
            return self.__cmp__(other) >= 0
        except TypeError:
            return NotImplemented

    def __eq__(self, other):
        try:
            return self.__cmp__(other) == 0
        except TypeError:
            return NotImplemented

    def __ne__(self, other):
        try:
            return self.__cmp__(other) != 0
        except TypeError:
            return NotImplemented

That does rely on a (deprecated) __cmp__, but it will never be called
directly anyway.
The __cmp__ is then implemented something like this:

    def __cmp__(self, other):
        try:
            return (self.v > other.v) - (self.v < other.v)
        except AttributeError:
            raise TypeError('Can not compare %s and %s' %
                            (type(self), type(other)))

Since this doesn't "redirect" any of the __xx__ methods to another
one, the problem goes away.
But it uses a lot of annoying try/excepts. Ideas for improvements would be nice.
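One possible way to cut the repetition, offered only as a sketch:
generate the six methods from a small factory, assuming (as above) that
__cmp__ raises TypeError for incomparable operands. The names below are
illustrative.

```python
import operator

def _cmp_method(op):
    # Build one rich-comparison method that defers to __cmp__ and
    # translates TypeError into NotImplemented.
    def method(self, other):
        try:
            return op(self.__cmp__(other), 0)
        except TypeError:
            return NotImplemented
    return method

class ComparableMixin(object):
    __lt__ = _cmp_method(operator.lt)
    __le__ = _cmp_method(operator.le)
    __gt__ = _cmp_method(operator.gt)
    __ge__ = _cmp_method(operator.ge)
    __eq__ = _cmp_method(operator.eq)
    __ne__ = _cmp_method(operator.ne)

class Value(ComparableMixin):
    # Illustrative concrete class using the __cmp__ from the message.
    def __init__(self, v):
        self.v = v
    def __cmp__(self, other):
        try:
            return (self.v > other.v) - (self.v < other.v)
        except AttributeError:
            raise TypeError('Can not compare %s and %s' %
                            (type(self), type(other)))
```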

-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python-incompatibility.googlecode.com/
+33 661 58 14 64


Re: [Python-Dev] Using code with a public domain like license

2010-01-25 Thread Victor Stinner
Hi,

Le lundi 25 janvier 2010 13:49:48, Stefan Krah a écrit :
> The license is public domain like:
> 
> http://www.hackersdelight.org/permissions.htm
> 
> Is this license good enough for inclusion in Python?

"You are free to use, copy, and distribute any of the code on this web site, 
whether modified by you or not. You need not give attribution."

Yes, you can change this "license" to the Python license, but keep the 
original copyright.

-- 
Victor Stinner
http://www.haypocalc.com/


Re: [Python-Dev] Using code with a public domain like license

2010-01-25 Thread Guido van Rossum
I would ask a lawyer. If the PSF's lawyer (Van Lindberg) is okay with
it, you're golden. Most lawyers don't like licenses that are clearly
written by laypersons, like this one, so it may require some
convincing.

--Guido

On Mon, Jan 25, 2010 at 7:37 AM, Victor Stinner
 wrote:
> Hi,
>
> Le lundi 25 janvier 2010 13:49:48, Stefan Krah a écrit :
>> The license is public domain like:
>>
>> http://www.hackersdelight.org/permissions.htm
>>
>> Is this license good enough for inclusion in Python?
>
> "You are free to use, copy, and distribute any of the code on this web site,
> whether modified by you or not. You need not give attribution."
>
> Yes, you can change this "license" to the Python license, but keep the
> original copyright.
>
> --
> Victor Stinner
> http://www.haypocalc.com/



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread Ron Adam


Guido van Rossum wrote:

> Please mail me topics you'd like to hear me talk about in my keynote
> at PyCon this year.

How about something on the differences and obstacles of using Python for
developing full distributable applications vs. small local scripts?

I'd like to see Python 3+ be more suitable for full distributable
applications than 2.x and earlier.




Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Collin Winter
Hi Floris,

On Sun, Jan 24, 2010 at 3:40 AM, Floris Bruynooghe
 wrote:
> On Sat, Jan 23, 2010 at 10:09:14PM +0100, Cesare Di Mauro wrote:
>> Introducing C++ is a big step, also. Aside the problems it can bring on some
>> platforms, it means that C++ can now be used by CPython developers. It
>> doesn't make sense to force people use C for everything but the JIT part. In
>> the end, CPython could become a mix of C and C++ code, so a bit more
>> difficult to understand and manage.
>
> Introducing C++ is a big step, but I disagree that it means C++ should
> be allowed in the other CPython code.  C++ can be problematic on more
> obscure platforms (certainly when static initialisers are used) and
> being able to build a python without C++ (no JIT/LLVM) would be a huge
> benefit, effectively having the option to build an old-style CPython
> at compile time.  (This is why I ased about --without-llvm being able
> not to link with libstdc++).

I'm working on a patch to completely remove all traces of C++ when
configured with --without-llvm. It's a straightforward change, and
should present no difficulties.

For reference, what are these "obscure platforms" where static
initializers cause problems?

Thanks,
Collin Winter


Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-25 Thread Tres Seaver
M.-A. Lemburg wrote:
> Antoine Pitrou wrote:
>> Le samedi 23 janvier 2010 à 20:43 +0100, M.-A. Lemburg a écrit :
>>> Now, we cannot easily remove this guessing since we're in stable
>>> mode again with 3.1. Perhaps we should add a way to at least be
>>> able to switch off this guessing, so that applications can be
>>> tested in a predictable way, rather than depending on the test
>>> runner's locale settings ?!
>> The simple way to switch off the guessing is to specify an encoding in
>> open(). I don't know what other means of switching it off could be
>> added.
> 
> I was thinking of a way to disable the automatic guessing, so that
> bugs related to missing encoding specifications can more easily be
> found.
> 
> One way of doing this would be to have a global
> text_file_default_encoding which is set to "guess-encoding"
> per default.
> 
> This global could then be set via a PYTHONTEXTFILEENCODING
> OS variable or programmatically at runtime to define the
> text file default encoding to use in open() if no explicit
> encoding is specified.
> 
> To disable guessing, the variable would be set to "unknown"
> (like you can do for the default encoding in Python 2.x to
> disable automatic coercion of strings to Unicode).

If it isn't disabled by default, then the people this bites won't ever
know what is happening:  they will stay happily ignorant until the
guessed encoding silently corrupts their data.

> Perhaps we could also add a warning to the open() API which warns
> in case a file is opened in text mode without specifying an
> encoding ?!

That sounds like a good plan to me, given that backward-compatibility
requires keeping the guessing enabled by default.


Tres.
--
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com



Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Tres Seaver
Collin Winter wrote:

> For reference, what are these "obscure platforms" where static
> initializers cause problems?

It's been a long while since I had to deal with it, but the "usual
suspects" back in the day were HP-UX, AIX, and Solaris with non-GCC
compilers, as well as Windows when different VC RT libraries got into
the mix.


Tres.
--
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com



Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Collin Winter
Hi Cesare,

On Sat, Jan 23, 2010 at 1:09 PM, Cesare Di Mauro
 wrote:
> Hi Collin
>
> IMO it'll be better to make the Unladen Swallow project a module, to be
> installed and used if needed, leaving to users the choice of having it
> or not. The same way Psyco does, indeed.
> Nowadays it requires too much memory, longer loading times, and fat binaries
> for not-so-great performance. I know that some issues are being worked on,
> but I don't think that they'll show something comparable to the current
> CPython status.

You're proposing that, even once the issues of memory usage and
startup time are addressed, Unladen Swallow should still be an
extension module? I don't see why. You're assuming that these issues
cannot be fixed, which I disagree with.

I think maintaining something like a JIT compiler out-of-line, as
Psyco is, causes long-term maintainability problems. Such extension
modules are forever playing catchup with the CPython code, depending
on implementation details that the CPython developers are right to
regard as open to change. It also limits what kind of optimizations
you can implement or forces those optimizations to be implemented with
workarounds that might be suboptimal or fragile. I'd recommend reading
the Psyco codebase, if you haven't yet.

As others have requested, we are working hard to minimize the impact
of the JIT so that it can be turned off entirely at runtime. We have
an active issue tracking our progress at
http://code.google.com/p/unladen-swallow/issues/detail?id=123.

> Introducing C++ is a big step, also. Aside the problems it can bring on some
> platforms, it means that C++ can now be used by CPython developers.

Which platforms, specifically? What is it about C++ on those platforms
that is problematic? Can you please provide details?

> It
> doesn't make sense to force people use C for everything but the JIT part. In
> the end, CPython could become a mix of C and C++ code, so a bit more
> difficult to understand and manage.

Whether CPython should allow wider usage of C++ or whether developer
should be "force[d]" to use C is not our decision, and is not part of
this PEP. With the exception of Python/eval.c, we deliberately have
not converted any CPython code to C++ so that if you're not working on
the JIT, python-dev's workflow remains the same. Even within eval.cc,
the only C++ parts are related to the JIT, and so disappear completely
when configured with --without-llvm (or if you're not working on the
JIT).

In any case, developers can easily tell which language to use based on
file extension. The compiler errors that would result from compiling
C++ with a C compiler would be a good indication as well.

> What I see is that LLVM is a too big project for the goal of having "just" a
> JIT-ed Python VM. It can be surely easier to use and integrate into CPython,
> but requires too much resources

Which resources do you feel that LLVM would tax, machine resources or
developer resources? Are you referring to the portions of LLVM used by
Unladen Swallow, or the entire wider LLVM project, including the
pieces Unladen Swallow doesn't use at runtime?

> (on the contrary, Psyco demands little
> resources, give very good performances, but seems to be like a mess to
> manage and extend).

This is not my experience. For the workloads I have experience with,
Psyco doubles memory usage while only providing a 15-30% speed
improvement. Psyco's benefits are not uniform.

Unladen Swallow has been designed to be much more maintainable and
easier to extend and modify than Psyco: the compiler and its attendant
optimizations are well-tested (see Lib/test/test_llvm.py, for one) and
well-documented (see Python/llvm_notes.txt for one). I think that the
project is bearing out the success of our design: Google's full-time
engineers are a small minority on the project at this point, and
almost all performance-improving patches are coming from non-Google
developers.

Thanks,
Collin Winter


[Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
I am interested in creating a patch to make deleting elements from the front of 
a Python list work in O(1) time by advancing the ob_item pointer.

The patch will probably be rejected, but I would like to try it anyway as an 
exercise in digging into the CPython source, and working through the process.  
My goal is to accompany the proposed patch with a PEP (which I also expect to 
be initially rejected, but which will hopefully be a useful contribution in 
terms of documenting the decision.)

The reason I am posting here is that there appears to be some history behind 
similar patches that I am not aware of, so if anybody can refer me to earlier 
patches, PEPS, discussion threads, etc., I would be grateful.  

I am aware of PEP 3128, which has similar goals to what I'm trying to achieve, 
but there are also some differences.

The blist implementation described in PEP 3128 achieves the goal of reducing 
time complexity for some operations, but necessarily at the expense of other 
operations, notably list access.

The patch that I am considering would not affect time complexity for any other 
operations, nor memory complexity, but it would, of course, have marginal costs 
in certain operations, notably any operation that eventually calls 
list_resize().  I am confident that the patch would not impact performance of 
list accesses at all.  The memory cost for the list itself would be an 
additional pointer or integer, which I think should be considered negligible 
compared to the cost of the list itself [O(N)] and the elements in the list 
[O(N)].  I haven't completely worked out the best strategy to eventually 
release the memory taken up by the pointers of the unreleased elements, but the 
worst case scenario is that the unused memory only gets wasted until the time 
that the list itself gets garbage collected.  I think I can do better than 
that, at some cost of complicating list_resize.  From a memory standpoint, the 
benefits of encouraging deleting items from the front
 of the list should outweigh any disadvantages with respect to lazily releasing 
memory from the pointers at the front of the list, since in deleting elements, 
you allow the elements themselves to be garbage collected earlier, as well as 
objects that might be referenced by those elements.

My goal would be to target the patch at 3.x, and if I was lucky enough to get 
it accepted, I think it could eventually be backported to 2.x.  The proposal 
does not affect the definition of the language itself, of course; it merely 
attempts to improve the performance of the CPython implementation.

The instructions that I have found for setting up your development environment 
seemed to be targeted at 2.x trunk, which is fine with me.  I will attempt the 
patch off  the 2.x trunk to get an initial feel for the issues involved, unless 
somebody counsels me to work off 3.x right from the start.

http://www.python.org/dev/setup/

I have not been able to locate the source code for 3.x.  Is the implementation 
of list more or less the same there?

There is a long thread entitled "list.pop(0) vs. collections.dequeue" on 
comp.lang.python that addresses alternatives to list.pop(0), but none of them 
really fit my use case.
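For illustration, the proposed technique can be modeled in pure Python.
This is a sketch only: the real patch would manipulate ob_item inside
listobject.c, and the reclamation threshold below is an arbitrary
choice.

```python
class FrontPoppableList:
    # Instead of memmove-ing the tail after every pop(0), keep an
    # offset into the underlying storage and advance it in O(1).
    def __init__(self, items=()):
        self._items = list(items)
        self._offset = 0  # plays the role of the advanced ob_item pointer

    def pop_front(self):
        if self._offset == len(self._items):
            raise IndexError("pop from empty list")
        item = self._items[self._offset]
        self._items[self._offset] = None  # release the reference early
        self._offset += 1
        # Occasionally reclaim the dead slots, analogous to the extra
        # bookkeeping list_resize() would have to do.
        if self._offset > 64 and self._offset * 2 > len(self._items):
            del self._items[:self._offset]
            self._offset = 0
        return item

    def __len__(self):
        return len(self._items) - self._offset
```

Note that the early None assignment addresses the memory point above:
popped elements become collectable immediately, even before the slot
array itself shrinks.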

Here is a sketch of the PEP that I would propose:


Proposal: 


Improve list's implementation so that deleting elements from 
the front of the list does not require an O(N) memmove operation. 


Rationale: 


Some Python programs that process lists have multiple 
methods that consume the first element of the list and pop it off. 
The pattern comes up with parsers in particular, but there are other 
examples.  It is possible now, of course, to use a data structure in 
Python that has O(1) for deleting off the top of the list, but none of 
the alternatives fully replicate the benefits of list itself. 


Specification: 


Improving CPython's performance does not affect the 
language itself, so there are no bikeshed arguments to be had with 
respect to syntax, etc.  Any patch would, of course, affect the 
performance of nearly every Python program in existence, so any patch 
would have to, at a bare minimum: 


  1) Not increase the time or memory complexity of any other list 
operation. 
  2) Not affect list access at all. 
  3) Minimally affect list operations that mutate the list. 
  4) Be reasonably simple within CPython itself. 
  5) Not be grossly wasteful of memory. 


Backwards Compatibility: 


See above.  An implementation of this PEP would not change the 
definition of the language in any way, but it would have to minimally 
impact the performance of lists for the normal use cases. 


Implementation: 


There are two ways to make deleting the first item of the list run 
more efficiently. 


The most ambitious proposal is to fix the memory manager itself to 
allow the release of memory from the start of the chunk.  The 
advantage of this proposal is that it would simplify the changes to 
list itself, and possibly have collateral b

Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Daniel Stutzbach
On Mon, Jan 25, 2010 at 1:22 PM, Steve Howell  wrote:

> I haven't completely worked out the best strategy to eventually release the
> memory taken up by the pointers of the unreleased elements, but the worst
> case scenario is that the unused memory only gets wasted until the time that
> the list itself gets garbage collected.
>

FWIW, for a long-running FIFO queue, it's critical to release some of the
memory along the way, otherwise the amount of wasted memory is unbounded.

Good luck :)

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC 


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Raymond Hettinger

On Jan 25, 2010, at 11:22 AM, Steve Howell wrote:

> I am interested in creating a patch to make deleting elements from the front 
> of Python list work in O(1) time by advancing the ob_item pointer.

+1 on doing whatever experiments you feel like doing
-1 on putting something like this in the core

1) Too many things in the Python world rely on the current implementation of 
lists.  It's not worth breaking third-party extensions, tools like psyco, work 
on unladen swallow, and other implementations of Python such as PyPy and Jython.

2) The use of lists pervades the language, and it doesn't make sense to have 
the language as a whole pay a price (in terms of speed and space) for every 
list that gets created.  The corner case isn't worth it.

3) We already got one.  The collections.deque() class was introduced 
specifically to handle inserting and popping from the front of a list 
efficiently.

4) In the comp.lang.python thread on this subject, you've gotten nearly zero 
support for your position and have managed to offend many of the developers.

Raymond


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Benjamin Peterson
2010/1/25 Steve Howell :
> I am interested in creating a patch to make deleting elements from the front
> of Python list work in O(1) time by advancing the ob_item pointer.

How about just using a deque?
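For reference, deque already gives O(1) pops at both ends:

```python
from collections import deque

d = deque([1, 2, 3])
first = d.popleft()  # O(1): no memmove of the remaining items
d.append(4)          # O(1) at the right end as well
```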



-- 
Regards,
Benjamin


Re: [Python-Dev] Using code with a public domain like license

2010-01-25 Thread Stefan Krah
Thanks. I see that you've cc'd the PSF already, so I'll wait a while and
ask them directly if I don't hear anything.


Stefan Krah


Guido van Rossum  wrote:
> I would ask a lawyer. If the PSF's lawyer (Van Lindberg) is okay with
> it, you're golden. Most lawyers don't like licenses that are clearly
> written by laypersons, like this one, so it may require some
> convincing.
> 
> --Guido
> 
> On Mon, Jan 25, 2010 at 7:37 AM, Victor Stinner
>  wrote:
> > Hi,
> >
> > Le lundi 25 janvier 2010 13:49:48, Stefan Krah a écrit :
> >> The license is public domain like:
> >>
> >> http://www.hackersdelight.org/permissions.htm
> >>
> >> Is this license good enough for inclusion in Python?
> >
> > "You are free to use, copy, and distribute any of the code on this web site,
> > whether modified by you or not. You need not give attribution."
> >
> > Yes, you can change this "license" to the Python license, but keep the
> > original copyright.
> >
> > --
> > Victor Stinner
> > http://www.haypocalc.com/
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Jeffrey Yasskin
On Mon, Jan 25, 2010 at 10:44 AM, Tres Seaver  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Collin Winter wrote:
>
>> For reference, what are these "obscure platforms" where static
>> initializers cause problems?
>
> It's been a long while since I had to deal with it, but the "usual
> suspects" back in the day were HP-UX, AIX, and Solaris with non-GCC
> compilers, as well as Windows when different VC RT libraries got into
> the mix.

So then the question is, will this cause any problems we care about?
Do the problems still exist, or were they eliminated in the time
between "back in the day" and now? In what circumstances do static
initializers have problems? What problems do they have? Can the
obscure platforms work around the problems by configuring with
--without-llvm? If we eliminate static initializers in LLVM, are there
any other problems?

We really do need precise descriptions of the problems so we can avoid them.

Thanks,
Jeffrey


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Daniel Stutzbach  wrote:

> FWIW, for a long-running FIFO queue, it's critical to
> release some of the memory along the way, otherwise the
> amount of wasted memory is unbounded.
> 

Somebody implementing a long-running FIFO queue should actually be using deque 
instead of list, but I agree with your greater point that waiting until the 
list gets garbage-collected to release memory would probably be unacceptable, 
so I'll see what I can do.
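For reference, a minimal sketch of the deque-based FIFO pattern being recommended here (the names are illustrative, not from any particular application):

```python
from collections import deque

# A deque releases the memory of popped blocks as it goes, so a
# long-running FIFO does not accumulate unbounded wasted space.
queue = deque()
for item in range(5):
    queue.append(item)       # enqueue at the right end

first = queue.popleft()      # dequeue from the left end in O(1)
assert first == 0
assert list(queue) == [1, 2, 3, 4]
```

popleft() is O(1), unlike list.pop(0), which shifts every remaining element.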



Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
From: Raymond Hettinger 
> On Jan 25, 2010, at 11:22 AM, Steve Howell
> wrote:
> I
> am interested in creating a patch to make deleting elements
> from the front of Python list work in O(1) time by advancing
> the ob_item pointer.
>
> +1 on doing whatever experiments you feel like doing.
> -1 on putting something like this in the core.
>
> 1) Too many things in the Python world rely on
> the current implementation of lists.  It's not
> worth breaking third-party extensions, tools like psyco,
> work on unladen swallow, and other implementations of Python
> such as PyPy and Jython.

I don't understand how changing the implementation of CPython would impact PyPy 
and Jython, unless you are just referring to the fact that CPython is treated 
as a reference implementation, so its simplicity is a virtue for other ports.  
Am I missing something else?


> 2). The use of lists pervades the language and
> it doesn't make sense to have the language as a whole
> pay a price (in terms of speed and space) for every list
> that gets created.  The corner case isn't worth
> it.

I understand the tradeoff.

> 3).  We already got one.  The
> collections.deque() class was introduced specifically to
> handle inserting and popping from the front of a list
> efficiently.

I understand that deque solves some problems that list does not.  Obviously, it 
allows you to delete elements off the front in O(1) time, but it also has other 
advantages, such as allowing you to rotate elements efficiently.  It's 
designed, I am guessing, for FIFO queues, and it's a perfectly good data 
structure, just not one that is well suited for all use cases.
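As an illustration of the rotation support mentioned above (a small sketch, not code from the thread):

```python
from collections import deque

d = deque(['a', 'b', 'c', 'd'])
d.rotate(1)        # rotate right: the last element moves to the front
assert list(d) == ['d', 'a', 'b', 'c']
d.rotate(-1)       # rotate left: undoes the previous rotation
assert list(d) == ['a', 'b', 'c', 'd']
```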

> 4).  In the comp.lang.python thread on this
> subject, you've gotten nearly zero support for your
> position and have managed to offend many of the
> developers.

Fair enough.  I think the idea is at least worthy of a PEP that puts forward 
the strongest case possible.

Terry Jan Reedy wrote:

'''
I am not opposed to a possible change, just hasty, ill-informed
criticism. If there is not a PEP on this issue, it would be good to have
one that recorded the proposal and the pros and cons, regardless of the
outcome, so there would be something to refer people to. If that had
been already done, it would have shortened this thread considerably.
'''

Do you agree that it is at least worthwhile to write a PEP here?  It could be 
fairly short and quickly rejected, and down the road, if more people get behind 
it, it could always be strengthened and revisited.

There seems to be at least some precedent for PEPs that only pertain to 
internal implementation details, such as this one:

http://www.python.org/dev/peps/pep-0267/

(Raymond, apologies for my double reply...the first one was meant to go to the 
list, not directly to you.)




Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Benjamin Peterson  wrote:

> 2010/1/25 Steve Howell :
> > I am interested in creating a patch to make deleting
> elements from the front
> > of Python list work in O(1) time by advancing the
> ob_item pointer.
>
> How about just using a deque?

Deque does not support all the operations that list does.  It is also roughly 
twice as slow for accessing elements (I've measured it).

>>> lst = ['foo', 'bar', 'baz']
>>> lst[1:]
['bar', 'baz']
>>> lst.insert(0, 'spam')
>>> lst
['spam', 'foo', 'bar', 'baz']


>>> from collections import deque
>>> d = deque(lst)
>>> d[2:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence index must be integer, not 'slice'
>>> d.insert(0, 'eggs')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'collections.deque' object has no attribute 'insert'
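A hedged sketch of how the "twice as slow" indexing claim might be reproduced; the container size and repeat count below are arbitrary choices, not Steve's original benchmark:

```python
import timeit
from collections import deque

# list indexing is O(1) pointer arithmetic; deque indexing has to walk
# the deque's internal blocks, so it degrades toward the middle.
setup = "from collections import deque; lst = list(range(100000)); d = deque(lst)"
t_list = timeit.timeit("lst[50000]", setup=setup, number=10000)
t_deque = timeit.timeit("d[50000]", setup=setup, number=10000)
print("list: %.6fs  deque: %.6fs" % (t_list, t_deque))
```

For small deques, or accesses near either end, the gap is much smaller.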




Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Raymond Hettinger

On Jan 25, 2010, at 12:36 PM, Steve Howell wrote:

> 
> Deque does not support all the operations that list does.  It is also roughly 
> twice as slow for accessing elements (I've measured it).


ISTM that apps that want to insert or pop from the front of a list are also apps 
that don't care about accessing arbitrary elements in the middle using the 
position index.  When lists are growing or shrinking from the front, the 
meaning of the i-th element changes.   So, it doesn't make sense for an 
application to track indices of objects in such a list.

   i = s.index('abc')
   s.pop(0)
   print s[i]    # i no longer points at 'abc'


Raymond


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Floris Bruynooghe
On Mon, Jan 25, 2010 at 10:14:35AM -0800, Collin Winter wrote:
> I'm working on a patch to completely remove all traces of C++ with
> configured with --without-llvm. It's a straightforward change, and
> should present no difficulties.

Great to hear that, thanks for caring.

> For reference, what are these "obscure platforms" where static
> initializers cause problems?

I've had serious trouble on AIX 5.3 TL 04 with a GCC toolchain
(apparently the IBM xlc toolchain is better for that instance).  The
problem seems to be that gcc stores the initialisation code in a
section (_GLOBAL__DI IIRC) which the system loader does not execute.
Although this involved dlopen() from a C main(), which U-S would
not need AFAIK; having a C++ main() might make the loader do the right
thing.  I must also note that on more recent versions (TL 07) this was
no problem at all.  But you don't always have the luxury of being able
to use recent OSes.


Regards
Floris

PS: For completeness' sake, this was trying to use the omniorbpy module
with Python 2.5.

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
> From: Raymond Hettinger 
>
> On Jan 25, 2010, at 12:36 PM, Steve Howell wrote:
>
> >
> > Deque does not support all the operations that list
> does.  It is also roughly twice as slow for accessing
> elements (I've measured it).
>
>
> ISTM that apps that want to insert or pop from the front of
> list are also apps that don't care about accessing arbitrary
> elements in the middle using the position index.  When
> lists are growing or shrinking from the front, the meaning
> of the i-th element changes.   So, it doesn't
> make sense for an application to track indices of objects in
> such a list.
>
>    i = s.index('abc')
>    s.pop(0)
>    print s[i]    # i no longer points at 'abc'
>

I am not going to directly address your point, but I'd like to give a few 
examples of code that uses pop(0) from the standard library.

If you look at the code for multiprocessing/connection.py, you will see that 
PipeListener creates _handle_queue as an ordinary Python list, and in line 317 
it uses pop(0) to pop the first handle off the top of the queue.

Why does that code not use a deque?  In hindsight, it probably should.  But to 
make the change now, it's not a simple matter of fixing just PipeListener, 
because PipeListener passes off _handle_queue to Finalize, which also expects a 
list.

In order to understand why Finalize expects a list, you need to look at how it 
uses args, and here is one example usage:

res = self._callback(*self._args, **self._kwargs)

Ok, so now you need to know what self._callback is doing, so now you have to 
trace through what all callers of Finalize are passing in for their args.

So what seems like a trivial matter--switching over a list to a deque--actually 
requires a lot of thinking.

It turns out that the callback for PipeListener just iterates through the 
remaining handles and closes them.  So a deque would not break that code. 

If you look at difflib.py, it also does pop(0) in a loop.  Why doesn't it use a 
deque?  Simplicity, maybe?

codecs.py also deletes from the top of the list:

del self.linebuffer[0]
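For comparison, the mechanical conversion Steve is weighing looks roughly like this (illustrative names, not the actual stdlib code):

```python
from collections import deque

# Draining a list with pop(0) shifts every remaining element each time,
# so the whole loop is O(n**2).
handles = ['h1', 'h2', 'h3']
drained = []
while handles:
    drained.append(handles.pop(0))

# The deque version does the same job in O(n) overall.
queue = deque(['h1', 'h2', 'h3'])
drained2 = []
while queue:
    drained2.append(queue.popleft())

assert drained == drained2 == ['h1', 'h2', 'h3']
```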


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Floris Bruynooghe
On Mon, Jan 25, 2010 at 11:48:56AM -0800, Jeffrey Yasskin wrote:
> On Mon, Jan 25, 2010 at 10:44 AM, Tres Seaver  wrote:
> > Collin Winter wrote:
> >
> >> For reference, what are these "obscure platforms" where static
> >> initializers cause problems?
> >
> > It's been a long while since I had to deal with it, but the "usual
> > suspects" back in the day were HP-UX, AIX, and Solaris with non-GCC
> > compilers, as well as Windows when different VC RT libraries got into
> > the mix.
> 
> So then the question is, will this cause any problems we care about?
> Do the problems still exist, or were they eliminated in the time
> between "back in the day" and now? In what circumstances do static
> initializers have problems? What problems do they have? Can the
> obscure platforms work around the problems by configuring with
> --without-llvm? If we eliminate static initializers in LLVM, are there
> any other problems?

When Collin's patch is finished everything will be lovely, since if
there's no C++ then there's no problem.  And since I was under the
impression that the JIT/LLVM can't emit machine code for the platforms
where these C++ problems would likely occur, nothing would be lost.  So
trying to change LLVM to avoid static initialisers would not seem
like a good use of someone's time.

> We really do need precise descriptions of the problems so we can avoid them.

Sometimes these "precise descriptions" are hard to come by.  :-)


Regards
Floris

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Martin v. Löwis
> We really do need precise descriptions of the problems so we can avoid them.

One family of problems is a platform's lack of initializer support in the
object file format; any system with traditional a.out (or b.out) is
vulnerable (also, COFF is, IIRC).

The solution e.g. g++ came up with is to have the collect2 linker
replacement combine all such initializers into a synthesized function
__main; this function then gets "magically" called by main(), provided
that main() itself gets compiled by a C++ compiler. Python used to have
a ccpython.cc entry point to support such systems.

This machinery is known to fail in the following ways:
a) main() is not compiled with g++: static objects do not get constructed
b) code that gets linked into shared libraries (assuming the system
   supports them) does not get its initializers invoked.
c) compilation of main() with a C++ compiler, but then linking with ld
   results in an unresolved symbol __main.

Not sure whether U-S has any global C++ objects that need construction
(but I would be surprised if it didn't).

Regards,
Martin


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Martin v. Löwis
Floris Bruynooghe wrote:
> Since I was under the
> impression that the JIT/LLVM can't emit machine code for the platforms
> where these C++ problems would likely occur nothing would be lost.

AFAICT, LLVM doesn't support Itanium or HPPA, and apparently not POWER,
either (although they do support PPC - not sure what that means for
POWER). So that rules out AIX and HP-UX as sources of problems (beyond
the problems that they cause already :-)

LLVM does support SPARC, so I'd be curious about reports that it worked
or didn't work on a certain Solaris release, with either SunPRO or the
gcc release that happened to be installed on the system (Solaris
installations sometimes feature fairly odd g++ versions).

Regards,
Martin


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Jeffrey Yasskin wrote:
> On Mon, Jan 25, 2010 at 10:44 AM, Tres Seaver  wrote:
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>> Collin Winter wrote:
>>
>>> For reference, what are these "obscure platforms" where static
>>> initializers cause problems?
>> It's been a long while since I had to deal with it, but the "usual
>> suspects" back in the day were HP-UX, AIX, and Solaris with non-GCC
>> compilers, as well as Windows when different VC RT libraries got into
>> the mix.
> 
> So then the question is, will this cause any problems we care about?
> Do the problems still exist, or were they eliminated in the time
> between "back in the day" and now? In what circumstances do static
> initializers have problems? What problems do they have? Can the
> obscure platforms work around the problems by configuring with
> --without-llvm? If we eliminate static initializers in LLVM, are there
> any other problems?
> 
> We really do need precise descriptions of the problems so we can avoid them.

Yup, sorry:  I was trying to kick in the little I could remember, but it
has been eight years since I wrote / compiled C++ in anger (a decade of
work in it before that).  MvL's reply sounds *exactly* like the memories
I have, though.


Tres.
- --
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkteGGcACgkQ+gerLs4ltQ5SHwCfcQOswX0StFS32U3fFE6RZ5rr
z0QAmgKUECEhdZPQhgsNACkRiWrWX0t0
=eXM+
-END PGP SIGNATURE-



Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Jeffrey Yasskin
On Mon, Jan 25, 2010 at 1:50 PM, "Martin v. Löwis"  wrote:
>> We really do need precise descriptions of the problems so we can avoid them.
>
> One family of problems is platform lack of initializer support in the
> object file format; any system with traditional a.out (or b.out) is
> vulnerable (also, COFF is, IIRC).
>
> The solution e.g. g++ came up with is to have the collect2 linker
> replacement combine all such initializers into a synthesized function
> __main; this function then gets "magically" called by main(), provided
> that main() itself gets compiled by a C++ compiler. Python used to have
> a ccpython.cc entry point to support such systems.
>
> This machinery is known to fail in the following ways:
> a) main() is not compiled with g++: static objects do not get constructed
> b) code that gets linked into shared libraries (assuming the system
>   supports them) does not get its initializers invoked.
> c) compilation of main() with a C++ compiler, but then linking with ld
>   results in an unresolved symbol __main.

Thank you for the details. I'm pretty confident that (a) and (c) will
not be a problem for the Unladen Swallow merge, because we switched
python.c (which holds main()) and the link step to use the C++ compiler
whenever LLVM is enabled:
http://code.google.com/p/unladen-swallow/source/browse/trunk/Makefile.pre.in.
Python already had some support for this through the LINKCC configure
variable, but it wasn't being used to compile main(), just to link.

(b) could be a problem if we depend on LLVM as a shared library on one
of these platforms (and, of course, if LLVM's JIT supports these
systems at all). The obvious answers are: 1) --without-llvm on these
systems, 2) link statically on these systems, 3) eliminate the static
constructors. There may also be less obvious answers.

> Not sure whether U-S has any global C++ objects that need construction
> (but I would be surprised if it didn't).

I checked with them, and LLVM would welcome patches to remove these if
we need to. They already have a llvm::ManagedStatic class
(http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/ManagedStatic.h?view=markup)
with no constructor that we can replace most of them with.

Jeffrey


[Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread Victor Stinner
Hi,

I have been running my fuzzer (Fusil) regularly on CPython since summer 2008: I 
tested Python 2.5, 2.6, 2.7, 3.0, 3.1 and 3.2. I'm only looking for "fatal 
errors": the Python process being killed by a signal, or sometimes fuzzer 
timeouts. I ignore most timeout results because most of them are valid function 
calls reading from/writing to a file or socket. My goal is to improve Python 
security: to protect it against malicious data injection and denial of service. 
I prefer fuzzing to static code analysis because it finds few false positives 
and it directly generates a script reproducing the crash. Fuzzing is just one 
tool helping to improve overall security.


Bugs found in CPython by Fusil
==

Modules
---

Fatal errors were only found in modules written in C. Modules: __builtin__ 
(5), json (4), io (3), bsddb (3), sqlite3 (3), audioop (2), locale (2), 
cProfile (2), Tkinter (2), dl, struct, binascii, testcapi, cPickle, 
multibytecodec, ctypes, hotshot, bz2, thread, bisect, weakref, imageop, 
multiprocessing.

  __builtin__: Exception, str, unicode, bytearray and long
  io: BytesIO, StringIO and FileIO

It looks like json, bsddb and sqlite3 are young and not tested enough. audioop 
and imageop bugs are the most critical because they lead to writing to 
uninitialized memory (which might allow arbitrary code execution).

This module list also gives a first idea of which modules should be 
blacklisted in a sandbox ;-)

Cause
-

The most common causes are insufficient input validation and invalid/missing 
error handling.

"Insufficient input validation" means that the function is vulnerable to 
malicious data injection. "Invalid error handling" means that the function 
causes a new error while trying to clean up data (e.g. releasing the memory of 
an uninitialized variable). "Missing error handling" means that a function 
result is an error but the caller doesn't check it.

I don't have a generic solution to detect these problems, except for "missing 
error handling": gcc has an extension to the C language to indicate that the 
result has to be used, __attribute__((warn_unused_result)). The GNU libc uses 
it to avoid common bugs.


Consequence
---

The most common consequence is reading from/writing to uninitialized memory 
(especially reading from a NULL pointer), which sometimes leads to a 
segmentation fault (heisenbugs!). The second most common consequence is an 
unexpected exception during garbage collection: it displays a Fatal Python 
error and quits Python.

I would suggest logging unexpected exceptions during garbage collection without 
stopping the whole Python process, as is done for exceptions in a destructor.

Details
---

Full list of all bugs found by Fusil with links to the bugtracker and to the 
commits:

   http://bitbucket.org/haypo/fusil/wiki/Python


Interaction with the Python developers
==

I open an issue for each bug found in CPython. I describe how to reproduce it 
and try to write a patch. I have learned to always write a unit test, which is 
useful to reproduce the bug and makes Python committers happy :-)

The reaction depends on the impacted component, the severity of the bug, the 
complexity of the code reproducing the bug, and the quality of my bug report 
:-) The answer was always quick for core components. But some modules are 
maintained by everyone, which means nobody, like imageop, audioop or 
cProfile/hotshot. Having a module maintainer, like Guilherme Polo aka gpolo for 
Tkinter, really does help!

It looks like fuzzing bugs are not always appreciated by developers, maybe 
because they are always "borderline" cases (not "realistic").

Sometimes, even if I write a patch and a unit test, and explain the problem and 
the solution, I don't get any comment. It doesn't motivate me to continue 
fuzzing :-/


Play with Fusil at home
===

If you would like to fuzz Python with Fusil: download the latest version of 
Fusil and run PYTHON fusil-python as root, where PYTHON is your Python 
interpreter. Use --success=50 to wait for 50 crashes before stopping, --fast to 
speed up the fuzzing (but slow down your computer), and --only-c to test only 
Python modules written in C.

   http://bitbucket.org/haypo/fusil/wiki/Home

Fusil runs as the user fusil and group fusil to avoid removing arbitrary files 
or killing arbitrary processes; that's why you need to run it as root.

If Fusil finds a crash, you can analyze it while Fusil is running. Go into 
python// and read the stdout and session.log files. Use the "sudo 
./replay.py --gdb" command to "replay" the crash in gdb (the --valgrind option 
can also be useful).

I'm only working on Linux, but Fusil works on any UNIX/BSD OS. Don't use Fusil 
on Windows! It might work on Windows but without any protection for your files 
and processes!


I hope that my fuzzing tests have helped the Python project, and maybe someone 
else will help me to continue these tests ;-)

Victor Stinner 

Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Martin v. Löwis
>> a) main() is not compiled with g++: static objects do not get constructed
>> b) code that gets linked into shared libraries (assuming the system
>>   supports them) does not get its initializers invoked.
>> c) compilation of main() with a C++ compiler, but then linking with ld
>>   results in an unresolved symbol __main.
> 
> Thank you for the details. I'm pretty confident that (a) and (c) will
> not be a problem for the Unladen Swallow merge because we switched
> python.c (which holds main()) and linking to use the C++ compiler when
> LLVM's enabled at all:
> http://code.google.com/p/unladen-swallow/source/browse/trunk/Makefile.pre.in.
> Python already had some support for this through the LINKCC configure
> variable, but it wasn't being used to compile main(), just to link.

Ok - but then this *is* a problem for people embedding Python on such
systems. They have to adjust their build process as well.
(they probably will need to make adjustments for C++ even on ELF
systems, as you still need to run C++ as the linker when linking
libpythonxy.a, to get the C++ runtime linked)

Regards,
Martin


Re: [Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Victor Stinner wrote:
> [full fuzzing summary snipped]

Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Benjamin Peterson
2010/1/25 Steve Howell :
> [earlier discussion snipped]
>
> If you look at difflib.py, it also does pop(0) in a loop.  Why doesn't it use 
> a deque?  Simplicity, maybe?
>
> codecs.py also deletes from the top of the list:
>
> del self.linebuffer[0]

Yes, but in either of these cases is there an excellent performance
improvement to be gained and is it worth the complexity of your
optimization? I think not.


-- 
Regards,
Benjamin
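[As a point of reference for the trade-off being discussed, the standard-library alternative behaves like this; a minimal sketch using only collections.deque:]

```python
from collections import deque

# FIFO queue using deque: popleft() runs in O(1),
# unlike list.pop(0), which shifts every remaining element.
queue = deque(["h1", "h2", "h3"])
first = queue.popleft()   # dequeue from the front
queue.append("h4")        # enqueue at the other end

assert first == "h1"
assert list(queue) == ["h2", "h3", "h4"]
```

The trade-off Steve mentions is real, though: deque gives up O(1) indexed access in the middle, which is the case Raymond argues FIFO-style code rarely needs.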
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Mike Klaas
On Mon, Jan 25, 2010 at 11:32 AM, Daniel Stutzbach
 wrote:
> On Mon, Jan 25, 2010 at 1:22 PM, Steve Howell  wrote:
>>
>> I haven't completely worked out the best strategy to eventually release
>> the memory taken up by the pointers of the unreleased elements, but the
>> worst case scenario is that the unused memory only gets wasted until the
>> time that the list itself gets garbage collected.
>
> FWIW, for a long-running FIFO queue, it's critical to release some of the
> memory along the way, otherwise the amount of wasted memory is unbounded.
>
> Good luck :)

It seems to me that the best way to do this is to invert the .append() logic:
leave at most X amount of wasted space at the beginning of the list,
where X is a constant fraction of the list size.

Whether it is worth adding an extra pointer to the data stored by a
list is another story.

-Mike


Re: [Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread Jesse Noller
On Mon, Jan 25, 2010 at 5:34 PM, Victor Stinner
 wrote:
> Hi,
>
> I have been running my fuzzer (Fusil) regularly on CPython since summer 2008:
> I tested Python 2.5, 2.6, 2.7, 3.0, 3.1 and 3.2. I'm only looking for "fatal
> errors": the Python process being killed by a signal, or sometimes fuzzer
> timeouts. I ignore most timeout results because most of them are valid
> function calls reading from/writing to a file or socket. My goal is to improve
> Python security: protect it against malicious data injection and denial of
> service. I prefer fuzzing to static code analysis because it finds few false
> positives and it directly generates a script reproducing the crash. Fuzzing is
> just one tool helping to improve overall security.
>
>
> Bugs found in CPython by Fusil
> ==
>
> Modules
> ---
>
> Fatal errors were only found in modules written in C. Modules: __builtin__
> (5), json (4), io (3), bsddb (3), sqlite3 (3), audioop (2), locale (2),
> cProfile (2), Tkinter (2), dl, struct, binascii, testcapi, cPickle,
> multibytecodec, ctypes, hotshot, bz2, thread, bisect, weakref, imageop,
> multiprocessing.
>
>  __builtin__: Exception, str, unicode, bytearray and long
>  io: BytesIO, StringIO and FileIO
>
> It looks like json, bsddb and sqlite3 are young and not tested enough. The
> audioop and imageop bugs are the most critical because they lead to writing to
> uninitialized memory (which might allow arbitrary code execution).
>
> This module list also gives a first idea of which modules should be
> blacklisted in a sandbox ;-)
>
> Cause
> -
>
> The most common causes are insufficient input validation and invalid/missing
> error handling.
>
> "Insufficient input validation" means that the function is vulnerable to
> malicious data injection. "Invalid error handling" means that the function
> causes a new error while trying to clean up data (e.g. releasing the memory of
> an uninitialized variable). "Missing error handling" means that a function
> returns an error but the caller doesn't check the result.
>
> I don't have a generic solution to detect these problems, except for "missing
> error handling": gcc has an extension to the C language to indicate that the
> result has to be used, __attribute__((warn_unused_result)). The GNU libc uses
> it to avoid common bugs.
>
>
> Consequence
> ---
>
> The most common consequence is reading from/writing to uninitialized memory
> (especially reading from a NULL pointer), which sometimes leads to a
> segmentation fault (heisenbugs!). The second most common consequence is an
> unexpected exception during garbage collection: it displays a Fatal Python
> error and quits Python.
>
> I would suggest logging unexpected exceptions during garbage collection
> without stopping the whole Python process, as is done for exceptions in a
> destructor.
>
> Details
> ---
>
> Full list of all bugs found by Fusil with links to the bugtracker and to the
> commits:
>
>   http://bitbucket.org/haypo/fusil/wiki/Python
>
>
> Interaction with the Python developers
> ==
>
> I open an issue for each bug found in CPython. I describe how to reproduce it
> and try to write a patch. I have learned to always write a unit test, useful
> for reproducing the bug, and it makes Python committers happy :-)
>
> The reaction depends on the impacted component, the severity of the bug, the
> complexity of the code reproducing the bug, and the quality of my bug report
> :-) The answer was always quick for core components. But some modules are
> maintained by everyone, which means nobody, like imageop, audioop or
> cProfile/hotshot. Having a module maintainer, like Guilherme Polo aka gpolo
> for Tkinter, really helps!
>
> It looks like fuzzing bugs are not always appreciated by developers, maybe
> because they are always "borderline" cases (not "realistic").
>
> Sometimes, even if I write a patch and a unit test, and explain the problem
> and the solution, I don't get any comment. That doesn't motivate me to
> continue fuzzing :-/
>
>
> Play with Fusil at home
> ===
>
> If you would like to fuzz Python with Fusil: download the latest version of
> Fusil and run PYTHON fusil-python as root, where PYTHON is your Python
> interpreter. Use --success=50 to wait for 50 crashes before stopping, --fast
> to speed up the fuzzing (but slow down your computer), and --only-c to test
> only Python modules written in C.
>
>   http://bitbucket.org/haypo/fusil/wiki/Home
>
> Fusil runs as the user fusil and group fusil to avoid removing arbitrary
> files or killing arbitrary processes; that's why you need to run it as root.
>
> If Fusil finds a crash, you can analyze it while Fusil is running. Go into
> python// and read the stdout and session.log files. Use the "sudo
> ./replay.py --gdb" command to "replay" the crash in gdb (the --valgrind
> option can also be useful).
>
> I'm only working on Linux, but Fusil works on any UNIX/BSD OS. Don't use Fusil
> on Windows! It might work 

Re: [Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread Christian Heimes
Victor Stinner wrote:
> I have been running my fuzzer (Fusil) regularly on CPython since summer 2008:
> I tested Python 2.5, 2.6, 2.7, 3.0, 3.1 and 3.2. I'm only looking for "fatal
> errors": the Python process being killed by a signal, or sometimes fuzzer
> timeouts. I ignore most timeout results because most of them are valid
> function calls reading from/writing to a file or socket. My goal is to improve
> Python security: protect it against malicious data injection and denial of
> service. I prefer fuzzing to static code analysis because it finds few false
> positives and it directly generates a script reproducing the crash. Fuzzing is
> just one tool helping to improve overall security.

[CC to Stefan Behnel from the Cython project]

Thank you very much for all the work Victor!

Out of curiosity, can Fusil be used to check 3rd party extensions as
well? I'd like to validate some extensions and library bindings I wrote
or that I'm using heavily at work. I'm especially interested in Cython
support: annotating the erroneous line of Cython code, and getting the
shared library that causes the error so I can distinguish between my
errors and problems in the wrapped libraries.

Christian


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Mike Klaas  wrote:

> From: Mike Klaas 
> > On Mon, Jan 25, 2010 at 1:22 PM, Steve Howell 
> wrote:
> >>
> >> I haven't completely worked out the best strategy
> to eventually release
> >> the memory taken up by the pointers of the
> unreleased elements, but the
> >> worst case scenario is that the unused memory only
> gets wasted until the
> >> time that the list itself gets garbage collected.
> >
> > FWIW, for a long-running FIFO queue, it's critical to
> release some of the
> > memory along the way, otherwise the amount of wasted
> memory is unbounded.
> >
> > Good luck :)
> 
> It seems to me that the best way to do this is invert
> .append() logic:
> leave at most X amount of wasted space at the beginning of
> the list,
> where X is a constant fraction of the list size.

That's roughly what I intend to do.  The problems are not entirely symmetric.  
For appends, the wasted space exists in the sense that CPython optimistically 
gets extra chunks of memory for future appends, to save CPU at the possible 
cost of needlessly allocating memory.

For pops, the bargain would be that you optimistically defer releasing memory 
to save CPU cycles in the hope that memory is not scarce.  Of course, if you 
have just popped an element off the list, then you have just made memory less 
scarce by virtue of removing the list elements themselves.

> Whether it is worth adding a extra pointer to the data
> stored by a
> list is another story.

That is definitely a point of contention.  It would certainly bloat a program 
that had millions of empty lists.  I think in most real-world programs, though, 
the amount of memory taken by PyListObjects will always be greatly exceeded by 
the amount of memory used by the list elements, or even just the pointers to 
the list elements.  It's the difference between the number of elements in a 
list, O(N), and the number of structures that define a list's state, O(1).



Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Christian Heimes
Benjamin Peterson wrote:
> Yes, but in either of these cases is there an excellent performance
> improvement to be gained and is it worth the complexity of your
> optimization? I think not.

Me, too.
I already tried to explain to Steve that I have used list.pop(0) in very
few cases during my seven years as a professional Python developer.
Since I knew that popping from the beginning of a list is slower than
popping from the end (or just leaving the list unmodified), I found ways
to alter my algorithms. The few cases left were either not performance
critical or used deque instead.

I vote -0.5 on the change unless Guido, Tim or Raymond think that the
size and complication impact is worth the hassle.

Christian


Re: [Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread Victor Stinner
Hi,

Le mardi 26 janvier 2010 00:40:47, Christian Heimes a écrit :
> Victor Stinner wrote:
> > I have been running my fuzzer (Fusil) regularly on CPython since summer
> > 2008: I tested Python 2.5, 2.6, 2.7, 3.0, 3.1 and 3.2. I'm only looking
> > for "fatal errors": the Python process being killed by a signal, or
> > sometimes fuzzer timeouts. I ignore most timeout results because most of
> > them are valid function calls reading from/writing to a file or socket.
> > My goal is to improve Python security: protect it against malicious data
> > injection and denial of service. I prefer fuzzing to static code analysis
> > because it finds few false positives and it directly generates a script
> > reproducing the crash. Fuzzing is just one tool helping to improve overall
> > security.
> 
> Thank you very much for all the work Victor!

You're welcome :)

> Out of curiosity, can Fusil be used to check 3rd party extensions as
> well? I'd like to validate some extensions and library bindings I wrote
> or that I'm using heavily at work.

Yes, fusil-python can fuzz any Python module.

Use "fusil-python --modules=yourmodule". See also the --blacklist option.

-- 
Victor Stinner
http://www.haypocalc.com/


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Michael Foord

On 26/01/2010 00:12, Christian Heimes wrote:
> Benjamin Peterson wrote:
>> Yes, but in either of these cases is there an excellent performance
>> improvement to be gained and is it worth the complexity of your
>> optimization? I think not.
>
> Me, too.
> I already tried to explain to Steve that I have used list.pop(0) in very
> few cases during my seven years as a professional Python developer.
> Since I knew that popping from the beginning of a list is slower than
> popping from the end (or just leaving the list unmodified), I found ways
> to alter my algorithms. The few cases left were either not performance
> critical or used deque instead.
>
> I vote -0.5 on the change unless Guido, Tim or Raymond think that the
> size and complication impact is worth the hassle.


How great is the complication? Making list.pop(0) efficient sounds like 
a worthy goal, particularly given that the reason you don't use it is 
because you *know* it is inefficient (so the fact that you don't use it 
isn't evidence that it isn't wanted - merely evidence that you had to 
work around the known inefficiency).


Michael





--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of 
your employer, to release me from all obligations and waivers arising from any 
and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, 
clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and 
acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your 
employer, its partners, licensors, agents and assigns, in perpetuity, without 
prejudice to my ongoing rights and privileges. You further represent that you 
have the authority to release me from any BOGUS AGREEMENTS on behalf of your 
employer.




Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Benjamin Peterson  wrote:

> From: Benjamin Peterson 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" 
> Cc: [email protected]
> Date: Monday, January 25, 2010, 3:15 PM
> 2010/1/25 Steve Howell :
> >> From: Raymond Hettinger 
> >>
> >> On Jan 25, 2010, at 12:36 PM, Steve Howell wrote:
> >>
> >> >
> >> > Deque does not support all the operations
> that list
> >> does.  It is also roughly twice as slow for
> accessing
> >> elements (I've measured it).
> >>
> >>
> >> ISTM that apps that want to insert or pop from the
> front of
> >> list are also apps that don't care about accessing
> arbitrary
> >> elements in the middle using the position index.
>  When
> >> lists are growing or shrinking from the front, the
> meaning
> >> of the i-th element changes.   So, it doesn't
> >> make sense for an application to track indices of
> objects in
> >> such a list.
> >>
> >>    i = s.find('abc')
> >>    s.pop(0)
> >>    print s[i]    # i no longer
> >> points at 'abc'
> >>
> >
> > I am not going to directly address your point, but I'd
> like to give some examples of code from the standard
> library that use pop(0).
> >
> > If you look at the code for
> multiprocessing/connection.py, you will see that
> PipeListener creates _handle_queue as an ordinary Python
> list, and in line 317 it uses pop(0) to pop the first handle
> off the top of the queue.
> >
> > Why does that code not use a deque?  In hindsight, it
> probably should.  But to make the change now, it's not a
> simple matter of fixing just PipeListener, because
> PipeListener passes off _handle_queue to Finalize, which
> also expects a list.
> >
> > In order to understand why Finalize expects a list,
> you need to look at how it uses args, and here is one
> example usage:
> >
> > res = self._callback(*self._args, **self._kwargs)
> >
> > Ok, so now you need to know what self._callback is
> doing, so now you have to trace through what all callers of
> Finalize are passing in for their args.
> >
> > So what seems like a trivial matter--switching over a
> list to a deque--actually requires a lot of thinking.
> >
> > It turns out that the callback for PipeListener just
> iterates through the remaining handles and closes them.  So
> a deque would not break that code.
> >
> > If you look at difflib.py, it also does pop(0) in a
> loop.  Why doesn't it use a deque?  Simplicity, maybe?
> >
> > codecs.py also deletes from the top of the list:
> >
> > del self.linebuffer[0]
> 
> Yes, but in either of these cases is there an excellent
> performance
> improvement to be gained and is it worth the complexity of
> your
> optimization? I think not.
> 

I am starting to think that the optimization would be drowned out by the cost 
of processing each line, unless you had some combination of the following:
 
 1) a pretty large list (plausible)
 2) a very inexpensive operation that you were applying to each line (rare)
 3) a really slow memmove implementation (extremely doubtful)

The complexity of the optimization does not faze me for some reason.  If 
ruthless simplicity were the only goal, then I'd also simplify/remove some of 
this code:

/* Bypass realloc() when a previous overallocation is large enough
   to accommodate the newsize.  If the newsize falls lower than half
   the allocated size, then proceed with the realloc() to shrink the list.
*/
if (allocated >= newsize && newsize >= (allocated >> 1)) {
assert(self->ob_item != NULL || newsize == 0);
Py_SIZE(self) = newsize;
return 0;
}

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

In my mind, though, the complexity within CPython does not have to leak up to 
the Python level.
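[Aside: the growth pattern in the quoted comment can be reproduced from the formula. The sketch below also assumes the `new_allocated += newsize` step that follows the quoted lines in listobject.c:]

```python
def new_allocated(newsize):
    # Python translation of the quoted C formula, plus the
    # `new_allocated += newsize` step that follows it in listobject.c.
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)

# Simulate repeated appends and record each reallocation point.
allocated, pattern = 0, [0]
for size in range(1, 89):
    if size > allocated:
        allocated = new_allocated(size)
        pattern.append(allocated)

print(pattern)  # [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
```

This matches the "0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ..." pattern documented in the comment, which is what gives appends their amortized linear-time behavior.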





Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Christian Heimes
Michael Foord wrote:
> How great is the complication? Making list.pop(0) efficient sounds like 
> a worthy goal, particularly given that the reason you don't use it is 
> because you *know* it is inefficient (so the fact that you don't use it 
> isn't evidence that it isn't wanted - merely evidence that you had to 
> work around the known inefficiency).

The implementation must be changed in at least four places:

* The PyListObject struct gets an additional pointer that stores a
reference to the head. I would keep the head (element 0) of the list in
**ob_item and the reference to the malloc()ed array in a new pointer
*ob_allocated.

* PyList_New() stores the pointer to the allocated memory in
op->ob_allocated and sets op->ob_item = op->ob_allocated

* listpop() moves the op->ob_item pointer by one  for the special case
of pop(0)

* list_resize() should occasionally compact the free space before the
head with memcpy() if it gets too large.

listinsert() could be optimized for 0 if the list has some free space in
front of the header, too.

I favor this approach over an integer offset because it doesn't change
the semantics of ob_item.

Christian
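[As a rough illustration only (not the proposed C code), the scheme Christian describes can be modeled in pure Python; the class and its names below are hypothetical:]

```python
class OffsetList:
    """Toy model of a list whose storage keeps free space in front.

    _offset plays the role of the gap between ob_allocated and
    ob_item in the proposal above.
    """
    def __init__(self, items=()):
        self._storage = list(items)  # models the malloc()ed array
        self._offset = 0             # models ob_item - ob_allocated

    def pop0(self):
        # O(1): advance the logical head instead of shifting elements.
        item = self._storage[self._offset]
        self._offset += 1
        # Occasionally compact, mirroring the list_resize() step:
        # never waste more than half of the storage on the dead gap.
        if self._offset * 2 > len(self._storage):
            del self._storage[:self._offset]
            self._offset = 0
        return item

    def __len__(self):
        return len(self._storage) - self._offset

    def __getitem__(self, i):
        return self._storage[self._offset + i]

q = OffsetList([1, 2, 3, 4])
assert q.pop0() == 1 and q.pop0() == 2
assert list(q) == [3, 4]
```

The compaction bound keeps the wasted space a constant fraction of the list size, which is the invariant Mike Klaas suggested earlier in the thread.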


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Michael Foord

On 26/01/2010 00:28, Christian Heimes wrote:
> Michael Foord wrote:
>> How great is the complication? Making list.pop(0) efficient sounds like
>> a worthy goal, particularly given that the reason you don't use it is
>> because you *know* it is inefficient (so the fact that you don't use it
>> isn't evidence that it isn't wanted - merely evidence that you had to
>> work around the known inefficiency).
>
> The implementation must be changed in at least four places:
>
> * The PyListObject struct gets an additional pointer that stores a
> reference to the head. I would keep the head (element 0) of the list in
> **ob_item and the reference to the malloc()ed array in a new pointer
> *ob_allocated.
>
> * PyList_New() stores the pointer to the allocated memory in
> op->ob_allocated and sets op->ob_item = op->ob_allocated
>
> * listpop() moves the op->ob_item pointer by one for the special case
> of pop(0)
>
> * list_resize() should occasionally compact the free space before the
> head with memcpy() if it gets too large.
>
> listinsert() could be optimized for 0 if the list has some free space in
> front of the header, too.
>
> I favor this approach over an integer offset because it doesn't change
> the semantics of ob_item.
>
> Christian

Well, on the face of it this doesn't sound like a huge increase in 
complexity. Not that I'm qualified to judge.


Michael

--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog





Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Alex Gaynor
On Mon, Jan 25, 2010 at 7:32 PM, Michael Foord
 wrote:
> On 26/01/2010 00:28, Christian Heimes wrote:
>>
>> Michael Foord wrote:
>>
>>>
>>> How great is the complication? Making list.pop(0) efficient sounds like
>>> a worthy goal, particularly given that the reason you don't use it is
>>> because you *know* it is inefficient (so the fact that you don't use it
>>> isn't evidence that it isn't wanted - merely evidence that you had to
>>> work around the known inefficiency).
>>>
>>
>> The implementation must be changed in at least four places:
>>
>> * The PyListObject struct gets an additional pointer that stores a
>> reference to the head. I would keep the head (element 0) of the list in
>> **ob_item and the reference to the malloc()ed array in a new pointer
>> *ob_allocated.
>>
>> * PyList_New() stores the pointer to the allocated memory in
>> op->ob_allocated and sets op->ob_item = op->ob_allocated
>>
>> * listpop() moves the op->ob_item pointer by one  for the special case
>> of pop(0)
>>
>> * list_resize() should occasionally compact the free space before the
>> head with memcpy() if it gets too large.
>>
>> listinsert() could be optimized for 0 if the list has some free space in
>> front of the header, too.
>>
>> I favor this approach over an integer offset because it doesn't change
>> the semantics of ob_item.
>>
>> Christian
>>
>
> Well, on the face of it this doesn't sound like a huge increase in
> complexity. Not that I'm qualified to judge.
>
> Michael

Does anyone know if any other language's automatic array (or whatever
they call it) special-cases pop(0) like this?

Alex

-- 
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Christian Heimes  wrote:
> From: Christian Heimes 
> Michael Foord wrote:
> > How great is the complication? Making list.pop(0)
> efficient sounds like 
> > a worthy goal, particularly given that the reason you
> don't use it is 
> > because you *know* it is inefficient (so the fact that
> you don't use it 
> > isn't evidence that it isn't wanted - merely evidence
> that you had to 
> > work around the known inefficiency).
> 
> The implementation must be changed in at least four
> places:
> 
> * The PyListObject struct gets an additional pointer that
> stores a
> reference to the head. I would keep the head (element 0) of
> the list in
> **ob_item and the reference to the malloc()ed array in a
> new pointer
> *ob_allocated.
> 
> * PyList_New() stores the pointer to the allocated memory
> in
> op->ob_allocated and sets op->ob_item =
> op->ob_allocated
> 
> * listpop() moves the op->ob_item pointer by one 
> for the special case
> of pop(0)
> 
> * list_resize() should occasionally compact the free space
> before the
> head with memcpy() if it gets too large.
> 
> listinsert() could be optimized for 0 if the list has some
> free space in
> front of the header, too.
> 
> I favor this approach over an integer offset because
> doesn't change the
> semantic of ob_item.
> 

The approach that Christian outlines is exactly what I intend to accomplish, 
even if the patch does get permanently or temporarily rejected.

I am pretty confident about what needs to be done within list_ass_slice, 
including the listinsert() optimization.  I also see where I need to add the 
new pointer (ob_allocated seems like a good name to me) within the PyListObject 
struct.

Still wrestling with the other details, though.  My C is pretty rusty, and of 
course I have the extreme versatility of Python to blame for that! :)






Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Collin Winter
Hey Floris,

On Mon, Jan 25, 2010 at 1:25 PM, Floris Bruynooghe
 wrote:
> On Mon, Jan 25, 2010 at 10:14:35AM -0800, Collin Winter wrote:
>> I'm working on a patch to completely remove all traces of C++ with
>> configured with --without-llvm. It's a straightforward change, and
>> should present no difficulties.
>
> Great to hear that, thanks for caring.

This has now been resolved. As of
http://code.google.com/p/unladen-swallow/source/detail?r=1036,
./configure --without-llvm has no dependency on libstdc++:

Before: $ otool -L ./python.exe
./python.exe:
/usr/lib/libSystem.B.dylib
/usr/lib/libstdc++.6.dylib
/usr/lib/libgcc_s.1.dylib


After: $ otool -L ./python.exe
./python.exe:
/usr/lib/libSystem.B.dylib
/usr/lib/libgcc_s.1.dylib

I've explicitly noted this in the PEP (see
http://codereview.appspot.com/186247/diff2/2001:4001/5001).

Thanks,
Collin Winter


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Cameron Simpson
On 25Jan2010 12:34, Steve Howell  wrote:
| From: Raymond Hettinger 
| > 1) Too many things in the Python world rely on
| > the current implementation of lists.  It's not
| > worth breaking third-party extensions, tools like psyco,
| > work on unladen swallow, and other implementations of Python
| > such as PyPy and Jython.
| 
| I don't understand how changing the implementation of CPython would
| impact PyPy and Jython, unless you are just referring to the fact that
| CPython is treated as a reference implementation, so its simplicity is
| a virtue for other ports.  Am I missing something else?

I can think of something: lists traditionally have O(n) pop(0) performance
because they're normally quite simple layers on top of an array.

(And likewise for pop(1), which your approach won't help; I know you
could measure the list and decide pop(1) is a small copy of the left of
the list along with a move of the base offset, but that degrades as you
move from 0 and 1 to larger indices).

Supposing pop(0) becomes cheap in CPython and this becomes well known
(or worse, documented :-), someone depending on this now has code that is
fundamentally inefficient on other Pythons.

I know this is a slightly thin objection, since your change can probably
be taken to the other implementations.
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

But pessimism IS realism!   - D.L.Bahr


Re: [Python-Dev] Rich Comparison recipe wrong?

2010-01-25 Thread Nick Coghlan
Lennart Regebro wrote:
> On Mon, Jan 25, 2010 at 15:30, Nick Coghlan  wrote:
>> Ah, you mean the case where both classes implement the recipe, but know
>> nothing about each other and hence both return NotImplemented from their
>> root comparison?
> 
> Well, only one needs to return NotImplemented, actually.

I'd like to see a test case that proves that. With two different types
and only one of them returning NotImplemented, the recursion should
terminate inside the one with the root comparison that can handle both
types.

Cheers,
Nick.


-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread Nick Coghlan
Ron Adam wrote:
> I'd like to see Python 3+ be more suitable for full distributable
> applications over 2.X and earlier.

Out of curiosity, have you tried experimenting with the zipfile
execution capabilities in 2.6/3.1? A major part of that was to make
multi-file Python applications nearly as easy to execute as single-file
scripts.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


[Python-Dev] Possible changes to handling of default encoding for text files (was Re: Proposed downstream change to site.py in Fedora (sys.defaultencoding))

2010-01-25 Thread Nick Coghlan
Tres Seaver wrote:
>> Perhaps we could also add a warning to the open() API which warns
>> in case a file is opened in text mode without specifying an
>> encoding ?!
> 
> That sounds like a good plan to me, given that backward-compatibility
> requires keeping the guessing enabled by default.

Perhaps a switch along the lines of -t and -tt (warnings/errors for
mixing tabs and spaces)?

So by default Python continues to guess based on the locale encoding
(perhaps with a PYTHONTEXTENCODING to override the locale specifically
for Python).

Then -T might warn whenever a text file encoding is guessed rather than
specified and -TT might raise an error.
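
As a rough sketch of what such a warning could look like (the wrapper
name and message are invented here, not a proposed patch):

```python
import builtins
import locale
import warnings

_real_open = builtins.open

def checked_open(file, mode="r", *args, encoding=None, **kwargs):
    # Warn whenever a file is opened in text mode without an explicit
    # encoding, i.e. whenever the locale-based guess would kick in.
    if "b" not in mode and encoding is None:
        warnings.warn(
            "text file %r opened without an explicit encoding; guessing %r "
            "from the locale" % (file, locale.getpreferredencoding(False)),
            stacklevel=2,
        )
    return _real_open(file, mode, *args, encoding=encoding, **kwargs)
```

Under a -TT-style switch, the same check would raise instead of warn.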

Regards,
Nick.



-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Meador Inge
> We really do need precise descriptions of the problems so we can avoid
them.

Initialization of objects with static storage duration typically gets a bad
rap for two main reasons: (1) each toolchain implements them differently
(but typically by storing initialization thunks in a table that is walked by
the C++RT before entry to 'main'), which may lead to subtle differences when
compiling for different platforms and (2) there is no guaranteed
initialization order across translation unit boundaries.

(1) is just a fact of multi-platform development.  (2) is a bit more
interesting.  Consider two translation units 'a.cpp' and 'b.cpp':

   // a.cpp
   struct T { T() {} };
   ...
   static T obj1;

   // b.cpp
   struct S { S() {} };
   ...
   static S obj2;

When 'obj1' and 'obj2' get linked into the final image there are no
guarantees on whose constructor (T::T or S::S) will be called first.
Sometimes folks write code where this initialization order matters.  It may
cause strange behavior at run-time that is hard to pin down.  This may not
be a problem in the LLVM code base, but it is the typical problem that C++
devs run into with initialization of objects with static storage duration.

Also related to reduced code size with C++ I was wondering whether or not
anyone has explored using the ability of some toolchains to remove unused
code and data?  In GCC this can be enabled by compiling with
'-ffunction-sections' and '-fdata-sections' and linking with
'--gc-sections'.  In MS VC++ you can compile with '/Gy' and link with
'/OPT'.  This feature can lead to size reductions sometimes with C++ due to
things like template instantiation causing multiple copies of the same
function to be linked in.  I played around with compiling CPython with this
(gcc + Darwin) and saw about a 200K size drop.  I want to try compiling all
of U-S (e.g. including LLVM) with these options next.

Thanks,

-- Meador


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Nick Coghlan
Jeffrey Yasskin wrote:
> (b) could be a problem if we depend on LLVM as a shared library on one
> of these platforms (and, of course, if LLVM's JIT supports these
> systems at all). The obvious answers are: 1) --without-llvm on these
> systems, 2) link statically on these systems, 3) eliminate the static
> constructors. There may also be less obvious answers.

Could the necessary initialisation be delayed until the Py_Initialize()
call? (although I guess that is just a particular implementation
strategy for option 3).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Nick Coghlan
Meador Inge wrote:
> When 'obj1' and 'obj2' get linked into the final image there are no
> guarantees on whose constructor (T::T or S::S) will be called first. 
> Sometimes folks write code where this initialization order matters.  It
> may cause strange behavior at run-time that is hard to pin down.  This
> may not be a problem in the LLVM code base, but it is the typical
> problem that C++ devs run into with initialization of objects with
> static storage duration.

Avoiding this problem is actually one of the original reasons for the
popularity of the singleton design pattern in C++. With instantiation on
first use, it helps ensure the constructors are all executed in the
right order. (There are other problems with the double-checked locking
required under certain aggressive compiler optimisation strategies, but
the static initialisation ordering problem occurs even when optimisation
is completely disabled).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Nick Coghlan
Michael Foord wrote:
> On 26/01/2010 00:28, Christian Heimes wrote:
>> I favor this approach over an integer offset because it doesn't change the
>> semantics of ob_item.
>>
> Well, on the face of it this doesn't sound like a huge increase in
> complexity. Not that I'm qualified to judge.

Christian's approach is a good one for minimising the semantic changes,
and compared to an offset based approach actually has a decent chance of
working without breaking too much C code (you'd have to change the way
sys.getsizeof() worked for lists, but I can't think of anything else off
the top of my head that would definitely break).

The potential of resizing and hence relocation of the storage buffer
means it was already unsafe to cache ob_item when changing the size of
the list, so code should generally be unaware that ob_item can now
change even when the buffer isn't reallocated.

However, as Cameron pointed out, the O() value for an operation is an
important characteristic of containers, and having people get used to an
O(1) list.pop(0) in CPython could create problems not only for other
current Python implementations but also for future versions of CPython
itself.

This idea changes list from a simple concept (ob_item points to the
beginning of an allocated array which may change length) to a more
complicated deque-like one (ob_item points somewhere near the beginning
of an allocated array which may change length).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread skip

Victor> Fuzzing is just one tool helping to improve the global security.

Victor,

Thank you, thank you, thank you.

At my day job I work on automated trading systems.  One key component of
such tools is the safeguard subsystem, which places limits on various parts
of the system: the rates at which certain operations can happen, or
thresholds on certain values.  Stuff like:

* don't allow a position of more than N shares of equity ABC

* don't allow more than P orders to be created in Q seconds
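
A safeguard like the second rule above can be sketched as a small
sliding-window rate limiter (a hedged illustration; the class and
parameter names are invented, not from any real trading system):

```python
import collections
import time

class OrderRateLimiter:
    """Reject order creation once more than max_orders have been
    created within the last window seconds."""

    def __init__(self, max_orders, window, clock=time.monotonic):
        self.max_orders = max_orders
        self.window = window
        self.clock = clock               # injectable for testing
        self._times = collections.deque()

    def allow(self):
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self._times and now - self._times[0] > self.window:
            self._times.popleft()
        if len(self._times) >= self.max_orders:
            return False                 # safeguard trips
        self._times.append(now)
        return True

rl = OrderRateLimiter(max_orders=2, window=1.0, clock=lambda: 0.0)
assert rl.allow() and rl.allow() and not rl.allow()
```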

The common wisdom within our group is that safeguards are never fully
appreciated by the users of the system.  Safeguards are not there to help
you make more money.  Quite the contrary.  They are often viewed as a
distraction from the prime objective: trade and make money.  They are there
to keep you from losing gobs of money, often in situations where you failed
to anticipate some market anomaly in your new trading model.

With that in mind I think of Fusil as one component of a safeguard system
for Python.  Fusil helps identify certain classes of anomalies in inputs to
Python programs.  Hopefully I will never encounter any of the corner cases
you've identified with it, but if I ever do it may well save my butt.

Skip


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Reid Kleckner
On Mon, Jan 25, 2010 at 9:05 PM, Meador Inge  wrote:
> Also related to reduced code size with C++ I was wondering whether or not
> anyone has explored using the ability of some toolchains to remove unused
> code and data?  In GCC this can be enabled by compiling with
> '-ffunction-sections' and '-fdata-sections' and linking with
> '--gc-sections'.  In MS VC++ you can compile with '/Gy' and link with
> '/OPT'.  This feature can lead to size reductions sometimes with C++ due to
> things like template instantiation causing multiple copies of the same
> function to be linked in.  I played around with compiling CPython with this
> (gcc + Darwin) and saw about a 200K size drop.  I want to try compiling all
> of U-S (e.g. including LLVM) with these options next.

I'm sure someone has looked at this before, but I was also considering
this the other day.  One catch is that C extension modules need to be
able to link against any symbol declared with the PyAPI_* macros, so
you're not allowed to delete PyAPI_DATA globals or any code reachable
from a PyAPI_FUNC.

Someone would need to modify the PyAPI_* macros to include something
like __attribute__((used)) with GCC and then tell the linker to strip
unreachable code.  Apple calls it "dead stripping":
http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPages/man1/ld.1.html

This seems to have a section on how to achieve the same effect with a
gnu toolchain:
http://utilitybase.com/article/show/2007/04/09/225/Size+does+matter:+Optimizing+with+size+in+mind+with+GCC

I would guess that we have a fair amount of unused LLVM code linked in
to unladen, so stripping it would reduce our size.  However, we can
only do that if we link LLVM statically.  If/When we dynamically link
against LLVM, we lose our ability to strip out unused symbols.  The
best we can do is only link with the libraries we use, which is what
we already do.

Reid


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Nick Coghlan  wrote:

> From: Nick Coghlan 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Michael Foord" 
> Cc: "Christian Heimes" , [email protected]
> Date: Monday, January 25, 2010, 6:32 PM
> Michael Foord wrote:
> > On 26/01/2010 00:28, Christian Heimes wrote:
> >> I favor this approach over an integer offset
> because doesn't change the
> >> semantic of ob_item.
> >>    
> > Well, on the face of it this doesn't sound like a huge
> increase in
> > complexity. Not that I'm qualified to judge.
> 
> Christian's approach is a good one for minimising the
> semantic changes,
> and compared to an offset based approach actually has a
> decent chance of
> working without breaking too much C code (you'd have to
> change the way
> sys.getsizeof() worked for lists, but I can't think of
> anything else off
> the top of my head that would definitely break).

As I'm diving into this, it is clear that you want to preserve the semantics of 
ob_item and ob_size, as they are used in a whole bunch of places.  For now I am 
tracking a var called orphans, which subtly changes one invariant:

  0 <= ob_size + orphans <= allocated

I think Christian covered most of the places that would need to change, and 
list_dealloc would also need to change.
 
> However, as Cameron pointed out, the O() value for an
> operation is an
> important characteristic of containers, and having people
> get used to an
> O(1) list.pop(0) in CPython could create problems not only
> for other
> current Python implementations but also for future versions
> of CPython
> itself.

I hadn't thought of that.

Here are the objections that I've heard or thought of myself:

 * The simplicity of the current implementation is important beyond the normal 
benefits of simplicity, since it is also a reference implementation for other 
ports of Python.
 * People who got used to O(1) in one version of Python might have unpleasant 
surprises when they went to other versions.
 * Alternatives to list already exist, such as deque and blist 
 * An O(1) solution would increase the size of PyListObject.
 * An O(1) solution would postpone the release of the memory from the orphaned 
pointers.
 * An O(1) solution would slow down calls to list_resize, PyList_new, and 
list_dealloc.
 * For small and medium-sized lists, memmove()'s penalty is usually drowned out 
by other operations on the list elements.
 * The use case of popping elements off a large list is not that common 
(although this might be somewhat driven by the documented slowness)
 * There may be third party code that relies on the current internal 
implementation

Did I leave anything out?

Here are the benefits of an O(1) implementation.

 * O(1) is faster than O(N) for every N above some, presumably quite small, threshold.
 * Performance benefits tend to compound.  If you have P processes doing pop(0) 
in a loop on an N-element list, you are saving P * N memmoves of size kN.
 * The technique required to make O(1) work is simple at its core--advance a 
pointer forward instead of moving the memory backward.
 * Encouraging the use of pop(0) will lead to leaner Python programs that 
release memory earlier in the process.
 * While not super common, there do exist programs today that pop from the top, 
either using pop itself or del, including programs in the standard library
 * The language moratorium provides a good window for performance improvements 
in general (even if this one does not pass the litmus test for other reasons)

Did I leave anything out?
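
The "advance a pointer forward" technique from the benefits above can be
sketched as a pure-Python toy model (an invented class for illustration;
the real patch does this at the C level inside listobject.c):

```python
class OffsetList:
    """Toy model of O(1) pop(0): advance a base offset instead of
    memmove-ing the remaining elements left."""

    def __init__(self, items=()):
        self._buf = list(items)
        self._off = 0                    # count of "orphaned" slots in front

    def pop(self, i=0):
        if i == 0:
            v = self._buf[self._off]
            self._buf[self._off] = None  # release the reference early
            self._off += 1
            if self._off > 1000:         # primitive reclamation threshold
                del self._buf[:self._off]
                self._off = 0
            return v
        return self._buf.pop(self._off + i)

    def __len__(self):
        return len(self._buf) - self._off

    def __getitem__(self, i):
        if i < 0:
            i += len(self)
        return self._buf[self._off + i]

ol = OffsetList(range(5))
assert ol.pop(0) == 0 and ol.pop(0) == 1   # no elements were shifted
assert list(ol) == [2, 3, 4]
```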


 




Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread David Lyon
Nick Coghlan:

>> I'd like to see Python 3+ be more suitable for full distributable
>> applications over 2.X and earlier.
>
> Out of curiosity, have you tried experimenting with the zipfile
> execution capabilities in 2.6/3.1? A major part of that was to make
> multi-file Python applications nearly as easy to execute as single-file
> scripts.

With all due respect, that process is a bit like a black magic
approach. Maybe the capability is there, but it isn't very well
documented and it isn't obvious.

I doubt it would work on all package types (there are many) and
not everybody is using 2.6/3.1 for everything yet.

If you want, you could tell me exactly how it is actually done
and I can do a test on the packages on PyPI and tell you exactly
how many it will work for and how many it won't.

I think what Ron meant was "a way for normal users to install
a python application". That capability doesn't exist in standard
python in a way that is compatible or similar to anything that
has happened in the "outside" world for the last decade.

In a scientific application, it might be "I need a program
to control my .(something).ascope". It's quite simple. That
capability doesn't exist yet and discussion about it gets
ignored every time the question is asked on the distutils mailing
list.

Maybe Distribute addresses all these issues and we just don't
know about it. As regular users, we were told to expect that and
that Distribute could address our issues. Distribute has arrived
as promised so maybe we should be checking there.

In any case, the number of people that can actually work on
packaging problems is 2 up to a maximum of 3. It's too small
to compete realistically with other languages such as Perl,
which might have 5 to 10 people working on it.

Guido has asked once or twice about CPAN like capabilities
for Python.

People on the list answered, but the core development team
didn't answer on list. (They could have answered off list -
hard for us to know).

So major pieces of infrastructure remain missing. Like a
package testbot, for example. Or unit testing of packages.

Martin can't be expected to do that and neither can Tarek.

In some open-source projects, money or resources would
be channeled to contractors to get boring bits done. It's
not like a nuclear research laboratory or a medical institution
is poor. Not anywhere I know, anyway. It is always surprising
how easily things can happen when somebody important asks
somebody else important.

I don't know the exact prices.. but I'm pretty sure that if
just one old electron microscope could be given to charity
(us) for us to sell on ebay.. (just an idea) it would be
enough to pay for a team for a year - haha

So who's got an old electron microscope to swap for a package
installation system ? :-)

Without that can-do attitude, all our present resources are
all tied up. There's little chance that any of Guido's issues
can be addressed. That's just my take on it anyway.

Yes.. and finishing with your suggestions... we can
experiment.. play with all the undocumented and unobvious
features. And really, in a sense you are right. That is
all we can do.

David




Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
I made enough of a patch to at least get a preliminary benchmark.

The program toward the bottom of this email runs over 100 times faster with my 
patch.  The patch still has a ways to go--I use a very primitive scheme to 
reclaim orphan pointers (1000 at a time) and I am still segfaulting when 
removing the last element of the list.  But the initial results at least 
confirm that the intended benefit is achievable.

I've attached the diff, in case anyone wants to try it out or help me figure 
out what else needs to change.

The core piece of the patch is this--everything else is memory management 
related.

+if (ilow == 0) {
+a->orphans += 1;
+a->ob_item += (-1 * d);
+}
+else {
+memmove(&item[ihigh+d], &item[ihigh],
+(Py_SIZE(a) - ihigh)*sizeof(PyObject *));
+}


import time

n = 8

lst = []
for i in range(n):
lst.append(i)

t = time.time()
for i in range(n-1):
del lst[0]

print('time = ' + str(time.time() - t))
print(len(lst))
print('got here at least')


show...@showell-laptop:~/PYTHON/py3k$ cat BEFORE 
0
2.52699589729

show...@showell-laptop:~/PYTHON/py3k$ cat AFTER 
time = 0.0216660499573
1
got here at least


Python 3.2a0 (py3k:77751M, Jan 25 2010, 20:25:21) 
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 2.526996 / 0.021666
116.63417335918028
>>>

DIFF
Description: Binary data


Re: [Python-Dev] Rich Comparison recipe wrong?

2010-01-25 Thread Lennart Regebro
On Tue, Jan 26, 2010 at 02:56, Nick Coghlan  wrote:
> Lennart Regebro wrote:
>> On Mon, Jan 25, 2010 at 15:30, Nick Coghlan  wrote:
>>> Ah, you mean the case where both classes implement the recipe, but know
>>> nothing about each other and hence both return NotImplemented from their
>>> root comparison?
>>
>> Well, only one needs to return NotImplemented, actually.
>
> I'd like to see a test case that proved that. With two different types
> and only one of them returning NotImplemented, the recursion should
> terminate inside the one with the root comparison that can handle both
> types.

It never gets to that root comparison, as several of that type's
comparisons just refer back to the type that returns NotImplemented. To
solve it you need to never refer to the other type in your
comparisons. And as far as I see that makes it impossible to implement
full comparisons with only one operator implemented.

In short, the recipe assumes you never compare with other types, but
it doesn't check for it.
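
A minimal, hedged sketch of the failure mode (the decorator name is
invented; the attached test is the authoritative version):

```python
def fill_gt(cls):
    # recipe-style helper: derive __gt__ from "<" on the operands
    cls.__gt__ = lambda s, o: o < s
    return cls

@fill_gt
class A:
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        if not isinstance(other, A):
            return NotImplemented        # A knows nothing about B
        return self.v < other.v

@fill_gt
class B:
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        return self.v < other.v          # B's root could handle both types

# A(1) < B(2): A.__lt__ returns NotImplemented, so Python tries the
# reflected B.__gt__, whose body "o < s" re-evaluates A(1) < B(2) -- and
# around we go, never reaching B's root __lt__.
try:
    A(1) < B(2)
except RecursionError:                   # RuntimeError on pre-3.5 Pythons
    print("infinite recursion, even though only A returned NotImplemented")
```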

Test attached (assuming this list accepts attachements).

-- 
Lennart Regebro: Python, Zope, Plone, Grok
http://regebro.wordpress.com/
+33 661 58 14 64
# By Christian Muirhead, Menno Smits and Michael Foord 2008
# WTF license
# http://voidspace.org.uk/blog

"""
``total_ordering`` and ``force_total_ordering`` are class decorators for 
Python 2.6 & Python 3.

They provide *all* the rich comparison methods on a class by defining *any*
one of '__lt__', '__gt__', '__le__', '__ge__'.

``total_ordering`` fills in all unimplemented rich comparison methods, assuming
at least one is implemented. ``__lt__`` is taken as the base comparison method
on which the others are built, but if that is not available it will be
constructed from the first one found.

``force_total_ordering`` does the same, but having taken a comparison method as
the base it fills in *all* the others - this overwrites additional comparison
methods that may be implemented, guaranteeing consistent comparison semantics.

::

from total_ordering import total_ordering

@total_ordering
class Something(object):
def __init__(self, value):
self.value = value
def __lt__(self, other):
return self.value < other.value

It also works with Python 2.5, but you need to do the wrapping yourself:

::

from total_ordering import total_ordering

class Something(object):
def __init__(self, value):
self.value = value
def __lt__(self, other):
return self.value < other.value

total_ordering(Something)

It would be easy to modify for it to work as a class decorator for Python
3.X and a metaclass for Python 2.X.
"""


import sys as _sys

if _sys.version_info[0] == 3:
def _has_method(cls, name):
for B in cls.__mro__:
if B is object:
continue
if name in B.__dict__:
return True
return False
else:
def _has_method(cls, name):
for B in cls.mro():
if B is object:
continue
if name in B.__dict__:
return True
return False



def _ordering(cls, overwrite):
def setter(name, value):
if overwrite or not _has_method(cls, name):
value.__name__ = name
setattr(cls, name, value)

comparison = None
if not _has_method(cls, '__lt__'):
for name in 'gt le ge'.split():
if not _has_method(cls, '__' + name + '__'):
continue
comparison = getattr(cls, '__' + name + '__')
if name.endswith('e'):
eq = lambda s, o: comparison(s, o) and comparison(o, s)
else:
eq = lambda s, o: not comparison(s, o) and not comparison(o, s)
ne = lambda s, o: not eq(s, o)
if name.startswith('l'):
setter('__lt__', lambda s, o: comparison(s, o) and ne(s, o))
else:
setter('__lt__', lambda s, o: comparison(o, s) and ne(s, o))
break
assert comparison is not None, 'must have at least one of ge, gt, le, lt'

setter('__ne__', lambda s, o: s < o or o < s)
setter('__eq__', lambda s, o: not s != o)
setter('__gt__', lambda s, o: o < s)
setter('__ge__', lambda s, o: not (s < o))
setter('__le__', lambda s, o: not (s > o))
return cls


def total_ordering(cls):
return _ordering(cls, False)

def force_total_ordering(cls):
return _ordering(cls, True)

if __name__ == '__main__':
@total_ordering
class DuckTyper(object):
def __init__(self, v):
self.v = v

def __lt__(self, other):
return (self.v > other.v) - (self.v < other.v)

@total_ordering
class StrictComparator(object):
def __init__(self, v):
self.v = v

def __lt__(self, other):
if not isinstance(other, StrictComparator):
return NotImp

Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread P.J. Eby

At 03:15 PM 1/26/2010 +1100, David Lyon wrote:

With all due respect, that process is a bit like a black magic
approach. Maybe the capability is there, but it isn't very well
documented and it isn't obvious.


I don't see what's so hard about:

1. Zip up your application in myapp.zip with a __main__.py
2. Run "python myapp.zip"

For development, you don't even need the zipfile.  Just do:

1. Put your app in myapp/ with a __main__.py
2. Run "python myapp"
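
These two steps can be scripted end to end (a hedged sketch using the
stdlib's runpy, which is the machinery behind this feature; the file
names are invented):

```python
import os
import runpy
import tempfile
import zipfile

# Build an application zip whose only content is a __main__.py, then
# execute it the same way "python myapp.zip" would.
tmp = tempfile.mkdtemp()
zpath = os.path.join(tmp, "myapp.zip")
with zipfile.ZipFile(zpath, "w") as z:
    z.writestr("__main__.py", "print('hello from myapp')")

runpy.run_path(zpath, run_name="__main__")  # prints: hello from myapp
```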




Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread David Lyon
> At 03:15 PM 1/26/2010 +1100, David Lyon wrote:
>>With all due respect, that process is a bit like a black magic
>>approach. Maybe the capability is there, but it isn't very well
>>documented and it isn't obvious.
>
> I don't see what's so hard about:
>
> 1. Zip up your application in myapp.zip with a __main__.py
> 2. Run "python myapp.zip"
>
> For development, you don't even need the zipfile.  Just do:
>
> 1. Put your app in myapp/ with a __main__.py
> 2. Run "python myapp"
>

Firstly, it doesn't create desktop shortcuts - sorry, users
need those. Where do the programs go?

Secondly, I never knew about it. Is this a distutils option?

I'm sure you know that if you provide a zip file to a normal user
as an application, you get back the immediate question: "So,
what do I do with this?". Then as a sysadmin, you have to go
install it manually in their "Program Files" and setup all the
shortcuts so that they can actually use it.

It just isn't up to modern standards. Even with my two years
experience in python I had no knowledge of this trick.

Where is the distutils support to allow this to be built?

Hey look, if this capability is in there, then great.

But having a __main__.py file in a zip file is hardly a clear
and obvious way (to outside people) that it is a python
application.

Why can't we just be like the rest of the universe and have
one icon type for packages and one icon type for applications?

Double click them and they get filed in the right place.

This is what I am talking about, python packaging has become
just so obscure over the years.

Having two ways to do something (one way to package an egg
and one way to package an application) isn't true to python
spirit of having just one way, and one obvious way to do
something.

I'm very sure you know what I mean here.. and this is coming
from a big fan of yours.

David



Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Martin v. Löwis
Nick Coghlan wrote:
> Jeffrey Yasskin wrote:
>> (b) could be a problem if we depend on LLVM as a shared library on one
>> of these platforms (and, of course, if LLVM's JIT supports these
>> systems at all). The obvious answers are: 1) --without-llvm on these
>> systems, 2) link statically on these systems, 3) eliminate the static
>> constructors. There may also be less obvious answers.
> 
> Could the necessary initialisation be delayed until the Py_Initialize()
> call? (although I guess that is just a particular implementation
> strategy for option 3).

Exactly. The convenience of static constructors is that you don't need
to know what all your global objects are. If you arrange things so that
you can trigger initialization explicitly, you must have already
eliminated the implicit constructors.

The question now is what state they carry: if some of them act as
dynamic containers, e.g. for machine code, it would surely be desirable
to have them constructed with Py_Initialize and destroyed with
Py_Finalize (so that the user sees the memory being released). If they
are merely global flags of some kind, they don't matter much.

Regards,
Martin


Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread Lennart Regebro
On Tue, Jan 26, 2010 at 06:27, David Lyon  wrote:
> Secondly, I never knew about it.

Why did you say the process was like black magic when you didn't know about it?

> Is this a distutils option?

No, it's new in Python 2.6, which Nick Coghlan clearly stated in the
text you quoted before saying it was black magic. Maybe you should
have read his mail before you answered it?

-- 
Lennart Regebro: Python, Zope, Plone, Grok
http://regebro.wordpress.com/
+33 661 58 14 64


Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread David Lyon
> On Tue, Jan 26, 2010 at 06:27, David Lyon 
> wrote:
>> Secondly, I never knew about it.
>
> Why did you say the process was like black magic when you didn't know
> about it?
>
> Is this a distutils option?
>
> No, it's new in Python 2.6, which Nick Coghlan clearly stated in the
> text you quoted before saying it was black magic. Maybe you should
> have read his mail before you answered it?

I read the email... ok let me strike the word 'black'

That leaves us with python "magic" which is probably what I meant.

In any case, the fact that 'applications' can be run this way is
good.

What I would like to see next is this existing work taken to the
next level in Python-3 so that normal users can appreciate it
better. ie:

 - have ".egg" specify a Python 3 package. No other choices

 - have ".eag" specify a Python 3 application. No other choices.

So what we will have in Python-3 is a convergence of all the
good ideas so far into something that is much simpler for
users to use.

Then if an ".eag" file appears in a python 3 system, the user
gets a dialog box "Run this Application ? or Install it?"

Thanks Len. There's been a lot of good work. Let's make it
less obscure and more obvious in Python-3.

Anyway, it's up to Guido and the team that is.

Take care

David





Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Terry Reedy

On 1/25/2010 9:32 PM, Nick Coghlan wrote:


> However, as Cameron pointed out, the O() value for an operation is an
> important characteristic of containers, and having people get used to an
> O(1) list.pop(0) in CPython could create problems not only for other
> current Python implementations but also for future versions of CPython
> itself.


The idea that CPython should not be improved because it would spoil 
programmers strikes me as a thin, even desperate objection. One could 
say the same thing about the recent optimization of string += string so 
that repeated concats are O(n) instead of O(n*n). What a trap if people 
move code to other implementations (or older Pythons) without that new 
feature.
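[For readers following the archive: the asymptotics being debated here can be sketched quickly. This is illustrative only, not the proposed patch -- it contrasts today's list.pop(0) with the stdlib's existing O(1) alternative, collections.deque.]

```python
from collections import deque

# Illustrative sketch (not the proposed patch): a list stores its items
# in a contiguous array, so popping the head must shift everything that
# follows -- O(n) per call.
items = list(range(5))
head = items.pop(0)           # shifts elements 1..4 down one slot
assert head == 0 and items == [1, 2, 3, 4]

# collections.deque is the stdlib's existing O(1) answer for queue-like
# access at either end.
dq = deque(range(5))
head = dq.popleft()           # constant time, no shifting
assert head == 0 and list(dq) == [1, 2, 3, 4]
```

[As described later in this thread, the patch instead leaves "orphan" head slots behind and reclaims them in batches, which is how pop(0) can become O(1) without changing the list API.]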


Of course, the whole purpose of adding a JIT to CPython would be to 
spoil us.


It is a fact already that optimizing CPython code is specific to a 
particular interpreter, system, and workload.


Terry Jan Reedy



Re: [Python-Dev] PyCon Keynote

2010-01-25 Thread Vinay Sajip
Barry Warsaw  python.org> writes:

> 
> On Jan 22, 2010, at 10:06 AM, Chris McDonough wrote:
> 
> >Can you tell us where Uncle Timmy has been and when he'll be back?
> 
> He's given up bags of ham for walls of chocolate.
> 

In the Mountain View Chocolate Factory?

Regards,

Vinay Sajip




Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Steve Howell  wrote:

> From: Steve Howell 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Michael Foord" , "Nick Coghlan" 
> 
> Cc: "Christian Heimes" , [email protected]
> Date: Monday, January 25, 2010, 8:33 PM
> I made enough of a patch to at least
> get a preliminary benchmark.
> 
> The program toward the bottom of this email runs over 100
> times faster with my patch.  The patch still has a ways
> to go--I use a very primitive scheme to reclaim orphan
> pointers (1000 at a time) and I am still segfaulting when
> removing the last element of the list.  But the initial
> results at least confirm that the intended benefit is
> achievable.
> 

Ok, I fixed the obvious segfaults, and I am now able to pass all the tests on 
my debug build.  A new diff is attached.

There is still at least one bug in my code in listextend, which the tests do 
not seem to expose, so I will try to at least beef up the test suite a bit.

I really like listobject.c.  Very clean code, very easy to understand.  I guess 
I shouldn't be surprised.
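[The benchmark program referred to above travelled as the attachment and is not reproduced in the archive. A hypothetical micro-benchmark in the same spirit -- drain a container from the front, where unpatched list.pop(0) costs O(n) per call -- might look like this, with deque.popleft() standing in for the O(1) behaviour the patch aims to give lists:]

```python
import timeit

# Hypothetical stand-in for the attached benchmark (the original program
# is not shown in the archive). Draining an unpatched list from the front
# is quadratic overall; deque.popleft() shows the linear-time target.
N = 20000

list_secs = timeit.timeit(
    "while xs: xs.pop(0)",
    setup="xs = list(range(%d))" % N,
    number=1,
)
deque_secs = timeit.timeit(
    "while xs: xs.popleft()",
    setup="from collections import deque; xs = deque(range(%d))" % N,
    number=1,
)
print("list.pop(0): %.4fs  deque.popleft(): %.4fs" % (list_secs, deque_secs))
```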





[Attachment: DIFF (binary data)]


Re: [Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread Terry Reedy

On 1/25/2010 5:34 PM, Victor Stinner wrote:

> It looks like fuzzing bugs are not always appreciated by developers, maybe
> because they are always "borderline" cases (not "realistic").

People grumble, sometimes, even when quietly appreciative.

> Sometimes, even if I write a patch, a unit test, explain the problem and the
> solution, I don't get any comment. It doesn't motivate me to continue fuzzing
> :-/


According to the link you gave,
http://bitbucket.org/haypo/fusil/wiki/Python
the current score is 35 closed with commits and 5 open (and no 
rejections). This is a pretty good record )-;

Maybe your post will push a few of the rest along.

And yeah, you have my thanks too.
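[For readers curious what this kind of fuzzing looks like in miniature, here is a toy sketch of the idea. Fusil itself is a full framework; the function name and parameters below are invented for illustration. The core move is just: feed random bytes to an API and treat anything other than a clean Python exception as a bug worth reporting.]

```python
import random

# Toy sketch of the fuzzing idea behind Fusil (not Fusil itself; the
# function below is invented for illustration). We hammer compile() with
# random byte strings: ordinary exceptions are the expected, "clean"
# rejection of garbage, while a crash or hang would be a reportable bug.
random.seed(12345)

def fuzz_compile(iterations=200):
    survived = 0
    for _ in range(iterations):
        blob = bytes(random.randrange(256)
                     for _ in range(random.randrange(64)))
        try:
            compile(blob, "<fuzz>", "exec")
        except (SyntaxError, ValueError, UnicodeDecodeError):
            pass  # clean rejection of garbage input -- the normal outcome
        survived += 1
    return survived

print(fuzz_compile())
```

[Fusil's real harnesses are far more elaborate -- process isolation, signal watching, log scoring -- but every one of the 40 reports tallied above started from this same "random input, watch for non-exception failure" loop.]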

Terry Jan Reedy



Re: [Python-Dev] Summary of 2 years of Python fuzzing

2010-01-25 Thread Terry Reedy

On 1/26/2010 2:27 AM, Terry Reedy wrote:

> On 1/25/2010 5:34 PM, Victor Stinner wrote:
>
>> It looks like fuzzing bugs are not always appreciated by developers, maybe
>> because they are always "borderline" cases (not "realistic").
>
> People grumble, sometimes, even when quietly appreciative.
>
>> Sometimes, even if I write a patch, a unit test, explain the problem and the
>> solution, I don't get any comment. It doesn't motivate me to continue fuzzing
>> :-/
>
> According to the link you gave,
> http://bitbucket.org/haypo/fusil/wiki/Python
> the current score is 35 closed with commits and 5 open (and no

Whoops, I now see one 'invalid' in the middle of the commits.
Still pretty good '-).

> rejections). This is a pretty good record )-;
>
> Maybe your post will push a few of the rest along.
>
> And yeah, you have my thanks too.

Terry Jan Reedy



