[Python-Dev] Status of 2.7b1?

2010-04-10 Thread Nick Coghlan
The trunk's been frozen for a few days now, which is starting to cut
into the window for new fixes between b1 and b2 (down to just under 3
weeks, and that's only if b1 was ready for release today).

Is work in train to address or document the remaining buildbot failures
(e.g. test_os on Windows [1])? At what point do we decide to document
something as a known defect in the beta and release it anyway?

(My question is mostly aimed at Benjamin for obvious reasons, but it
would be good to hear from anyone that is actually looking into the
Windows buildbot failure)

Cheers,
Nick.

[1]
http://www.python.org/dev/buildbot/trunk/builders/x86%20XP-4%20trunk/builds/3380/steps/test/logs/stdio

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3147

2010-04-10 Thread Barry Warsaw
On Apr 09, 2010, at 11:05 PM, Antoine Pitrou wrote:

>« Instead, this PEP proposes to add a mapping between internal magic numbers 
>and a user-friendly tag. Newer versions of Python can add to this mapping so 
>that all later Pythons know the mapping between tags and magic numbers. »
>
>The question is: why do we have to keep a mapping of past tags and magic
>numbers? Don't we only care about our current tag and magic number?
>(similarly, we don't know, and don't need to know, about Jython's or PyPy's
>stuff...).
>
>As far as I can tell, it would remove the burden of maintaining an ever-growing
>registry of past magic numbers and tags.

If you look at the comment near the top of import.c, we kind of do anyway, we
just don't make it available to Python. ;)

I don't have strong feelings about this.  I thought it would be handy for
future Pythons to have access to this, but then, without access to previous
version magic numbers, it probably doesn't help much.  And as you say, CPython
won't know about alternative implementations' tags.

So I'm willing to call YAGNI on it and just expose the current Python's magic
tag.  While we're at it, how about making both the tag and the number
attributes of the imp module, instead of functions like .get_magic()?  Of
course we'd keep the latter for backward compatibility.
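
For illustration only, a minimal sketch of what reading the current
interpreter's magic number and tag as plain attributes can look like. It
assumes a modern Python 3, where these values are available as
importlib.util.MAGIC_NUMBER and sys.implementation.cache_tag; the attribute
names for the imp module were not settled in this message, so none are
shown here.

```python
import importlib.util
import sys

# Magic number of the running interpreter, exposed as a bytes attribute
# rather than through a function call.
magic = importlib.util.MAGIC_NUMBER

# Human-friendly tag for the running interpreter; it may be None on
# implementations that do not cache bytecode.
tag = sys.implementation.cache_tag

print(magic, tag)
```

Only the current interpreter's pair is exposed here; no registry of older
magic numbers or of other implementations' tags is involved.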

-Barry




Re: [Python-Dev] PEP 3147, __cached__, and PyImport_ExecCodeModuleEx()

2010-04-10 Thread Barry Warsaw
On Apr 09, 2010, at 05:41 PM, Guido van Rossum wrote:

>On Fri, Apr 9, 2010 at 3:54 PM, Paul Moore  wrote:
>> Would it be better to name this one _PyImport_ExecCodeModuleExEx (with
>> an underscore) so that we *don't* need to create an ExExEx version in
>> future? (Sorry, Barry :-))
>
>I don't care about what name you pick, and my ExEx proposal was meant
>to include half a wink, but http://docs.python.org/c-api/import.html
>makes it clear that PyImport_ExecCodeModuleEx() is far from private!
>(I don't know where Barry got that idea.)

Note that it's the non-Ex version that's documented here.  AFAICT,
PyImport_ExecCodeModuleEx() is not documented.  I'm happy to fix that in my
branch as well.

>While Google Code Search
>finds mostly references to PyImport_ExecCodeModuleEx in the Python
>source code and various copies of it, it also shows some real uses,
>e.g.
>http://www.google.com/codesearch/p?hl=en#bkFK9YpaWlI/ubuntu/pool/universe/y/yehia/yehia_0.5.4.orig.tar.gz|PZ0_Xf7QzC0/yehia-0.5.4.orig/plugins/python/python-loader.cc&q=PyImport_ExecCodeModuleEx

Sure, let's not break existing API even if it's undocumented.  The one nice
thing about ExEx() is that it's clearly related to the two previous API
functions it's based on.  But if you don't like it then how about something
like PyImport_ExecCodeModuleWithPathnames()?

-Barry




Re: [Python-Dev] Status of 2.7b1?

2010-04-10 Thread Brian Curtin
On Sat, Apr 10, 2010 at 10:51, Nick Coghlan  wrote:

> The trunk's been frozen for a few days now, which is starting to cut
> into the window for new fixes between b1 and b2 (down to just under 3
> weeks, and that's only if b1 was ready for release today).
>
> Is work in train to address or document the remaining buildbot failures
> (e.g. test_os on Windows [1]). At what point do we decide to document
> something as a known defect in the beta and release it anyway?
>
> (My question is mostly aimed at Benjamin for obvious reasons, but it
> would be good to hear from anyone that is actually looking into the
> Windows buildbot failure)
>
> Cheers,
> Nick.
>
> [1]
>
> http://www.python.org/dev/buildbot/trunk/builders/x86%20XP-4%20trunk/builds/3380/steps/test/logs/stdio
>
>
I contacted David Bolen for some details about his buildbot because I
can't reproduce the failure on any Windows XP, Server 2003, or 7 box that I
have, and it's also not a problem on the other XP buildbot. He's traveling
at the moment but will try to get me access to the box after the weekend is
over.

When manually running test_os how buildbot runs it (via test.bat, which runs
rt.bat) he sees the failure. When running the test on a clean checkout
outside of how buildbot does anything, he doesn't see the failure. I'll try
to get access to figure out what the difference is.


Re: [Python-Dev] Status of 2.7b1?

2010-04-10 Thread Antoine Pitrou
Nick Coghlan writes:
> 
> Is work in train to address or document the remaining buildbot failures
> (e.g. test_os on Windows [1]). At what point do we decide to document
> something as a known defect in the beta and release it anyway?

I'm not handling the test_os issue (which I think is in Brian Curtin's hands),
but as far as test_ftplib is concerned (behaviour change with newest OpenSSL
versions), it was decided to address the issue after the beta.

Regards

Antoine.




Re: [Python-Dev] PEP 3147, __cached__, and PyImport_ExecCodeModuleEx()

2010-04-10 Thread Guido van Rossum
On Sat, Apr 10, 2010 at 8:58 AM, Barry Warsaw  wrote:
> On Apr 09, 2010, at 05:41 PM, Guido van Rossum wrote:
>
>>On Fri, Apr 9, 2010 at 3:54 PM, Paul Moore  wrote:
>>> Would it be better to name this one _PyImport_ExecCodeModuleExEx (with
>>> an underscore) so that we *don't* need to create an ExExEx version in
>>> future? (Sorry, Barry :-))
>>
>>I don't care about what name you pick, and my ExEx proposal was meant
>>to include half a wink, but http://docs.python.org/c-api/import.html
>>makes it clear that PyImport_ExecCodeModuleEx() is far from private!
>>(I don't know where Barry got that idea.)
>
> Note that it's the non-Ex version that's documented here.  AFAICT,
> PyImport_ExecCodeModuleEx() is not documented.  I'm happy to fix that in my
> branch as well.

Ah, true. And yes, please.

>>While Google Code Search
>>finds mostly references to PyImport_ExecCodeModuleEx in the Python
>>source code and various copies of it, it also shows some real uses,
>>e.g.
>>http://www.google.com/codesearch/p?hl=en#bkFK9YpaWlI/ubuntu/pool/universe/y/yehia/yehia_0.5.4.orig.tar.gz|PZ0_Xf7QzC0/yehia-0.5.4.orig/plugins/python/python-loader.cc&q=PyImport_ExecCodeModuleEx
>
> Sure, let's not break existing API even if it's undocumented.  The one nice
> thing about ExEx() is that it's clearly related to the two previous API
> functions its based on.  But if you don't like it then how about something
> like PyImport_ExecCodeModuleWithPathnames()?

Sure.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Status of 2.7b1?

2010-04-10 Thread Benjamin Peterson
2010/4/10 Nick Coghlan :
> The trunk's been frozen for a few days now, which is starting to cut
> into the window for new fixes between b1 and b2 (down to just under 3
> weeks, and that's only if b1 was ready for release today).
>
> Is work in train to address or document the remaining buildbot failures
> (e.g. test_os on Windows [1]). At what point do we decide to document
> something as a known defect in the beta and release it anyway?

I'm going to do it now (today). We've deferred basically all the
failures until after the beta, so I think I'll just release now.



-- 
Regards,
Benjamin


Re: [Python-Dev] PEP 3147

2010-04-10 Thread Nick Coghlan
Barry Warsaw wrote:
> I don't have strong feelings about this.  I thought it would be handy for
> future Python's to have access to this, but then, without access to previous
> version magic numbers, it probably doesn't help much.  And as you say, CPython
> won't know about alternative implementation's tags.
> 
> So I'm willing to call YAGNI on it and just expose the current Python's magic
> tag.  While we're at it, how about making both the tag and the number
> attributes of the imp module, instead of functions like .get_magic()?  Of
> course we'd keep the latter for backward compatibility.

I think one of the virtues of the functions is making it bleedingly
obvious to all concerned that these are read only values.

So +1 to only exposing the current version of the implementation tag and
magic number, and +0 to doing so via attributes rather than functions.

(I'm still in favour of keeping the list of old tags and magic numbers
in a source comment though - commenting them out rather than deleting
them when updating them isn't a major hassle).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


[Python-Dev] Tuning Python dicts

2010-04-10 Thread Reid Kleckner
Hey folks,

I was looking at tuning Python dicts for a data structures class final
project.  I've looked through Objects/dictnotes.txt, and obviously
there's already a large body of work on this topic.  My idea was to
alter dict collision resolution as described in the hopscotch hashing
paper[1].  I think the PDF I have came from behind a pay-wall, so I
can't find a link to the original paper.

[1] http://en.wikipedia.org/wiki/Hopscotch_hashing

Just to be clear, this is an experiment I'm doing for a class.  If it
is successful, which I think is unlikely since Python dicts are
already well-tuned, I might consider trying to contribute it back to
CPython over the summer.

The basic idea of hopscotch hashing is to use linear probing with a
cutoff (H), but instead of rehashing when the probe fails, find the
next empty space in the table and move it into the neighborhood of the
original hash index.  This means you have to spend potentially a lot
of extra time during insertion, but it makes lookups very fast because
H is usually chosen such that the entire probe spans at most two cache
lines.  This is much better than the current random (what's the right
name for the current approach?) probing solution, which does
potentially a handful of random accesses into the table.
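
The insertion scheme described above can be sketched roughly as follows.
This is a toy model under stated assumptions, not the paper's implementation
and not CPython's dict: the class name HopscotchDict, the neighbourhood size
H=4, and the requirement that the table hold at least 2*H slots are all
choices made for this sketch, and deletion and the per-bucket bitmaps used by
real hopscotch tables are left out.

```python
class HopscotchDict:
    """Toy hopscotch hash table: every key lives within H slots of its home bucket."""

    def __init__(self, size=8, H=4):
        assert size >= 2 * H, "sketch assumes the table has at least 2*H slots"
        self.size = size
        self.H = H
        self.slots = [None] * size  # each slot is None or a (key, value) pair

    def _home(self, key):
        return hash(key) % self.size

    def _find(self, key):
        # A lookup never inspects more than H consecutive slots.
        home = self._home(key)
        for d in range(self.H):
            idx = (home + d) % self.size
            entry = self.slots[idx]
            if entry is not None and entry[0] == key:
                return idx
        return None

    def __getitem__(self, key):
        idx = self._find(key)
        if idx is None:
            raise KeyError(key)
        return self.slots[idx][1]

    def __setitem__(self, key, value):
        idx = self._find(key)
        if idx is not None:
            self.slots[idx] = (key, value)  # overwrite an existing key
            return
        while not self._insert(key, value):
            self._grow()

    def _insert(self, key, value):
        home = self._home(key)
        # Linear probe for the first empty slot at or after the home bucket.
        for d in range(self.size):
            free = (home + d) % self.size
            if self.slots[free] is None:
                break
        else:
            return False  # table is completely full
        # Hop the empty slot backwards until it lies within H of home.
        while (free - home) % self.size >= self.H:
            for back in range(self.H - 1, 0, -1):
                cand = (free - back) % self.size
                cand_home = self._home(self.slots[cand][0])
                if (free - cand_home) % self.size < self.H:
                    # Moving this entry forward keeps it in its own neighbourhood.
                    self.slots[free] = self.slots[cand]
                    self.slots[cand] = None
                    free = cand
                    break
            else:
                return False  # nothing can be moved; the caller must grow the table
        self.slots[free] = (key, value)
        return True

    def _grow(self):
        # Double the table and reinsert everything; repeat if that still fails.
        # With more than H keys sharing one hash value this never terminates.
        entries = [e for e in self.slots if e is not None]
        while True:
            self.size *= 2
            self.slots = [None] * self.size
            if all(self._insert(k, v) for k, v in entries):
                return


table = HopscotchDict()
for i in range(100):
    table[i * 7] = i
print(table[21])
```

The extra work all sits in _insert and _grow; _find touches at most H
adjacent slots, which is the property that keeps lookups within a couple of
cache lines.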

Looking at dictnotes.txt, I can see that people have experimented with
taking advantage of cache locality.  I was wondering what benchmarks
were used to glean these lessons before I write my own.  Python
obviously has very particular workloads that need to be modeled
appropriately, such as namespaces and **kwargs dicts.

Any other advice would also be helpful.

Thanks,
Reid


One caveat I need to work out:  If more than H items collide into a
single bucket, then you need to rehash.  However, if you have a
particularly evil hash function which always returns zero, no matter
how much you rehash, you will never be able to fit all the items into
the first H buckets.  This would cause an infinite loop, while I
believe the current solution will simply have terrible performance.
IMO the solution is just to increase H for the table if the rehash
fails, but realistically, this will never happen unless the programmer
is being evil.  I'd probably skip this detail for the class
implementation.
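
A small sketch of the "evil hash function" case against the built-in dict.
The Evil class is hypothetical and exists only for this illustration; the
point is that the built-in dict stays correct when every key collides, and
only gets slower because each lookup has to compare against colliding keys.

```python
class Evil:
    """Hypothetical key type whose hash is always zero, so every key collides."""

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 0

    def __eq__(self, other):
        return isinstance(other, Evil) and self.n == other.n


# Every insertion collides with all earlier keys, yet lookups still succeed.
d = {Evil(i): i for i in range(500)}
print(d[Evil(499)], len(d))
```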


Re: [Python-Dev] PEP 3147, __cached__, and PyImport_ExecCodeModuleEx()

2010-04-10 Thread Georg Brandl
Am 10.04.2010 18:12, schrieb Guido van Rossum:
> On Sat, Apr 10, 2010 at 8:58 AM, Barry Warsaw  wrote:
>> On Apr 09, 2010, at 05:41 PM, Guido van Rossum wrote:
>>
>>>On Fri, Apr 9, 2010 at 3:54 PM, Paul Moore  wrote:
>>>> Would it be better to name this one _PyImport_ExecCodeModuleExEx (with
>>>> an underscore) so that we *don't* need to create an ExExEx version in
>>>> future? (Sorry, Barry :-))
>>>
>>>I don't care about what name you pick, and my ExEx proposal was meant
>>>to include half a wink, but http://docs.python.org/c-api/import.html
>>>makes it clear that PyImport_ExecCodeModuleEx() is far from private!
>>>(I don't know where Barry got that idea.)
>>
>> Note that it's the non-Ex version that's documented here.  AFAICT,
>> PyImport_ExecCodeModuleEx() is not documented.  I'm happy to fix that in my
>> branch as well.
> 
> Ah, true. And yes, please.

*cough* http://docs.python.org/dev/c-api/import.html#PyImport_ExecCodeModuleEx
Not backported to stable yet, though.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] Status of 2.7b1?

2010-04-10 Thread Tim Golden

On 10/04/2010 17:02, Brian Curtin wrote:

> I contacted David Bolen for some details about his buildbot because I
> can't reproduce the failure on any Windows XP, Server 2003, or 7 box that I
> have, and it's also not a problem on the other XP buildbot. He's traveling
> at the moment but will try to get me access to the box after the weekend is
> over.


Might it be significant that he's running a Cygwin build of Python?
I've only run the tests on a native Win32 build myself, and I imagine
that's true for you too...


TJG


Re: [Python-Dev] Status of 2.7b1?

2010-04-10 Thread Brian Curtin
On Sat, Apr 10, 2010 at 13:37, Tim Golden  wrote:

> On 10/04/2010 17:02, Brian Curtin wrote:
>
>> I contacted David Bolen for some details about his buildbot because I
>> can't reproduce the failure on any Windows XP, Server 2003, or 7 box that I
>> have, and it's also not a problem on the other XP buildbot. He's traveling
>> at the moment but will try to get me access to the box after the weekend is
>> over.
>>
>
> Might it be significant that he's running a Cygwin build of Python?
> I've only run the tests on a native Win32 build myself, and I imagine
> that's true for you too...
>
>
> TJG
>
>
The tests are run on a native Win32 build as compiled by VS2008. The
functionality is Win32 specific and wouldn't work on Cygwin, so the tests
are skipped there. I believe Cygwin is used for kicking off the tests and
other buildbot stuff, but they don't actually run through Cygwin.


[Python-Dev] [RELEASED] 2.7 beta 1

2010-04-10 Thread Benjamin Peterson
On behalf of the Python development team, I'm merry to announce the first beta
release of Python 2.7.

Python 2.7 is scheduled (by Guido and Python-dev) to be the last major version
in the 2.x series.  Though more major releases have not been absolutely ruled
out, it's likely that the 2.7 release will have an extended period of maintenance for
the 2.x series.

2.7 includes many features that were first released in Python 3.1.  The faster
io module, the new nested with statement syntax, improved float repr, set
literals, dictionary views, and the memoryview object have been backported from
3.1. Other features include an ordered dictionary implementation, unittest
improvements, a new sysconfig module, and support for ttk Tile in Tkinter.  For
a more extensive list of changes in 2.7, see
http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python
distribution.
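
A brief sketch of a few of the features listed above. It is written so that
it runs on Python 3, which is an assumption of this example; on 2.7 itself
the dictionary view is obtained with viewkeys() rather than keys().

```python
import io
from collections import OrderedDict

# Set literal.
primes = {2, 3, 5, 7}

# Ordered dictionary: iteration follows insertion order.
d = OrderedDict([("b", 1), ("a", 2)])

# Dictionary view: it reflects changes made after it was created.
keys = d.keys()
d["c"] = 3

# Nested with statement: two context managers in a single statement.
with io.StringIO("in") as src, io.StringIO() as dst:
    dst.write(src.read())
    copied = dst.getvalue()

print(sorted(primes), list(keys), copied)
```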

To download Python 2.7 visit:

 http://www.python.org/download/releases/2.7/

While this is a development release and is thus not suitable for production use,
we encourage Python application and library developers to test the release with
their code and report any bugs they encounter.

The 2.7 documentation can be found at:

 http://docs.python.org/2.7

Please consider trying Python 2.7 with your code and reporting any bugs you may
notice to:

 http://bugs.python.org


Enjoy!

--
Benjamin Peterson
2.7 Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 2.7's contributors)


Re: [Python-Dev] Python-Dev Digest, Vol 81, Issue 31

2010-04-10 Thread Denis Kolodin
Hello!
My name is Denis Kolodin. I live in Tambov, Russia.
I spent a long time developing in C, Java, C#, and R, but two months ago I
started using Python.
It's really cool. I am now moving ALL my projects to it, and I have some
ideas about API extensions that may be useful.
The first thing I want to talk about is an extension of the CSV API. In R
I could set a type for every column in a CSV file. I propose adding a similar
function to Python's standard library.
Here it is (Python 3 version):

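import csv

def reader2(csvfile, frame, delimiter=';', **fmtparams):
    # Wrap csv.reader and convert each column with the matching function in frame.
    reader = csv.reader(csvfile, delimiter=delimiter, **fmtparams)
    for row in reader:
        # Convert only the columns that both the row and frame cover.
        l = min(len(row), len(frame))
        yield [frame[idx](row[idx]) for idx in range(l)]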

This is a generator function that converts every column to its associated
type.
The frame argument must be a tuple or list of functions, each of which is
used to convert the value at the same position in a row of the CSV file. So
frame looks like a list of types )))
By default it uses the ';' delimiter so that float values can be converted.

As a sample, suppose you have a CSV file like:
Any spam...; 1; 2.0; 3

I've saved it to "sample.csv" :)

If you use the reader function from the standard "csv" module, you get rows
as lists of strings :(
>>> reader = csv.reader(open("sample.csv"), delimiter=";")
>>> print(next(reader))
['Any spam...', ' 1', ' 2.0', ' 3']

That's not bad in certain situations. But with the "reader2" function you can
get a list with the necessary types:

>>> reader = reader2(open("sample.csv"), (str, int, float, int))
>>> print(next(reader))
['Any spam...', 1, 2.0, 3]
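
A self-contained way to try the idea without creating a file. The function
is repeated here so the snippet runs on its own, and io.StringIO stands in
for the sample file; both are conveniences of this sketch rather than part of
the proposal.

```python
import csv
import io


def reader2(csvfile, frame, delimiter=';', **fmtparams):
    """Yield rows from csvfile with each column converted by the callable in frame."""
    reader = csv.reader(csvfile, delimiter=delimiter, **fmtparams)
    for row in reader:
        # Columns beyond the shorter of row and frame are dropped.
        n = min(len(row), len(frame))
        yield [frame[idx](row[idx]) for idx in range(n)]


# In-memory stand-in for the one-line sample file.
sample = io.StringIO("Any spam...; 1; 2.0; 3\n")
rows = list(reader2(sample, (str, int, float, int)))
print(rows)
```

The leading spaces in the numeric fields are harmless because int() and
float() ignore surrounding whitespace; a field that cannot be converted would
raise ValueError from the generator.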

Now you can work with the items without extra conversions.
I think it would be good to add this function to the standard library. I've
already used it many times.
This function could be useful for many people who work with CSV files,
and I suppose it conforms to the "batteries included" philosophy.

What do you think about this extension?
Is it possible to add this function to the standard library, or to add the
same behavior to the standard "reader" function in Python's "csv" module?

Best Regards,
Denis Kolodin
Russia, Tambov


Re: [Python-Dev] Tuning Python dicts

2010-04-10 Thread Martin v. Löwis
> Any other advice would also be helpful.

I may be missing the point, but ISTM that the assumption of this
approach is that there are often collisions in the hash table. I think
that assumption is false; at least, I recommend to validate that
assumption before proceeding.

There are, of course, cases where dicts do show collisions (namely when
all keys hash equal), however, I'm uncertain whether the approach would
help in that case.

Regards,
Martin


Re: [Python-Dev] [RELEASED] 2.7 beta 1

2010-04-10 Thread Antoine Pitrou
Benjamin Peterson writes:
> 
> On behalf of the Python development team, I'm merry to announce the first beta
> release of Python 2.7.

Congratulations, and thanks for your patience :)





Re: [Python-Dev] Tuning Python dicts

2010-04-10 Thread Reid Kleckner
On Sat, Apr 10, 2010 at 2:57 PM, "Martin v. Löwis"  wrote:
>> Any other advice would also be helpful.
>
> I may be missing the point, but ISTM that the assumption of this
> approach is that there are often collisions in the hash table. I think
> that assumption is false; at least, I recommend to validate that
> assumption before proceeding.

It's just an experiment for a class, not something I am (yet)
seriously thinking about contributing back to CPython.  I think my
chances of improving over the current implementation are slim.  I do
not have that much hubris.  :)  I would just rather do experimental
rather than theoretical work with data structures.

I think you're right about the number of collisions, though.  CPython
dicts use a pretty low load factor (2/3) to keep collision counts
down.  One of the major benefits cited in the paper is the ability to
maintain performance in the face of higher load factors, so I may be
able to bump up the load factor to save memory.  This would increase
collisions, but then that wouldn't matter, because resolving them
would only require looking within two consecutive cache lines.
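
To put rough numbers on the load-factor trade-off, here is a sketch using
Knuth's classic estimates for plain linear probing. Those formulas are an
assumption brought in for illustration: they model neither CPython's probing
scheme nor hopscotch hashing exactly, but they show how quickly probe counts
grow as a table fills up.

```python
def linear_probe_costs(load):
    """Estimated probes per lookup for linear probing at load factor 0 <= load < 1."""
    hit = 0.5 * (1 + 1 / (1 - load))        # successful search
    miss = 0.5 * (1 + 1 / (1 - load) ** 2)  # unsuccessful search
    return hit, miss


for load in (2 / 3, 0.9):
    hit, miss = linear_probe_costs(load)
    print(f"load={load:.2f}: ~{hit:.1f} probes per hit, ~{miss:.1f} per miss")
```

Under this model a load factor of 2/3 costs about 2 probes per hit and 5 per
miss, while 0.9 costs about 5.5 and 50.5. Whether those extra probes matter
depends on how cheap each one is, which is the cache-line argument made
above.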

> There are, of course, cases where dicts do show collisions (namely when
> all keys hash equal), however, I'm uncertain whether the approach would
> help in that case.

Yes, in fact, hopscotch hashing fails completely as I mentioned at the
end of my last message.

Reid


Re: [Python-Dev] Tuning Python dicts

2010-04-10 Thread Antoine Pitrou
Reid Kleckner writes:
> 
> I think you're right about the number of collisions, though.  CPython
> dicts use a pretty low load factor (2/3) to keep collision counts
> down.  One of the major benefits cited in the paper is the ability to
> maintain performance in the face of higher load factors, so I may be
> able to bump up the load factor to save memory.  This would increase
> collisions, but then that wouldn't matter, because resolving them
> would only require looking within two consecutive cache lines.

Why wouldn't it matter? Hash collisions still involve more CPU work, even though
if you're not access memory a lot.


Antoine.




Re: [Python-Dev] Tuning Python dicts

2010-04-10 Thread Antoine Pitrou
Antoine Pitrou writes:
> 
> Why wouldn't it matter? Hash collisions still involve more CPU work, even though
> if you're not access memory a lot.

(sorry for the awful grammar in the last sentence)





Re: [Python-Dev] Tuning Python dicts

2010-04-10 Thread Reid Kleckner
On Sat, Apr 10, 2010 at 4:40 PM, Antoine Pitrou  wrote:
> Reid Kleckner writes:
>>
>> I think you're right about the number of collisions, though.  CPython
>> dicts use a pretty low load factor (2/3) to keep collision counts
>> down.  One of the major benefits cited in the paper is the ability to
>> maintain performance in the face of higher load factors, so I may be
>> able to bump up the load factor to save memory.  This would increase
>> collisions, but then that wouldn't matter, because resolving them
>> would only require looking within two consecutive cache lines.
>
> Why wouldn't it matter? Hash collisions still involve more CPU work, even though
> if you're not access memory a lot.

So we know for sure that extra CPU work to avoid cache misses is a big
win, but it's never clear whether or not any given memory access will
be a miss.  Because Python's access patterns are rarely random, it may
turn out that all of the elements it accesses are in cache and the
extra CPU work dominates.  If you have a random access pattern over a
large dataset, the dictionary will not fit in cache, and saving random
memory accesses at the expense of CPU will be a win.

Reid


Re: [Python-Dev] [RELEASED] 2.7 beta 1

2010-04-10 Thread average
> On behalf of the Python development team, I'm merry to announce the first beta
> release of Python 2.7.
>
> Python 2.7 is scheduled (by Guido and Python-dev) to be the last major version
> in the 2.x series.  Though more major releases have not been absolutely ruled
> out, it's likely that the 2.7 release will have an extended period of maintenance for
> the 2.x series.

May I propose that the developers consider keeping this release *beta*
until after the present Python moratorium?  That is, don't let it be
marked as *official* until after, say, Python 3.3.

There are so many features taken from 3.0 that I fear that it will
postpone its adoption interminably (it is, in practice, treated as
"beta" software itself).  By making it doctrine that it won't be
official until the next "major" Python release, it will encourage
those who are able, to just make the jump to 3.0, while those who
cannot will have the subtle pressure to make the shift, however
gradual.  Additionally, it will give the community further incentive
to make Python3 all that it was intended to be.  Personally, the
timing of v3 prevented me from fully participating in that effort,
and, not ignoring the work of those who did contribute, I think many
of us feel that it has not reached its potential.

Just a small suggestion... .. .

marcos


Re: [Python-Dev] PEP 3147, __cached__, and PyImport_ExecCodeModuleEx()

2010-04-10 Thread Barry Warsaw
On Apr 10, 2010, at 08:28 PM, Georg Brandl wrote:

>Am 10.04.2010 18:12, schrieb Guido van Rossum:
>> On Sat, Apr 10, 2010 at 8:58 AM, Barry Warsaw  wrote:
>>> On Apr 09, 2010, at 05:41 PM, Guido van Rossum wrote:
>>>
>>>>On Fri, Apr 9, 2010 at 3:54 PM, Paul Moore  wrote:
>>>>> Would it be better to name this one _PyImport_ExecCodeModuleExEx (with
>>>>> an underscore) so that we *don't* need to create an ExExEx version in
>>>>> future? (Sorry, Barry :-))
>>>>
>>>>I don't care about what name you pick, and my ExEx proposal was meant
>>>>to include half a wink, but http://docs.python.org/c-api/import.html
>>>>makes it clear that PyImport_ExecCodeModuleEx() is far from private!
>>>>(I don't know where Barry got that idea.)
>>>
>>> Note that it's the non-Ex version that's documented here.  AFAICT,
>>> PyImport_ExecCodeModuleEx() is not documented.  I'm happy to fix that in my
>>> branch as well.
>> 
>> Ah, true. And yes, please.
>
>*cough* http://docs.python.org/dev/c-api/import.html#PyImport_ExecCodeModuleEx
>Not backported to stable yet, though.

'dev' is Python 2.7 (i.e. trunk) right?  I was looking at the py3k url, which
it hasn't been ported to either yet I think.

-Barry




Re: [Python-Dev] [RELEASED] 2.7 beta 1

2010-04-10 Thread Melton Low
On Sat, Apr 10, 2010 at 4:13 PM, average  wrote:

> > On behalf of the Python development team, I'm merry to announce the first beta
> > release of Python 2.7.
> >
> > Python 2.7 is scheduled (by Guido and Python-dev) to be the last major version
> > in the 2.x series.  Though more major releases have not been absolutely ruled
> > out, it's likely that the 2.7 release will have an extended period of maintenance for
> > the 2.x series.
>
> May I propose that the developers consider keeping this release *beta*
> until after the present Python moratorium?  That is, don't let it be
> marked as *official* until after, say, Python 3.3.
>
> There are so many features taken from 3.0 that I fear that it will
> postpone its adoption interminably (it is, in practice, treated as
> "beta" software itself).  By making it doctrine that it won't be
> official until the next "major" Python release, it will encourage
> those who are able, to just make the jump to 3.0, while those who
> cannot will have the subtle pressure to make the shift, however
> gradual.  Additionally, it will give the community further incentive
> to make Python3 all that it was intended to be.  Personally, the
> timing of v3 prevented me from fully participating in that effort,
> and, not ignoring the work of those who did contribute, I think many
> of us feel that it has not reached its potential.
>
> Just a small suggestion... .. .
>
> marcos
> --
> http://mail.python.org/mailman/listinfo/python-list
>

I disagree.  2.7 should go GA as soon as the developers deem it stable.

Those who don't need 3rd party packages will no doubt migrate to 3.x.  Those
who require 3rd party packages not yet ported to 3.x will want to use 2.7.
Delaying 2.7 from GA doesn't change that reality.  I myself would want to
use the features backported to 2.7 as a way to prepare for migration as soon
as those 3rd party packages are ported to 3.x.

Mel


Re: [Python-Dev] Tuning Python dicts

2010-04-10 Thread Terry Reedy



>> I may be missing the point, but ISTM that the assumption of this
>> approach is that there are often collisions in the hash table. I think
>> that assumption is false; at least, I recommend to validate that
>> assumption before proceeding.
>
> It's just an experiment for a class, not something I am (yet)
> seriously thinking about contributing back to CPython.  I think my
> chances of improving over the current implementation are slim.  I do
> not have that much hubris.  :)  I would just rather do experimental
> rather than theoretical work with data structures.
>
> I think you're right about the number of collisions, though.  CPython
> dicts use a pretty low load factor (2/3) to keep collision counts
> down.


There was a paper discussing Python dicts at PyCon 2010. I believe it 
included some data on collisions. I posted the link on Python list a 
couple of weeks ago.


Terry Jan Reedy



Re: [Python-Dev] Python-Dev Digest, Vol 81, Issue 31

2010-04-10 Thread Terry Reedy

On 4/10/2010 2:53 PM, Denis Kolodin wrote:


> The first thing I want to talk about is an extension of the CSV API.


I believe speculative proposals like this fit better on the python-list 
or python-ideas list.


tjr
