Re: [Python-Dev] Mercurial migration: progress report (PEP 385)

2009-07-05 Thread Georg Brandl
Martin v. Löwis wrote:
>> We could add another value in the tuple that specifies the VCS:
>> ('CPython', 'branches/release25-maint', '61464', 'svn'). I agree that
>> VCSs are not universally the same, but the concept of a revision is
>> universal.
> 
> Actually, I think that's not the case. For bzr, the usual way of
> identifying a revision is by revision number, which, however, is not
> unique within a project, as each branch will use contiguous integers
> for numbers. There are also unique identifications - so a bzr revision
> has actually two numbers.
> 
> More generally, in a DVCS, it is not possible to access the revision being
> referred to by such a tuple. For sys.subversion, if [0]=='CPython', then
> you could go to svn.python.org. For a DVCS, the revision being
> identified may not be publicly available, or may not live on a host
> that you can infer from your proposed sys.revision.

At least you can tell that if the given hash is not present in the mainline
repo, the build contains something that doesn't come from python.org.
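
A minimal sketch of that check, assuming the four-element tuple proposed
above. The names Revision and is_mainline, the field names, and the
changeset ids are all hypothetical placeholders; nothing here is an
existing sys attribute or hg API.

```python
from collections import namedtuple

# Hypothetical four-field form of the tuple discussed above; the field names
# and the name "Revision" are illustrative, not part of any PEP.
Revision = namedtuple("Revision", ["project", "branch", "revision", "vcs"])


def is_mainline(rev, mainline_ids):
    """Return True if rev.revision is one of the mainline repo's changesets.

    mainline_ids stands in for a lookup against the mainline repository;
    here it is just a set of identifier strings.
    """
    return rev.revision in mainline_ids


# The svn example from the thread, plus a made-up hg build (placeholder id).
svn_build = Revision("CPython", "branches/release25-maint", "61464", "svn")
hg_build = Revision("CPython", "default", "0123456789ab", "hg")
mainline = {"ba9876543210"}  # placeholder ids, not real changesets

print(is_mainline(hg_build, mainline))  # False
```

A False result only says that the build contains something that is not in
the mainline repo; it does not say where that changeset lives.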

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Mercurial migration: progress report (PEP 385)

2009-07-05 Thread Georg Brandl
Martin v. Löwis wrote:
>> It will handle it, for sure, but I think it would all go easier if we
>> could work with stricter subset branches (and leave the effective
>> cherrypicking for the occasional problem).
> 
> So I think the PEP should propose a workflow (or: merge flow) if you
> think we would be better off with a different one.
> 
> In proposing such a workflow, consider these requirements:
> - we currently have four active "maintenance" branches (i.e. where
>   the entire code basis evolves): trunk, 3k, 2.6, and 3.1 (3.0
>   also until this morning).

It seems that there is a consensus to separate the 2.x and 3.x repos,
and that also makes sense to me.

Using named branches for the maintenance branches should be possible,
but I'm not opposed to using cloned branches either.  What I really
want to see is the common-subset approach for maintenance branches.

Changesets still have to be transplanted from 2.x to 3.x or the other
way round.

> - in addition, we have two security branches currently: 2.4 and
>   2.5, although 2.4 will be closed soon.
> - our committers consistently refuse to merge changes across
>   branches themselves, and likely continue to do so unless there
>   is some feature of hg that I missed (e.g. one where merging
>   would happen without any user specifically asking for it)

If the checkin is done in the proper (the maint) branch, at least merging
of that change is automatic whenever someone does a hg merge.

Georg





Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Tarek Ziadé
2009/7/4 Paul Moore :
> 2009/7/4 Paul Moore :
>> 2009/7/3 Tarek Ziadé :
>>> You can give me a bitbucket account so I can give you write access to
>>> the repo,
>>> There are tests as long as you install Nose.
>>
>> How do I get the tests to work? Just running nosetests gives an error
>> (probably because pkgutil is being imported from the stdlib, rather
>> than from this directory).
>>

I just run them from within the directory

>> If I set PYTHONPATH=. then I get errors. I suspect path normalisation
>> (for backslashes) in the zipfile handling.

>
> Actually, the test
>
>    assert_equals(list(dist.get_egginfo_files(local=True)),
>                  [os.path.join(SITE_PKG, 'mercurial-1.0.1.egg-info/PKG_INFO'),
>                   os.path.join(SITE_PKG, 'mercurial-1.0.1.egg-info/RECORD')])
>
> is broken, because the expected value uses slashes, which are *not*
> the local separator on win32.
>
> I've attached a patch.

Applied, thanks (I didn't run them under win32 yet)


>
> But there's 2 comments I'd make (one minor, one major)
>
> Minor one: The tests often seem to be exercising the internal classes,
> not so much the public API, so many of them will probably not be of
> much use to me :-(

I'll add some more tests then, or even user stories.

> I think you need some real-world use cases, with actual sample
> (pseudo-)code, to validate the design here. As things stand, it's both
> confusing and (I suspect) unusable in practice. Sorry, I know that
> sounds negative, but if this isn't to be a source of subtle bugs for
> years to come, it needs to be clarified now. PEP 302 is still hitting
> this type of issue - runpy and importlib have brought out errors and
> holes in the protocol quite recently - even though Just and I went to
> great lengths to try to tease out hidden assumptions up front.

Agreed, the zip case was added afterwards, but in practice the APIs still
treat the files as *filesystem files* located in a container (e.g. a
directory or a zip file) that itself lives somewhere on the filesystem.

"local" in that case is a flag that means "translate a file path into one
expressed in the local filesystem", which no longer makes sense with zip
files. But the real goal is to be able to point out that two distributions
are using the very same file.

Right now PEP 376 and the prototype code handle these two real world use cases:

- browsing regular site-packages-like directories
- browsing site-packages-like directories, that are zipped.

For example:

- I have a "packages.zip" file in /var/, which is also in my sys.path.
  It contains a distribution "foo-1.0" that has the "roman.py" file in
  its root.  So the RECORD file located in "foo-1.0.egg-info" has a line
  starting with "roman.py,..."

- Then if I install docutils 0.5 as a regular filesystem distribution,
  "roman.py" will be added in Python's site-packages,
  and docutils-0.5.egg-info/RECORD will contain "roman.py,..." with
  the same hash.

The local flag will return these paths:

- /var/packages.zip/roman.py   <--- not a "real" path
- /usr/local/lib/python2.6/site-packages/roman.py

So removing the docutils distribution will be doable, because these
paths are different.
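
A minimal sketch of that comparison, assuming the two distributions from
the example. INSTALLED, local_paths and removable are hypothetical names
for illustration only; they are not part of the PEP 376 prototype.

```python
import posixpath

# Hypothetical in-memory stand-in for two installed distributions: each maps
# to its container location and the relative entries listed in its RECORD.
INSTALLED = {
    "foo-1.0": ("/var/packages.zip", ["roman.py"]),
    "docutils-0.5": ("/usr/local/lib/python2.6/site-packages", ["roman.py"]),
}


def local_paths(dist):
    """Join the container location with each RECORD entry of dist."""
    location, entries = INSTALLED[dist]
    return {posixpath.join(location, entry) for entry in entries}


def removable(dist):
    """Paths of dist that no other installed distribution also lists."""
    others = set()
    for name in INSTALLED:
        if name != dist:
            others |= local_paths(name)
    return local_paths(dist) - others


print(sorted(removable("docutils-0.5")))
# ['/usr/local/lib/python2.6/site-packages/roman.py']
```

Both distributions list "roman.py", but the joined paths differ, so neither
one blocks the removal of the other.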

>
> Concrete proposal:
>
> get_metadata_files() - returns slash-separated names, relative to the
> egginfo dir
> get_metadata_file(path) - path must be slash-separated, relative to
> the egginfo dir
>
> get_installed_files - returns the contents of RECORD unaltered
> uses(path) - checks if path is in RECORD
>
> The latter 2 are not very useful in practice - you can't say anything
> about entries in different RECORD files, which is likely the real use
> case you want. Maybe RECORD could have an extra "Location" entry,
> which determines where it exists globally (this would be the directory
> to which the filenames were relative, in the case of filesystem-based
> distributions) and RECORD entries are comparable if the Location
> values in the 2 RECORD files match. That's a lot more complex - but
> depending on what use people expect to make of these 2 APIs, it may be
> justified.

Yes. In practice, if you look at my previous example, even if
"/var/packages.zip/roman.py" isn't a real path, it's enough to compare
RECORD entries globally.

The "Location" entry you are proposing would, in that case, be
"/var/packages.zip".

But do we really need to store it in the RECORD file? Or can't we define
an API that returns two elements:

- the path to the location (in the example: /var/packages.zip or
  /usr/local/lib/python2.6/site-packages)
- the path within the location itself (in the example: roman.py)

A concrete proposal would be to reuse your proposal, but return
tuples with the location as the first member,
e.g. "(location, relative path[s])".

The code that compares paths to see whether they are the same can join
location + relative path[s], while a dedicated function can read the
content of the file (that would be get_data, I guess, if I refer to
PEP 302).

Tarek

Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Tarek Ziadé
On Sat, Jul 4, 2009 at 3:04 PM, Paul Moore wrote:
> 2009/7/3 Tarek Ziadé :
>> 2009/7/3 Paul Moore :
>>> Does this sound sensible? Tarek, would you be OK with me attempting to
>>> modify your prototype to support this protocol? Are there any tests
>>> for PEP 376, so that I can confirm I haven't completely broken
>>> something? If I can, I'll knock up some simple prototype importers for
>>> non-standard examples, and see how they work with all this.
>>
>> Yes that's exactly what I was thinking about after the discussion we
>> had in the other thread. This change would allow much more flexibility.
>
> One important note - I plan on using the fact that DistributionDirMap
> is not public, and hacking it about drastically, or possibly even
> removing it. (After all, the likes of the load method don't make sense
> when you've got sys.meta_path, sys.path_importer_cache and the like to
> consider).
>
> If the loss of any of the "internal" classes is an issue, say so now!

Go for it, please, and let me know if you set up a bitbucket account, so
you can push your commits there directly.


>
> Paul.
>



-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Paul Moore
2009/7/5 Tarek Ziadé :
> Go for it please, and let me know if you set a bitbucket account, so
> you can push your commits in there directly

My bitbucket account is 'pmoore'.

Paul.


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Tarek Ziadé
2009/7/4 Brett Cannon :
>>
>> P.S. +lots on using 'metadata' in the PEP 376 method names rather than
>> the jargon 'egginfo'. Jargon isn't always bad, but using it seems fairly
>> gratuitous in this case.
>
> Ditto from here. Plus I have an aversion to terminology that goes down the
> reptile route instead of the Monty Python route.

If it turns out that we use PEP 302-like loaders, I am also suggesting
that the default metadata directory name used in Distutils be changed
to "DIST_NAME.metadata".

The loader would still work with "DIST_NAME.egg-info" directories, for
compatibility with the existing format in the query APIs, but the
Distutils install command would instead create "DIST_NAME.metadata".


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Tarek Ziadé
On Sun, Jul 5, 2009 at 3:13 PM, Paul Moore wrote:
> 2009/7/5 Tarek Ziadé :
>> Go for it please, and let me know if you set a bitbucket account, so
>> you can push your commits in there directly
>
> My bitbucket account is 'pmoore'.

You're set.


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Paul Moore
2009/7/5 Tarek Ziadé :
> Agreed, the zip case was added afterwards, but in practice the APIs still
> treat the files as *filesystem files* located in a container (e.g. a
> directory or a zip file) that itself lives somewhere on the filesystem.
>
> "local" in that case is a flag that means "translate a file path into one
> expressed in the local filesystem", which no longer makes sense with zip
> files. But the real goal is to be able to point out that two distributions
> are using the very same file.
>
> Right now PEP 376 and the prototype code handle these two real-world use
> cases:
>
> - browsing regular site-packages-like directories
> - browsing site-packages-like directories, that are zipped.
>
> For example:
>
> - I have a "packages.zip" file in /var/, which is also in my sys.path.
> It contains a distribution "foo-1.0" that has the "roman.py" file in
> its root.  So the RECORD file located in "foo-1.0.egg-info" has a line
> starting with "roman.py,..."
>
> - Then if I install docutils 0.5 as a regular filesystem distribution,
> "roman.py" will be added in Python's site-packages.
>  and docutils-0.5.egg-info/RECORD will contain "roman.py,..." with
> the same hash.
>
> The local flag will return these paths:
>
> - /var/packages.zip/roman.py   <--- not a "real" path
> - /usr/local/lib/python2.6/site-packages/roman.py
>
> So removing the docutils distribution will be doable, because these
> paths are different.
>
>>
>> Concrete proposal:
>>
>> get_metadata_files() - returns slash-separated names, relative to the
>> egginfo dir
>> get_metadata_file(path) - path must be slash-separated, relative to
>> the egginfo dir
>>
>> get_installed_files - returns the contents of RECORD unaltered
>> uses(path) - checks if path is in RECORD
>>
>> The latter 2 are not very useful in practice - you can't say anything
>> about entries in different RECORD files, which is likely the real use
>> case you want. Maybe RECORD could have an extra "Location" entry,
>> which determines where it exists globally (this would be the directory
>> to which the filenames were relative, in the case of filesystem-based
>> distributions) and RECORD entries are comparable if the Location
>> values in the 2 RECORD files match. That's a lot more complex - but
>> depending on what use people expect to make of these 2 APIs, it may be
>> justified.
>
> Yes,
> In practice, if you look at my previous example, even if
> "/var/packages.zip/roman.py" isn't a
> real path, it's enough to compare RECORD entries globally.
>
> The "Location" entry you are proposing in that case, would be
> "/var/packages.zip".
>
> But do we really need to store it the RECORD  ? Or can't we define an
> API that returns
> two elements :
>
> - the path to the location (in the example: /var/packages.zip or
> /usr/local/lib/python2.6/site-packages)
> - the path within the location itself (in the example: roman.py)
>
> A concrete proposal would be to take back your proposal, but return
> tuples with the location as the first member.
> e.g. "(location, relative path[s])"

That sounds reasonable. So we can forget the "local" parameter, and
return a tuple:

- absolute location of the container (directory, zipfile or whatever
containing the egginfo file) as a filesystem path in canonical native
form (where it's filesystem based) or as an opaque token for the odd
cases (frozen modules, for example) where a filesystem location isn't
available.
- entry from the RECORD file, as a slash-separated filename relative
to the root of the container.
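
A rough sketch of such a tuple, under stated assumptions: the function
names and the filesystem_based flag are hypothetical, and "canonical native
form" is read here as os.path.normcase applied to an absolute path.

```python
import os


def container_location(container, filesystem_based=True):
    """Canonical native path for filesystem containers, else an opaque token."""
    if filesystem_based:
        # abspath also collapses ".." segments; normcase folds case and
        # separators on platforms where that is meaningful.
        return os.path.normcase(os.path.abspath(container))
    return container  # e.g. a token for frozen modules, passed through as-is


def installed_file(container, record_entry, filesystem_based=True):
    """Build the (location, slash-separated RECORD entry) tuple."""
    return (container_location(container, filesystem_based), record_entry)


a = installed_file(os.path.join("pkgs", "..", "pkgs"), "roman.py")
b = installed_file("pkgs", "roman.py")
print(a == b)  # True: both spellings name the same container
```

Two entries then refer to the same file exactly when their tuples compare
equal, with no need to join them into a path that may not really exist.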

> The code that is comparing paths to see if they are the same can join
> location+relative path[s], while we can provide in a dedicated function
> something to read the content of the file (that would be get_data I guess,
> if I refer to PEP 302)

Unfortunately, get_data loads data files located within a *package*,
using a name relative to the package directory. You can't get at the
metadata of a *distribution* like that.

But if you're using get_installed_files(), why would you then want to
read the files? What exactly would you *use* get_installed_files for
which would then leave you needing to read the files? If it's to check
they haven't changed (by comparing md5 values) you're doing that to
uninstall, so that's the responsibility of the uninstall function.

Again, it's a question of what is a public API, and what is the use
case it's designed for.

I'm currently writing a SQLite importer, which will allow me to store
"files" in any sort of database tables I want, so I can build in some
nice pathological behaviour. That should tease out some awkward corner
cases :-)

Paul


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Tarek Ziadé
On Sun, Jul 5, 2009 at 3:27 PM, Paul Moore wrote:
>>
>> A concrete proposal would be to take back your proposal, but return
>> tuples with the location as the first member.
>> e.g. "(location, relative path[s])"
>
> That sounds reasonable. So we can forget the "local" parameter, and
> return a tuple:
>
> - absolute location of the container (directory, zipfile or whatever
> containing the egginfo file) as a filesystem path in canonical native
> form (where it's filesystem based) or as an opaque token for the odd
> cases (frozen modules, for example) where a filesystem location isn't
> available.
> - entry from the RECORD file, as a slash-separated filename relative
> to the root of the container.

exactly,

>
> But if you're using get_installed_files(), why would you then want to
> read the files? What exactly would you *use* get_installed_files for
> which would then leave you needing to read the files? If it's to check
> they haven't changed (by comparing md5 values) you're doing that to
> uninstall, so that's the responsibility of the uninstall function.
>
> Again, it's a question of what is a public API, and what is the use
> case it's designed for.

Right. These APIs were created for third-party package managers.
Uninstallation is one use case for a package manager; I have no other
use case in mind.

>
> I'm currently writing a SQLite importer, which will allow me to store
> "files" in any sort of database tables I want, so I can build in some
> nice pathological behaviour. That should tease out some awkward corner
> cases :-)

Sounds good.

Semi-related: even if the files themselves are in the filesystem,
having a sqlite db to index the list of installed distributions makes
a good cache solution to reduce the disk I/O and speed up the query
functions.

So maybe we could use a disk-based cache for site-packages-like
directories in the form of a sqlite db. That's what I am experimenting
with on my side.
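
A minimal sketch of what such an index could look like, using the stdlib
sqlite3 module. The table layout and the names build_index and file_users
are assumptions for illustration, not the experiment mentioned above.

```python
import sqlite3


def build_index(conn, distributions):
    """(Re)create the cache table from (name, location, record_entries) items."""
    conn.execute("DROP TABLE IF EXISTS installed_files")
    conn.execute(
        "CREATE TABLE installed_files (dist TEXT, location TEXT, path TEXT)"
    )
    rows = [
        (name, location, path)
        for name, location, paths in distributions
        for path in paths
    ]
    conn.executemany("INSERT INTO installed_files VALUES (?, ?, ?)", rows)
    conn.commit()


def file_users(conn, location, path):
    """Names of the distributions whose RECORD lists path under location."""
    cursor = conn.execute(
        "SELECT dist FROM installed_files WHERE location = ? AND path = ? "
        "ORDER BY dist",
        (location, path),
    )
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")  # a real cache would use a file on disk
build_index(conn, [
    ("foo-1.0", "/var/packages.zip", ["roman.py"]),
    ("docutils-0.5", "/usr/local/lib/python2.6/site-packages", ["roman.py"]),
])
print(file_users(conn, "/var/packages.zip", "roman.py"))  # ['foo-1.0']
```

The open question with any such cache is invalidation: it has to be rebuilt
whenever a distribution is installed or removed behind its back.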


>
> Paul
>



-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] Mercurial migration: progress report (PEP 385)

2009-07-05 Thread Stephen J. Turnbull
Georg Brandl writes:

 > What I really want to see is the common-subset approach for
 > maintenance branches.

IMO, this is unfortunately unlikely to work as you seem to expect, for
two technical reasons and for a social reason.  The first technical
reason is that a maintenance branch really is a branch, not a subset.
The fix that is appropriate for a maintenance branch is often
inappropriate in detail for the mainline, and vice versa.  But
Mercurial doesn't care about "in detail" vs. "in design"; both will
result in a conflict the first time that branch is merged.

Second, some fixes for the maintenance branch will simply not be
appropriate for the development branch, as the problem has already
been fixed "en passant" by some other change.  This can probably be
handled by doing what git calls an "ours" merge to make it look like
the unnecessary patch is an ancestor of the tip, even though no code
was actually applied to the mainline.  However, this kind of operation
is somewhat delicate, and even if it's mostly scripted, it's likely
to be somewhat unreliable for people who don't use the script very
often ... which leads to the social problem:

 > > - our committers consistently refuse to merge changes across
 > >   branches themselves, and likely continue to do so unless there
 > >   is some feature of hg that I missed (e.g. one where merging
 > >   would happen without any user specifically asking for it)
 > 
 > If the checkin is done in the proper (the maint) branch, at least merging
 > of that change is automatic whenever someone does a hg merge.

Maybe.  But I see two problematic sides to this from the social point
of view, which is the same old problem really.

First, to the extent that it doesn't run into the technical problems,
it encourages people to *not* review patches for each branch they are
committed to.  "It will get automerged anyway."  Anything that
discourages review is a bad thing.  It will cause the development
branch to "age" faster because of accumulation of crufty patches that
are good enough as minimally invasive fixes for bugs in a maintenance
branch, but which should be more robust for the development branch.

I think you will also find that some people are not particularly
interested in fixing the maintenance branch for some of their patches
for exactly the same reasons they currently don't, and they will
continue to refuse to do the work to commit in the maintenance branch
first.  Especially after the first time they run into one of the
technical problems described above.

In the end, any policy to encourage a "subset" policy is likely to end
up as a burden on the same people who currently do the cross branch
merging.




[Python-Dev] PEP 376 - get_egginfo_files

2009-07-05 Thread Paul Moore
The PEP says:

"""
get_egginfo_files(local=False) -> iterator of paths

Iterates over the RECORD entries and return paths for each line if the
path is pointing a file located in the .egg-info directory or one of
its subdirectory.
"""

Should this method really only return filenames noted in the RECORD
file? Would it not be better for it to iterate over *all* files in the
.egg-info directory? I understand that there shouldn't, in practice,
be any files in that directory *not* mentioned in the RECORD file, but
given that we already have get_installed_files to read the RECORD
file, I would imagine it's better for this method to do something more
than filter the return values from get_installed_files.

Actually, on that note, consider the pkgutil functions:

def get_distribution(name):
    for d in get_distributions():
        if d.name == name:
            return d
    return None

def get_file_users(path):
    for d in get_distributions():
        if d.uses(path):
            yield d

These don't actually add much to the API. While I can see the
advantage of having them as convenience methods, it might be worth
pointing out in the PEP that that's all they are.

Similarly, how valuable is Distribution.name, given that it's the same
as Distribution.metadata.name? I'm probably just going to make it a
property -

@property
def name(self):
    return self.metadata.name

but that's actually slower than just using self.metadata.name
directly, so it's a bit of an attractive nuisance, and I'd prefer it
if it wasn't present. (For the PEP 302 stuff, I'm making metadata a
cached property, so name *has* to be a property to ensure that the
metadata cache is managed properly...)
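
A minimal sketch of one way to read "cached property" here: the metadata is
loaded on first access and reused afterwards, and name goes through it. The
Metadata class and the load_metadata callable are hypothetical stand-ins,
not the prototype's actual classes.

```python
class Metadata(object):
    """Hypothetical stand-in for the parsed distribution metadata."""

    def __init__(self, name):
        self.name = name


class Distribution(object):
    def __init__(self, load_metadata):
        # load_metadata is an assumed callable that returns a Metadata
        # object, e.g. by asking a PEP 302 style loader for the data.
        self._load_metadata = load_metadata
        self._metadata = None

    @property
    def metadata(self):
        # Load on first access only, then reuse the cached object.
        if self._metadata is None:
            self._metadata = self._load_metadata()
        return self._metadata

    @property
    def name(self):
        # Going through the property keeps the cache as the single source.
        return self.metadata.name


calls = []


def load():
    calls.append(1)  # count how often the metadata is really loaded
    return Metadata("docutils")


dist = Distribution(load)
print(dist.name, dist.name, len(calls))  # docutils docutils 1
```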

Paul.


[Python-Dev] 2.6.3 unittest change breaks nose (issue 6418)

2009-07-05 Thread jason pellerin
Bringing python-dev into the discussion at Barry's request. The
summary is that a recent change to unittest.TestProgram breaks nose by
moving self.testRunner initialization from its old home in
TestProgram.runTests to TestProgram.__init__. The very small patch
attached to the ticket moves it back to runTests.
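
As a general illustration of why the place where a default is filled in
matters to subclasses, here is a sketch with entirely hypothetical classes.
It is not the code of unittest or nose, and the ticket remains the
reference for what actually broke.

```python
class DefaultRunner(object):
    name = "default"


class PluginRunner(object):
    name = "plugin"


class EagerProgram(object):
    """Fills in the default runner in __init__, before run_tests is called."""

    def __init__(self, runner=None):
        self.runner = DefaultRunner if runner is None else runner
        self.used = self.run_tests()

    def run_tests(self):
        return self.runner.name


class LazyProgram(object):
    """Leaves runner as None until run_tests, so overrides still see None."""

    def __init__(self, runner=None):
        self.runner = runner
        self.used = self.run_tests()

    def run_tests(self):
        if self.runner is None:
            self.runner = DefaultRunner
        return self.runner.name


def with_plugin_runner(base):
    """Subclass base so that a missing runner is replaced by PluginRunner."""

    class Program(base):
        def run_tests(self):
            if self.runner is None:
                self.runner = PluginRunner
            return base.run_tests(self)

    return Program


print(with_plugin_runner(LazyProgram)().used)   # plugin
print(with_plugin_runner(EagerProgram)().used)  # default
```

With the eager base class, the subclass never sees None and its own default
is silently skipped.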

Here's the ticket: http://bugs.python.org/issue6418
And a link to the testing in python list discussion:

http://lists.idyll.org/pipermail/testing-in-python/2009-July/002032.html

JP (primary author of nose)




Re: [Python-Dev] Mercurial migration: progress report (PEP 385)

2009-07-05 Thread Martin v. Löwis
>> In proposing such a workflow, consider these requirements:
> 
> It seems that there is a consensus to separate the 2.x and 3.x repos,
> and that also makes sense to me.

(I think) I wasn't primarily talking about the representation of
branches in hg - to that, I fully trust recommendations from hg users
and experts.

What will need debate and discussion in the PEP is the workflow, i.e.
the order in which changes should be applied to the branches.

> If the checkin is done in the proper (the maint) branch, at least merging
> of that change is automatic whenever someone does a hg merge.

I probably don't fully understand, but that seems to imply a workflow
where all changes made to one branch should also automatically be
integrated into a different branch. I'm curious as to how such a
mechanism can apply to Python.

Regards,
Martin


Re: [Python-Dev] PEP 376 - get_egginfo_files

2009-07-05 Thread P.J. Eby

At 05:26 PM 7/5/2009 +0100, Paul Moore wrote:

> def get_distribution(name):
>     for d in get_distributions():
>         if d.name == name:
>             return d
>     return None


Btw, this is broken code anyway, because it's not handling
case-insensitivity or name canonicalization.  (I've mentioned these
issues previously on the distutils-sig.)
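
A minimal sketch of a lookup that compares canonical names. The rule used
in canonical_name (lowercase, with runs of anything other than letters,
digits and dots collapsed to "-") is an assumption for illustration, as are
the Dist tuple and the explicit distributions argument; the thread does not
spell out the exact rule.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a Distribution object with just a name.
Dist = namedtuple("Dist", ["name"])


def canonical_name(name):
    """Assumed rule: collapse unusual characters to '-' and lowercase."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


def get_distribution(name, distributions):
    """Return the first distribution whose canonical name matches, or None."""
    wanted = canonical_name(name)
    for d in distributions:
        if canonical_name(d.name) == wanted:
            return d
    return None


installed = [Dist("docutils"), Dist("My_Package")]
print(get_distribution("my-package", installed))  # Dist(name='My_Package')
```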




Re: [Python-Dev] PEP 376 - get_egginfo_files

2009-07-05 Thread Tarek Ziadé
2009/7/5 Paul Moore :
> The PEP says:
>
> """
> get_egginfo_files(local=False) -> iterator of paths
>
> Iterates over the RECORD entries and return paths for each line if the
> path is pointing a file located in the .egg-info directory or one of
> its subdirectory.
> """
>
> Should this method really only return filenames noted in the RECORD
> file? Would it not be better for it to iterate over *all* files in the
> .egg-info directory?
> I understand that there shouldn't, in practice,
> be any files in that directory *not* mentioned in the RECORD file, but
> given that we already have get_installed_files to read the RECORD
> file, I would imagine it's better for this method to do something more
> than filter the return values from get_installed_files.

I don't see a use case for getting more out of get_egginfo_files.
I still find it useful for iterating over metadata files.

Maybe we could remove it and add a filter option for get_installed_files.
A callable that gets each visited file and returns True or False to
filter them out:

  get_installed_files(path, filter=callable)

And then provide an "egginfo_files" callable to get what we have with
get_egginfo_files:

  get_installed_files(path, filter=egginfo_files)
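
A rough sketch of that filter idea, under stated assumptions: this version
takes the RECORD lines directly instead of a path, it treats a True result
from the callable as "keep this entry", and the "<hash>" and "<size>"
fields are placeholders rather than real values.

```python
import csv


def get_installed_files(record_lines, filter=None):
    """Yield the path column of each RECORD line accepted by filter."""
    for row in csv.reader(record_lines):
        if not row:
            continue  # skip blank lines
        path = row[0]
        if filter is None or filter(path):
            yield path


def egginfo_files(path):
    """Accept only entries located under an .egg-info directory."""
    return any(part.endswith(".egg-info") for part in path.split("/")[:-1])


record = [
    "roman.py,<hash>,<size>",
    "foo-1.0.egg-info/PKG-INFO,<hash>,<size>",
    "foo-1.0.egg-info/RECORD,,",
]
print(list(get_installed_files(record, filter=egginfo_files)))
# ['foo-1.0.egg-info/PKG-INFO', 'foo-1.0.egg-info/RECORD']
```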


>
> Actually, on that note, consider the pkgutil functions:
>
> def get_distribution(name):
>    for d in get_distributions():
>        if d.name == name:
>            return d
>    return None
>
> def get_file_users(path):
>    for d in get_distributions():
>        if d.uses(path):
>            yield d
>
> These don't actually add much to the API. While I can see the
> advantage of having them as convenience methods, it might be worth
> pointing out in the PEP that that's all they are.

Sure,

>
> Similarly, how valuable is Distribution.name, given that it's the same
> as Distribution.metadata.name? I'm probably just going to make it a
> property -

It's just for convenience, since this metadata field is also the
identifier of the distribution.

>
> @property
> def name(self):
>    return self.metadata.name

I don't think this adds any value, since self.metadata is a read-only instance,
that gets loaded once when the Distribution object is created.


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread P.J. Eby

At 03:13 PM 7/5/2009 +0200, Tarek Ziadé wrote:

> The loader would still work with "DIST_NAME.egg-info" directories for
> compatibility with
> existing format in the query APIs, but the Distutils install command
> would rather create  "DIST_NAME.metadata"


Note that this would then break setuptools without adding any 
benefit; ".metadata" is less precise and less unique than 
'.egg-info'.  If you want a clearer name, '.pydist' or some such 
would at least be reasonably specific.  (It'd still have a backward 
compatibility problem, but at least then there'd be some benefit to 
the name change.)




Re: [Python-Dev] PEP 376 - get_egginfo_files

2009-07-05 Thread Tarek Ziadé
2009/7/5 P.J. Eby :
> At 05:26 PM 7/5/2009 +0100, Paul Moore wrote:
>>
>> def get_distribution(name):
>>    for d in get_distributions():
>>        if d.name == name:
>>            return d
>>    return None
>
> Btw, this is broken code anyway, because it's not handling
> case-insensitivity or name canonicalization.  (I've mentioned these issue
> previously on the distutils-sig.)

Yes, thanks, we need to fix that. The case-insensitivity and name
canonicalization functions already exist; they just need to be used
in that function too.


>
>



-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Tarek Ziadé
2009/7/5 P.J. Eby :
> At 03:13 PM 7/5/2009 +0200, Tarek Ziadé wrote:
>>
>> The loader would still work with "DIST_NAME.egg-info" directories for
>> compatibility with
>> existing format in the query APIs, but the Distutils install command
>> would rather create  "DIST_NAME.metadata"
>
> Note that this would then break setuptools without adding any benefit;
> ".metadata" is less precise and less unique than '.egg-info'.

But if it's based on PEP 302 protocols and if the pkgutil code works
with the sys.meta_path hook,
setuptools could then provide its loader, based on its EggFormats and
act as a provider without being broken.

In that case, Distutils could provide a standard loader, with the
change I mentioned.


> If you want a
> clearer name, '.pydist' or some such would at least be reasonably specific.
>  (It'd still have a backward compatibility problem, but at least then
> there'd be some benefit to the name change.)

I do find "DIST_NAME.metadata" well-named and specific. But I guess that's just
bikeshedding :)


Re: [Python-Dev] PEP 376 - get_egginfo_files

2009-07-05 Thread Paul Moore
2009/7/5 P.J. Eby :
> At 05:26 PM 7/5/2009 +0100, Paul Moore wrote:
>>
>> def get_distribution(name):
>>    for d in get_distributions():
>>        if d.name == name:
>>            return d
>>    return None
>
> Btw, this is broken code anyway, because it's not handling
> case-insensitivity or name canonicalization.  (I've mentioned these issues
> previously on the distutils-sig.)

Fair point. (Although I don't recall your distutils-sig posting, so
I'm not sure what you mean by "name canonicalisation").

Note that even on case insensitive filesystems, module/package names
are handled case sensitively. I would be happy to see distribution
names handled the same (although I have no vested interest either
way).

Is there code around to handle filename matching based on the case
sensitivity of the filesystem? (My understanding is that there isn't,
and programs like Mercurial play fancy games to determine if a
filesystem is case sensitive before doing tests). Of course, if you're
OK with an inaccurate but simple choice based on OS, it would probably
be OK to use os.path.normcase.
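
A minimal sketch of that simple OS-based choice. The helper name
same_filename is hypothetical; os.path.normcase is the only real API used.

```python
import os.path


def same_filename(a, b):
    # os.path.normcase lowercases names (and converts '/' to '\\') on
    # Windows and returns them unchanged on POSIX, so this follows the
    # OS convention rather than the behaviour of the actual filesystem.
    return os.path.normcase(a) == os.path.normcase(b)
```

The inaccuracy is visible on a case-insensitive filesystem mounted on a
POSIX system: the comparison there stays case sensitive.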

Paul.


Re: [Python-Dev] PEP 376 - get_egginfo_files

2009-07-05 Thread Paul Moore
2009/7/5 Tarek Ziadé :
> 2009/7/5 Paul Moore :
>> The PEP says:
>>
>> """
>> get_egginfo_files(local=False) -> iterator of paths
>>
>> Iterates over the RECORD entries and returns paths for each line if the
>> path is pointing to a file located in the .egg-info directory or one of
>> its subdirectories.
>> """
>>
>> Should this method really only return filenames noted in the RECORD
>> file? Would it not be better for it to iterate over *all* files in the
>> .egg-info directory?
>> I understand that there shouldn't, in practice,
>> be any files in that directory *not* mentioned in the RECORD file, but
>> given that we already have get_installed_files to read the RECORD
>> file, I would imagine it's better for this function to do something more
>> than filter the return values from get_installed_files.
>
> I don't see a use case for having more out of get_egginfo_files.
> I still find it useful for iterating over metadata files.
>
> Maybe we could remove it and add a filter option for get_installed_files.
> A callable that gets each visited file and returns True or False to
> filter them out:
>
>  get_installed_files(path, filter=callable)
>
> And then provide a "egginfo_files" callable to get what we have with
> get_egginfo_files :
>
>  get_installed_files(path, filter=egginfo_files)

-1. Unnecessary generalisation. Let's stick with the 2 functions as documented.
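
For reference, the documented behaviour of get_egginfo_files amounts to a
filter over the RECORD entries. A rough sketch under stated assumptions:
the signatures are simplified and hypothetical (the PEP defines these as
methods taking a local argument), RECORD is assumed to be CSV with the path
in the first field, and paths are assumed to use '/' separators.

```python
import csv


def get_installed_files(record_lines):
    # Assumption: RECORD is CSV and the path is the first field of each row.
    for row in csv.reader(record_lines):
        if row:
            yield row[0]


def get_egginfo_files(record_lines, egginfo_dir):
    # Keep only the entries that live inside the metadata directory.
    prefix = egginfo_dir.rstrip("/") + "/"
    for path in get_installed_files(record_lines):
        if path.startswith(prefix):
            yield path


# Made-up RECORD content, for illustration only.
record = [
    "foo/__init__.py,abc,10",
    "foo-1.0.egg-info/PKG-INFO,def,20",
    "foo-1.0.egg-info/RECORD,,",
]
egginfo = list(get_egginfo_files(record, "foo-1.0.egg-info"))
```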

[...]
>> Similarly, how valuable is Distribution.name, given that it's the same
>> as Distribution.metadata.name? I'm probably just going to make it a
>> property -
>
> It's just for convenience, since this metadata field is also the
> identifier of the distribution.
>
>>
>> @property
>> def name(self):
>>    return self.metadata.name
>
> I don't think this adds any value, since self.metadata is a read-only
> instance that gets loaded once when the Distribution object is created.

... not any more :-)

Your zipfile handling was horribly broken on Windows, thanks to the
usual slash/backslash confusion. The sanest way to fix it seemed to me
to be to load the metadata lazily, rather than in the __init__ (as
otherwise, the zipfile and filesystem implementations end up not being able
to share any code). Once that's done, the name attribute has to *also*
handle lazy-loading of the metadata, and the above property is the
easiest way to do this.
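
A minimal sketch of the lazy-loading pattern described above. It is not the
actual implementation: the Metadata class, the _load_metadata hook and the
loads counter are hypothetical and exist only to make the example
self-contained.

```python
class Metadata:
    # Hypothetical stand-in for the real metadata class.
    def __init__(self, name):
        self.name = name


class Distribution:
    def __init__(self, path):
        self.path = path
        self._metadata = None  # nothing is read at construction time
        self.loads = 0         # counts loads, for demonstration only

    def _load_metadata(self):
        # Hypothetical hook: a filesystem or zipfile backend would
        # override this with its own way of reading the metadata.
        self.loads += 1
        return Metadata(self.path.split("-")[0])

    @property
    def metadata(self):
        # Load on first access, then reuse the cached instance.
        if self._metadata is None:
            self._metadata = self._load_metadata()
        return self._metadata

    @property
    def name(self):
        # Goes through the lazy property, so it triggers loading too.
        return self.metadata.name
```

Because name is a property rather than an attribute set in __init__, the
backends only need to differ in how they load the metadata.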

Actually, my implementation is looking less and less like yours, and
ultimately any implementation questions are irrelevant until you see
my code and spot all the errors :-) I'm trying to get it into a
postable state as fast as I can. (At last count, I've replaced about
140 lines of code with 70, it now includes PEP 302 support, and all the
(non-internal) tests still pass. So it's looking OK...)

Paul.


Re: [Python-Dev] Mercurial migration: progress report (PEP 385)

2009-07-05 Thread average
>> Is it really that confusing? I have never heard of anyone asking "what
>> is py3k?"
>
> Do you read python-list? It has been asked. Also, some people seem to
> think that py3k is different from python 3.

Personally, I vote for keeping the "3k" for 3000 (or is it 3072?).  I
believe that py3k represents an ideal that hasn't been reached, despite
being hoped for in python3.  Keeping it conveys the idea of
continual evolution *within* the language until that hypothetical
ideal is reached.  Clearly, there are times when a language reaches
only a local maximum, and must depart from itself to arrive at a more
global optimum (an annealing problem in the minimization of
frustration energy).  If py3k wasn't kept, another term would
eventually need to be invented.

marcos


Re: [Python-Dev] Mercurial migration: progress report (PEP 385)

2009-07-05 Thread Benjamin Peterson
2009/7/5 average :
>>> Is it really that confusing? I have never heard of anyone asking "what
>>> is py3k?"
>>
>> Do you read python-list? It has been asked. Also, some people seem to
>> think that py3k is different from python 3.
>
> Personally, I vote for keeping the "3k" for 3000 (or is it 3072?).  I
> believe that py3k represents an ideal that hasn't been reached, despite
> being hoped for in python3.  Keeping it conveys the idea of
> continual evolution *within* the language until that hypothetical
> ideal is reached.  Clearly, there are times when a language reaches
> only a local maximum, and must depart from itself to arrive at a more
> global optimum (an annealing problem in the minimization of
> frustration energy).  If py3k wasn't kept, another term would
> eventually need to be invented.

And that's why we're already fantasizing about Py4k. :)



-- 
Regards,
Benjamin


[Python-Dev] Mercurial migration: help needed

2009-07-05 Thread Martin v. Löwis
In this thread, I'd like to collect things that ought to be done
but where Dirkjan has indicated that he would prefer if somebody else
did it.

So far, I have only one item: build identification. If you want to work
on this, please either provide a patch (for trunk and/or py3k), or
(if you are a committer) create a subversion branch.

It seems that Barry and I agree that for the maintenance branches,
sys.subversion should be frozen, so we actually need two sets of
patches: one that removes sys.subversion entirely, and the other that
freezes the branch to the respective one, and freezes the subversion
revision to None.

Of course, it seems that the actual representation of branches hasn't
been determined yet, so the build process integration may need to be
changed if named branches aren't going to be used in the end.

Anybody working on this should have good knowledge of the Python source
code, Mercurial, and either autoconf or Visual Studio (preferably both).

Regards,
Martin


Re: [Python-Dev] Mercurial migration: help needed

2009-07-05 Thread Daniel Diniz
Hi Martin,

"Martin v. Löwis" wrote:
> In this thread, I'd like to collect things that ought to be done
> but where Dirkjan has indicated that he would prefer if somebody else
> did it.
>
> So far, I have only one item: build identification. If you want to work
> on this, please either provide a patch (for trunk and/or py3k), or
> (if you are a committer) create a subversion branch.

I do want to help, but I believe I'll only have time a week from now.
If we need/want Roundup tweaks to go with Mercurial, I can work on
that (keep in mind we have a cool GSoC student working on
Mercurial-Roundup integration, and I'm willing to work on our needs
with him).

For build identification, I'd only be able to do the C/Python side of
things and (with luck) autoconf.

Cheers,
Daniel


Re: [Python-Dev] Mercurial migration: progress report (PEP 385)

2009-07-05 Thread Nick Coghlan
Martin v. Löwis wrote:
> What will need debate and discussion in the PEP is the workflow, ie.
> the order in which changes should be applied to the branches.

Particularly since there isn't even "one true workflow" *now*. I see
three main variants go by on Python-checkins:

- svnmerge based
  - commit to 2.x
  - backport to 2.6 with svnmerge
  - forward port to 3.x with svnmerge
  - backport to 3.0 (now 3.1) with svnmerge

- manual port based
  - as above, but without using svnmerge

- Py3k focused
  - commit to 3.x
  - manual backport to 2.x
  - possible svnmerge block of the backported 2.x commit
  - possible svnmerge based or manual backport to 2.6/3.1

While it would obviously be *nice* if every committer maintained 4
checkouts and either blocked or committed each change on the appropriate
branches, I think actually *requiring* a specific workflow has the
potential to cost us commits. Even if one workflow is designated the
'preferred' approach, the others still need to be supported to handle
various possibilities for the "first" commit for a given change:

- forward port only
  - commit to 2.6
  - forward port to 2.x
  - forward port to 3.1
  - forward port to 3.x

- backport only
  - commit to 3.x
  - backport to 3.1
  - backport to 2.x
  - backport to 2.6

- mixed, starting with 2.x (aka current svnmerge workflow)
  - commit to 2.x
  - backport to 2.6
  - forward port to 3.x
  - backport to 3.1

- mixed, starting with 3.1
  - commit to 3.1
  - forward port to 3.x
  - backport to 2.6
  - forward port or backport to 2.x

Note that there are actually multiple variations even of the above
workflows, based on the *source* of the various forward ports and
backports. Also, each "forward port" or "backport" can be replaced by
blocking the merge rather than applying it.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Nick Coghlan
Tarek Ziadé wrote:
> 2009/7/5 P.J. Eby :
>>  If you want a
>> clearer name, '.pydist' or some such would at least be reasonably specific.
>>  (It'd still have a backward compatibility problem, but at least then
>> there'd be some benefit to the name change.)
> 
> I do find "DIST_NAME.metadata" well-named and specific. But I guess that's
> just bikeshedding :)

pydist has the advantage of both being more intuitive than 'egginfo'
while still making it clear that this is Python related metadata rather
than, say, something added by an OS packaging utility. So no, I don't
think it's bikeshedding:

'metadata': accurate but generic. Not clear that it relates to Python
specifically (except that it happens to be stored in a Python related
directory)

'egginfo': accurate, specific and serves as a good mnemonic. However,
use of the 'egg' jargon means that someone is unlikely to guess what it
means without being told, and it's less obvious that this is Python related

'pydist': accurate, specific and without the disadvantages of the 'egg'
jargon

PJE points out that existing tools (setuptools, pip, etc) won't be
compatible with the new format at all if it uses a new name, but I am
having trouble seeing that as a *bad* thing. By using a new name for the
directory we *guarantee* that old packaging utilities won't get confused
by the new format (they simply won't acknowledge its existence).

So +1 for pydist as the directory extension in PEP 376 (for the above
reasons).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Paul Moore
2009/7/5 P.J. Eby :
> At 03:13 PM 7/5/2009 +0200, Tarek Ziadé wrote:
>>
>> The loader would still work with "DIST_NAME.egg-info" directories for
>> compatibility with
>> existing format in the query APIs, but the Distutils install command
>> would rather create  "DIST_NAME.metadata"
>
> Note that this would then break setuptools without adding any benefit;
> ".metadata" is less precise and less unique than '.egg-info'.  If you want a
> clearer name, '.pydist' or some such would at least be reasonably specific.
>  (It'd still have a backward compatibility problem, but at least then
> there'd be some benefit to the name change.)

Personally, the filename doesn't bother me. It's the API name that I'd
like changed (particularly as the contraction "egginfo" is clumsy in
function names...)

Paul.


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Ben Finney
Nick Coghlan  writes:

> 'pydist': accurate, specific and without the disadvantages of the 'egg'
> jargon
> 
> PJE points out that existing tools (setuptools, pip, etc) won't be
> compatible with the new format at all if it uses a new name, but I am
> having trouble seeing that as a *bad* thing. By using a new name for the
> directory we *guarantee* that old packaging utilities won't get confused
> by the new format (they simply won't acknowledge its existence).
> 
> So +1 for pydist as the directory extension in PEP 376 (for the above
> reasons).

+1 for the same reasons. Thank you for expressing them.

-- 
 \ “We are not gonna be great; we are not gonna be amazing; we are |
  `\   gonna be *amazingly* amazing!” —Zaphod Beeblebrox, _The |
_o__)Hitch-Hiker's Guide To The Galaxy_, Douglas Adams |
Ben Finney



Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread P.J. Eby

At 08:43 PM 7/5/2009 +0200, Tarek Ziadé wrote:

> But if it's based on PEP 302 protocols and if the pkgutil code works
> with the sys.meta_path hook,
> setuptools could then provide its loader, based on its EggFormats and
> act as a provider without being broken.


You misunderstand me.  The whole point of putting .egg-info in 
distutils in the first place was to enable setuptools to detect the 
presence of distutils-installed packages.  That's what's broken by 
changing the name.




Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread P.J. Eby

At 07:10 AM 7/6/2009 +1000, Nick Coghlan wrote:

> By using a new name for the
> directory we *guarantee* that old packaging utilities won't get confused
> by the new format (they simply won't acknowledge its existence).


This is incorrect; they will get confused because they will think 
that the relevant package is *not* installed, and proceed to install 
a duplicate.  That's why .egg-info was added to the stdlib in the first place.
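
A small sketch of that failure mode, under stated assumptions. The
old_tool_sees function is a hypothetical model of an existing tool, not
setuptools' actual detection code, and the "<name>-<version>.egg-info"
naming is a simplification.

```python
import os
import tempfile


def old_tool_sees(site_dir, dist_name):
    # Hypothetical model of an existing tool: it only recognises entries
    # named "<dist_name>-<version>.egg-info".
    prefix = dist_name.lower() + "-"
    for entry in os.listdir(site_dir):
        low = entry.lower()
        if low.startswith(prefix) and low.endswith(".egg-info"):
            return True
    return False


with tempfile.TemporaryDirectory() as site_dir:
    # One distribution installed under the old name, one under a new name.
    os.mkdir(os.path.join(site_dir, "foo-1.0.egg-info"))
    os.mkdir(os.path.join(site_dir, "bar-2.0.pydist"))
    found_foo = old_tool_sees(site_dir, "foo")
    found_bar = old_tool_sees(site_dir, "bar")
```

Here found_foo is true and found_bar is false, so a tool written this way
would treat "bar" as missing and could go on to install it a second time.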




Re: [Python-Dev] Mercurial migration: help needed

2009-07-05 Thread Martin v. Löwis
> I do want to help, but I believe I'll only have time a week from now.
> If we need/want Roundup tweaks to go with Mercurial, I can work on
> that (keep in mind we have a cool GSoC student working on
> Mercurial-Roundup integration, and I'm willing to work on our needs
> with him).

I think that's straightforward. Once we know what URLs to link to,
we just need to fix the regexps.

> For build identification, I'd only be able to do the C/Python side of
> things and (with luck) autoconf.

Ok, we'll see a week from now whether anybody has volunteered.

Regards,
Martin


Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata

2009-07-05 Thread Tarek Ziadé
But as I said earlier, if we use a PEP 302-like loader, distutils will
be able to consume
several loaders, so setuptools will be able to provide its storage
strategy (naming and egg dir locations).

So I don't understand why you are saying that it will be incompatible
or get confused.


2009/7/6 P.J. Eby :
> At 07:10 AM 7/6/2009 +1000, Nick Coghlan wrote:
>>
>> By using a new name for the
>> directory we *guarantee* that old packaging utilities won't get confused
>> by the new format (they simply won't acknowledge its existence).
>
> This is incorrect; they will get confused because they will think that the
> relevant package is *not* installed, and proceed to install a duplicate.
>  That's why .egg-info was added to the stdlib in the first place.
>
>



-- 
Tarek Ziadé | http://ziade.org