Re: [Python-Dev] Importing .pyc in -O mode and vice versa
[Off-list]

Brett Cannon wrote:
[...]
> Hopefully my import rewrite is flexible enough that people will be able
> to plug in their own importer/loader for the filesystem so that they can
> tune how things like this are handled (e.g., caching what files are in a
> directory, skipping bytecode files, etc.).

I just wondered whether you plan to support other importers of the PEP
302 style? I have been experimenting with import from database, and
would like to see that work migrate to your rewrite if possible.

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Path object design
Michael Urman writes:
> Ah, but how do you know when that's wrong? At least under ftp:// your
> root is often a mid-level directory until you change up out of it.
> http:// will tend to treat the targets as roots, but I don't know that
> there's any requirement for a /.. to be meaningless (even if it often
> is).
ftp and http schemes both have authority ("host") components, so the
meaning of ".." path components is defined in the same way for both by
section 5 of RFC 3986.
Of course an FTP server is not bound to interpret the protocol so as
to mimic URL semantics. But that's a different question.
Re: [Python-Dev] Path object design
Steve:
> > I'm darned if I know. I simply know that it isn't right for http resources.
/F:
> the URI specification disagrees; an URI that starts with "../" is per-
> fectly legal, and the specification explicitly states how it should be
> interpreted.
I have looked at the spec, and can't figure out how its explanation
matches the observed urljoin results. Steve's excerpt trimmed out
the strangest example.
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../../")
'http://blah.com/../'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../../..") # What?!
'http://blah.com/'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../../../")
'http://blah.com/../../'
>>>
> (it's important to realize that "urijoin" produces equivalent URI:s, not
> file names)
Both, though, are "paths". The OP, Mike Orr, wrote:

  I agree that supporting non-filesystem directories (zip files,
  CVS/Subversion sandboxes, URLs) would be nice, but we already have a
  big enough project without that. What constraints should a Path
  object keep in mind in order to be forward-compatible with this?

Is the answer therefore that URL and URI behaviour should not
place constraints on a Path object because they are sufficiently
dissimilar from file-system paths? Do these other non-FS hierarchical
structures have similar differences causing a semantic mismatch?
Andrew
[EMAIL PROTECTED]
Re: [Python-Dev] Status of pairing_heap.py?
Hi Martin,

Yes, I'm familiar with the heapq module, but it doesn't do all that
I'd like. The main functionality I am looking for is the ability to
adjust the value of an item in the heap and delete items from the
heap. There are a lot of heap applications where this is useful. (I
might even say most heap applications!)

To support this, the insert method needs to return a reference to an
object which I can then pass to adjust_key() and delete() methods.
It's extremely difficult to have this functionality with array-based
heaps because the index of an item in the array changes as items are
inserted and removed.

I guess I don't need a pairing heap, but of the pointer-based heaps
I've looked at, pairing heaps seem to be the simplest while still
having good complexity guarantees.

> Anyway, the immediate author of this code is Dan Stutzbach (as
> Raymond Hettinger's checkin message says); you probably should
> contact him to find out whether the project is still alive.

Okay, I'll do that. What needs to be done to move the project along
and possibly get a pairing heap incorporated into a future version of
Python?

Best,
Paul

On 11/4/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Paul Chiusano schrieb:
> > I was looking for a good pairing_heap implementation and came across
> > one that had apparently been checked in a couple years ago (!).
>
> Have you looked at the heapq module? What application do you have
> for a pairing heap that you can't do readily with the heapq module?
>
> Anyway, the immediate author of this code is Dan Stutzbach (as
> Raymond Hettinger's checkin message says); you probably should
> contact him to find out whether the project is still alive.
>
> Regards,
> Martin
Re: [Python-Dev] Importing .pyc in -O mode and vice versa
On Sun, Nov 05, 2006, "Martin v. Löwis" wrote:
> Greg Ewing schrieb:
>> Fredrik Lundh wrote:
>>>
>>> well, from a performance perspective, it would be nice if Python looked
>>> for *fewer* things, not more things.
>>
>> Instead of searching for things by doing a stat call for each
>> possible file name, would it perhaps be faster to read the contents
>> of all the directories along sys.path into memory and then go
>> searching through that?
>
> That should never be better: the system will cache the directory
> blocks, also, and it will do a better job than Python will.

Maybe so, but I recently dealt with a painful bottleneck in Python code
caused by excessive stat() calls on a directory with thousands of files,
while the os.listdir() function was bogging things down hardly at all.
Granted, Python bytecode was almost certainly the cause of much of the
overhead, but I still suspect that a simple listing will be faster in C
code because of fewer system calls. It should be a matter of profiling
before this suggestion is rejected rather than making assertions about
what "should" be happening.
--
Aahz ([EMAIL PROTECTED]) <*> http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles: boring syntax, unsurprising semantics,
few automatic coercions, etc etc. But that's one of the things I like
about it." --Tim Peters on Python, 16 Sep 1993
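
The profiling asked for above could start from something like the
following. This is only a rough sketch, not the import machinery itself:
the helper names (find_by_stat, find_by_listing), the file counts, and
the candidate suffixes are all made up for illustration, and the timings
will vary with the filesystem and the OS cache.

```python
import os
import tempfile
import timeit


def find_by_stat(directory, names):
    """Probe each candidate name with its own stat() call."""
    found = []
    for name in names:
        try:
            os.stat(os.path.join(directory, name))
        except OSError:
            continue
        found.append(name)
    return found


def find_by_listing(directory, names):
    """Read the directory once, then answer lookups from a set."""
    present = set(os.listdir(directory))
    return [name for name in names if name in present]


with tempfile.TemporaryDirectory() as directory:
    # Hypothetical layout: 200 source files, no bytecode files.
    for i in range(200):
        open(os.path.join(directory, "mod%d.py" % i), "w").close()

    # Candidates mimic an importer probing several suffixes per module;
    # most of them do not exist.
    candidates = ["mod%d%s" % (i, ext)
                  for i in range(0, 400, 2)
                  for ext in (".py", ".pyc", ".pyo")]

    by_stat = find_by_stat(directory, candidates)
    by_listing = find_by_listing(directory, candidates)

    stat_time = timeit.timeit(
        lambda: find_by_stat(directory, candidates), number=20)
    list_time = timeit.timeit(
        lambda: find_by_listing(directory, candidates), number=20)
    print("stat per name: %.4fs   one listdir: %.4fs" % (stat_time, list_time))
```

Both functions must return the same names; only the elapsed times are
of interest, and they say nothing about cache invalidation, which a
real importer would also have to handle.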
Re: [Python-Dev] Path object design
On 11/5/06, Andrew Dalke <[EMAIL PROTECTED]> wrote:
>
> > I agree that supporting non-filesystem directories (zip files,
> > CVS/Subversion sandboxes, URLs) would be nice, but we already have a
> > big enough project without that. What constraints should a Path
> > object keep in mind in order to be forward-compatible with this?
>
> Is the answer therefore that URL and URI behaviour should not
> place constraints on a Path object because they are sufficiently
> dissimilar from file-system paths? Do these other non-FS hierarchical
> structures have similar differences causing a semantic mismatch?

This discussion has reinforced my belief that os.path.join's behavior
is correct with non-initial absolute args:
os.path.join('/usr/bin', '/usr/local/bin/python')
I've used that in applications and haven't found it a burden.
Its behavior with '..' seems justifiable too, and Talin's trick of
wrapping everything in os.path.normpath is a great one.
I do think join should take more care to avoid multiple slashes
together in the middle of a path, although this is really the
responsibility of the platform library, not a generic function/method.
Join is true to its documentation of only adding separators and never
deleting them, but that seems like a bit of sloppiness. On the
other hand, the filesystems don't care; I don't think anybody has
mentioned a case where it actually creates a path the filesystem can't
handle.
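
For reference, the join behaviours discussed above look like this. The
sketch uses posixpath (the POSIX flavour of os.path) so the results do
not depend on the platform running it; the paths are just examples.

```python
import posixpath

# A non-initial absolute argument discards everything before it.
replaced = posixpath.join("/usr/bin", "/usr/local/bin/python")

# join only adds separators; a doubled slash in an argument survives.
doubled = posixpath.join("/usr//bin", "python")

# Wrapping the result in normpath collapses both "//" and "..".
normalized = posixpath.normpath(posixpath.join("/usr//bin", "../lib"))

print(replaced)    # /usr/local/bin/python
print(doubled)     # /usr//bin/python
print(normalized)  # /usr/lib
```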
urljoin clearly has a different job. When we talked about extending
path to URLs, I was thinking more in terms of opening files, fetching
resources, deleting, renaming, etc. rather than split-modify-rejoin.
A hypothetical urlpath module would clearly have to follow the URL
rules. I don't see a contradiction in supporting both URL joining
rules and having a non-initial absolute argument, just to avoid
cross-"platform" surprises. But urlpath would also need methods to
parse the scheme and host on demand, query strings, #fragments, a
class method for building a URL from the smallest parts, etc.
As for supporting path fragments and '..' in join arguments (for
filesystem paths), it's clearly too widely used to eliminate. Users
can voluntarily refrain from passing arguments containing separators.
For cases involving a user-supplied -- possibly hostile -- path,
either a separate method (safe_join, child) could achieve this, or a
subclass implementation that allows only safe arguments.
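
A separate method for untrusted input might look roughly like the
following. This is a hypothetical sketch, not part of any proposed Path
API: the name child comes from the paragraph above, but the signature
and the exact set of rejected inputs are assumptions, and it only
knows about the POSIX separator.

```python
import posixpath


def child(base, name):
    """Join exactly one path component onto base, refusing anything
    that could climb out of base or replace it (hypothetical helper)."""
    if not name or name in (".", "..") or "/" in name or "\0" in name:
        raise ValueError("unsafe path component: %r" % (name,))
    return posixpath.join(base, name)


print(child("/srv/data", "report.txt"))  # /srv/data/report.txt

rejected = []
for hostile in ("../etc/passwd", "/etc/passwd", "..", ""):
    try:
        child("/srv/data", hostile)
    except ValueError:
        rejected.append(hostile)
```

A Windows-aware version would also have to reject backslashes and
drive letters; that is left out here.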
Regarding pathname-manipulation methods and filesystem-access methods,
I'm not sure how workable it is to have separate objects for them.
os.mkdir( Path("/usr/local/lib/python/Cheetah/Template.py").parent )
Path("/usr/local/lib/python/Cheetah/Template.py").parent.mkdir()
FileAccess(
Path("/usr/local/lib/python/Cheetah/Template.py").parent ).mkdir()
The first two are reasonable. The third... who would want to do this
for every path? How often would you reuse the FileAccess object? I
typically create Path objects from configuration values and keep them
around for the entire application; e.g., data_dir. Then I create
derived paths as necessary. I suppose if the FileAccess object has a
.path attribute, it could do double-duty so you wouldn't have to store
the path separately. Is this what the advocates of two classes have
in mind? With usage like this?
my_file = FileAccess( file_access_obj.path.joinpath("my_file") )
my_file = FileAccess( Path(file_access_obj.path, "my_file") )
Working on my Path implementation. (Yes it's necessary, Glyph, at
least to me.) It's going slow because I just got a Macintosh laptop
and am still rounding up packages to install.
--
Mike Orr <[EMAIL PROTECTED]>
Re: [Python-Dev] Status of pairing_heap.py?
"Paul Chiusano" <[EMAIL PROTECTED]> wrote:
> > It is not required. If you are careful, you can implement a pairing
> > heap with a structure combining a dictionary and list.
>
> That's interesting. Can you give an overview of how you can do that? I
> can't really picture it. You can support all the pairing heap
> operations with the same complexity guarantees? Do you mean a linked
> list here or an array?

I mean a Python list. The trick is to implement a sequence API that
keeps track of the position of any 'pair'. That is, ph[posn] will
return a 'pair' object, but when you perform ph[posn] = pair, you also
update a mapping: ph.mapping[pair.value] = posn. With a few other bits,
one can use heapq directly and get all of the features of the pairing
heap API without keeping an explicit tree with links, etc.

In terms of running time, adjust_key, delete, and extract(0) are all
O(logn), meld is O(min(n+m, mlog(n+m))), empty and peek are O(1), values
is O(n), and extract_all is O(nlogn) but uses list.sort() rather than
repeatedly pulling from the heap (heapq's documentation suggests this is
faster in terms of comparisons, but likely very much faster in terms of
actual running time).

Attached is a sample implementation using this method with a small test
example. It may or may not use less memory than the sandbox
pairing_heap.py, and using bare lists rather than pairs may result in
less memory overall (if there exists a list "free list"), but this
should give you something to start with.

 - Josiah

> Paul
>
> On 11/4/06, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> >
> > "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > > Paul Chiusano schrieb:
> > > > To support this, the insert method needs to return a reference to an
> > > > object which I can then pass to adjust_key() and delete() methods.
> > > > It's extremely difficult to have this functionality with array-based
> > > > heaps because the index of an item in the array changes as items are
> > > > inserted and removed.
> > >
> > > I see.
> >
> > It is not required. If you are careful, you can implement a pairing
> > heap with a structure combining a dictionary and list. It requires that
> > all values be unique and hashable, but it is possible (I developed one
> > for a commercial project).
> >
> > If other people find the need for it, I could rewrite it (can't release
> > the closed source). It would use far less memory than the pairing heap
> > implementation provided in the sandbox, and could be converted to C if
> > desired and/or required. On the other hand, I've found the pure Python
> > version to be fast enough for most things I've needed it for.
> >
> > - Josiah

[Attachment: pair_heap.py, binary data]
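
The dictionary-plus-list idea described above can be sketched as
follows. This is not the attached pair_heap.py (which is not reproduced
in this archive) and it does not go through heapq; it is a hand-written
binary min-heap that keeps a value -> index mapping up to date on every
move. The class name IndexedHeap and its method names are assumptions
for illustration, and values must be unique and hashable, as the
message says.

```python
class IndexedHeap:
    """Binary min-heap in a list, plus a dict tracking each value's index."""

    def __init__(self):
        self._heap = []   # list of (key, value) entries
        self._pos = {}    # value -> current index in self._heap

    def __len__(self):
        return len(self._heap)

    def _place(self, index, entry):
        # Every write to the list also updates the position mapping.
        self._heap[index] = entry
        self._pos[entry[1]] = index

    def _sift_up(self, index):
        entry = self._heap[index]
        while index > 0:
            parent = (index - 1) // 2
            if self._heap[parent][0] <= entry[0]:
                break
            self._place(index, self._heap[parent])
            index = parent
        self._place(index, entry)

    def _sift_down(self, index):
        entry = self._heap[index]
        size = len(self._heap)
        while True:
            child = 2 * index + 1
            if child >= size:
                break
            if child + 1 < size and self._heap[child + 1][0] < self._heap[child][0]:
                child += 1
            if entry[0] <= self._heap[child][0]:
                break
            self._place(index, self._heap[child])
            index = child
        self._place(index, entry)

    def insert(self, key, value):
        if value in self._pos:
            raise ValueError("values must be unique")
        self._heap.append((key, value))
        self._pos[value] = len(self._heap) - 1
        self._sift_up(len(self._heap) - 1)

    def adjust_key(self, value, new_key):
        index = self._pos[value]
        self._heap[index] = (new_key, value)
        self._sift_up(index)
        self._sift_down(self._pos[value])

    def delete(self, value):
        index = self._pos.pop(value)
        last = self._heap.pop()
        if index < len(self._heap):   # the removed entry was not the tail
            self._place(index, last)
            self._sift_up(index)
            self._sift_down(self._pos[last[1]])

    def extract_min(self):
        key, value = self._heap[0]
        self.delete(value)
        return key, value


heap = IndexedHeap()
for key, value in [(5, "e"), (3, "c"), (8, "h"), (1, "a")]:
    heap.insert(key, value)
heap.adjust_key("h", 0)   # "h" moves to the front
heap.delete("c")          # remove an arbitrary item by value
order = [heap.extract_min() for _ in range(len(heap))]
print(order)              # [(0, 'h'), (1, 'a'), (5, 'e')]
```

Here the value itself serves as the handle that insert would otherwise
have to return, which is why uniqueness and hashability are required.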
Re: [Python-Dev] Path object design
Andrew Dalke schrieb:
> I have looked at the spec, and can't figure out how its explanation
> matches the observed urljoin results. Steve's excerpt trimmed out
> the strangest example.
Unfortunately, you didn't say which of these you want explained.
As it is tedious to write down even a single one, I'll restrict myself
to the one with the What?! remark.
> >>> urlparse.urljoin("http://blah.com/a/b/c", "../../../..") # What?!
> 'http://blah.com/'
Please follow me through section 5 of
http://www.ietf.org/rfc/rfc3986.txt
5.2.1: Pre-parse the Base URI
B.scheme = "http"
B.authority = "blah.com"
B.path = "/a/b/c"
B.query = undefined
B.fragment = undefined
5.2.2: Transform References
parse("../../../..")
R.scheme = R.authority = R.query = R.fragment = undefined
R.path = "../../../.."
(strictness not relevant, R.scheme is already undefined)
R.scheme is not defined
R.authority is not defined
R.path is not ""
R.path does not start with /
T.path = merge("/a/b/c", "../../../..")
T.path = remove_dot_segments(T.path)
T.authority = "blah.com"
T.scheme = "http"
T.fragment = undefined
5.2.3 Merge paths
merge("/a/b/c", "../../../..") =
(base URI does have path)
"/a/b/../../../.."
5.2.4 Remove Dot Segments
remove_dot_segments("/a/b/../../../..")
1. I = "/a/b/../../../.."
O = ""
2. A (does not apply)
B (does not apply)
C (does not apply)
D (does not apply)
E O="/a" I="/b/../../../.."
2. E O="/a/b" I="/../../../.."
2. C O="/a" I="/../../.."
2. C O="" I="/../.."
2. C O="" I="/.."
2. C O="" I="/"
2. E O="/" I=""
3. Result: "/"
5.3 Component Recomposition
result = ""
(scheme is defined)
result = "http:"
(authority is defined)
result = "http://blah.com"
(append path)
result = "http://blah.com/"
HTH,
Martin
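
The walk-through above can be checked mechanically. The following is a
minimal sketch of the merge and remove_dot_segments routines from RFC
3986 sections 5.2.3 and 5.2.4, written directly from the RFC text; it
is not the urlparse implementation, and the base_has_authority
parameter is a simplification introduced here.

```python
def merge(base_path, ref_path, base_has_authority=True):
    """RFC 3986 section 5.2.3: merge a relative path with the base path."""
    if base_has_authority and base_path == "":
        return "/" + ref_path
    # Keep everything up to and including the last "/" of the base path.
    return base_path[:base_path.rfind("/") + 1] + ref_path


def remove_dot_segments(path):
    """RFC 3986 section 5.2.4; the rule letters match the RFC."""
    inp = path
    out = []  # output buffer: segments, each with its leading "/" if any
    while inp:
        if inp.startswith("../"):            # 2A
            inp = inp[3:]
        elif inp.startswith("./"):           # 2A
            inp = inp[2:]
        elif inp.startswith("/./"):          # 2B
            inp = inp[2:]
        elif inp == "/.":                    # 2B
            inp = "/"
        elif inp.startswith("/../"):         # 2C
            inp = inp[3:]
            if out:
                out.pop()
        elif inp == "/..":                   # 2C
            inp = "/"
            if out:
                out.pop()
        elif inp in (".", ".."):             # 2D
            inp = ""
        else:                                # 2E
            cut = inp.find("/", 1)
            if cut == -1:
                cut = len(inp)
            out.append(inp[:cut])
            inp = inp[cut:]
    return "".join(out)


merged = merge("/a/b/c", "../../../..")
print(merged)                        # /a/b/../../../..
print(remove_dot_segments(merged))   # /
```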
Re: [Python-Dev] Importing .pyc in -O mode and vice versa
On 11/5/06, Steve Holden <[EMAIL PROTECTED]> wrote:
> [Off-list]
> Brett Cannon wrote:
> [...]
> > Hopefully my import rewrite is flexible enough that people will be able
> > to plug in their own importer/loader for the filesystem so that they can
> > tune how things like this are handled (e.g., caching what files are in a
> > directory, skipping bytecode files, etc.).
>
> I just wondered whether you plan to support other importers of the PEP
> 302 style? I have been experimenting with import from database, and
> would like to see that work migrate to your rewrite if possible.

Yep. The main point of this rewrite is to refactor the built-in
importers to be PEP 302 importers so that they can easily be left out to
protect imports. Plus I have made sure that doing something like .ptl
files off the filesystem is simple (a subclass with a single method
overloaded) or introducing a DB as a back-end store (should only require
the importer/loader part; can even use an existing class to handle
whether bytecode should be recreated or not).

Since a DB back-end is a specific use-case I even have notes in the
module docstring stating how I would go about doing it.

-Brett
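
To make the importer/loader split concrete, here is a minimal sketch of
a non-filesystem importer. It uses the find_spec/exec_module protocol of
today's importlib, which descends from the PEP 302 hooks, rather than
the rewrite discussed in this thread. The class name DictImporter and
the module name db_hello are made up, and a plain dict stands in for
the database.

```python
import importlib.abc
import importlib.util
import sys


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object, serving source text from a mapping."""

    def __init__(self, sources):
        self._sources = sources   # module name -> source code

    def find_spec(self, fullname, path=None, target=None):
        # Importer half: decide whether this back end owns the module.
        if fullname not in self._sources:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None   # let the import system create a default module

    def exec_module(self, module):
        # Loader half: fetch the source and run it in the module namespace.
        source = self._sources[module.__name__]
        code = compile(source, "<store:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)


store = {"db_hello": "GREETING = 'hello from the store'\n"}
sys.meta_path.append(DictImporter(store))

import db_hello

print(db_hello.GREETING)
```

Bytecode caching, packages, and reloading are all left out; a real
database back end would need to decide how to handle each of them.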
Re: [Python-Dev] Importing .pyc in -O mode and vice versa
Aahz schrieb:
> Maybe so, but I recently dealt with a painful bottleneck in Python code
> caused by excessive stat() calls on a directory with thousands of files,
> while the os.listdir() function was bogging things down hardly at all.
> Granted, Python bytecode was almost certainly the cause of much of the
> overhead, but I still suspect that a simple listing will be faster in C
> code because of fewer system calls. It should be a matter of profiling
> before this suggestion is rejected rather than making assertions about
> what "should" be happening.

That works both ways, of course: whoever implements such a patch should
also provide profiling information. Last time I changed the importing
code to reduce the number of stat calls, I could hardly demonstrate a
speedup.

Regards,
Martin
Re: [Python-Dev] Path object design
Martin:
> Unfortunately, you didn't say which of these you want explained.
> As it is tedious to write down even a single one, I'll restrict myself
> to the one with the What?! remark.
>
> > >>> urlparse.urljoin("http://blah.com/a/b/c", "../../../..") # What?!
> > 'http://blah.com/'
The "What?!" is in context with the previous and next entries. I've
reduced it to a simpler case
>>> urlparse.urljoin("http://blah.com/", "..")
'http://blah.com/'
>>> urlparse.urljoin("http://blah.com/", "../")
'http://blah.com/../'
>>> urlparse.urljoin("http://blah.com/", "../..")
'http://blah.com/'
Does the result make sense to you? Does it make
sense that the last of these is shorter than the middle
one? It sure doesn't to me. I thought it was obvious
that there was an error; obvious enough that I didn't
bother to track down why - especially as my main point
was to argue there are different ways to deal with
hierarchical/path-like schemes, each correct for its
given domain.
> Please follow me through section 5 of
>
> http://www.ietf.org/rfc/rfc3986.txt
The core algorithm causing the "what?!" comes from
"remove_dot_segments", section 5.2.4. In parallel my
3 cases should give:
5.2.4 Remove Dot Segments

remove_dot_segments("/..")  r_d_s("/../")             r_d_s("/../..")
1.  I = "/.."               I = "/../"                I = "/../.."
    O = ""                  O = ""                    O = ""
2A. (does not apply)        2A. (does not apply)      2A. (does not apply)
2B. (does not apply)        2B. (does not apply)      2B. (does not apply)
2C. O="" I="/"              2C. O="" I="/"            2C. O="" I="/.."
2A. (does not apply)        2A. (does not apply)      .. reduces to r_d_s("/..")
2B. (does not apply)        2B. (does not apply)      3.  Result "/"
2C. (does not apply)        2C. (does not apply)
2D. (does not apply)        2D. (does not apply)
2E. O="/", I=""             2E. O="/", I=""
3.  Result: "/"             3.  Result "/"
My reading of the RFC 3986 says all three examples should
produce the same result. The fact that my "what?!" comment happens
to be correct according to that RFC is purely coincidental.
Then again, urlparse.py does *not* claim to be RFC 3986 compliant.
The module docstring is
"""Parse (absolute and relative) URLs.
See RFC 1808: "Relative Uniform Resource Locators", by R. Fielding,
UC Irvine, June 1995.
"""
I tried the same code with 4Suite, which does claim compliance, and get
>>> import Ft
>>> from Ft.Lib import Uri
>>> Uri.Absolutize("..", "http://blah.com/")
'http://blah.com/'
>>> Uri.Absolutize("../", "http://blah.com/")
'http://blah.com/'
>>> Uri.Absolutize("../..", "http://blah.com/")
'http://blah.com/'
>>>
The text of its Uri.py says
This function is similar to urlparse.urljoin() and urllib.basejoin().
Those functions, however, are (as of Python 2.3) outdated, buggy, and/or
designed to produce results acceptable for use with other core Python
libraries, rather than being earnest implementations of the relevant
specs. Their problems are most noticeable in their handling of
same-document references and 'file:' URIs, both being situations that
come up far too often to consider the functions reliable enough for
general use.
"""
# Reasons to avoid using urllib.basejoin() and urlparse.urljoin():
# - Both are partial implementations of long-obsolete specs.
# - Both accept relative URLs as the base, which no spec allows.
# - urllib.basejoin() mishandles the '' and '..' references.
# - If the base URL uses a non-hierarchical or relative path,
#   or if the URL scheme is unrecognized, the result is not
#   always as expected (partly due to issues in RFC 1808).
# - If the authority component of a 'file' URI is empty,
#   the authority component is removed altogether. If it was
#   not present, an empty authority component is in the result.
# - '.' and '..' segments are not always collapsed as well as they
#   should be (partly due to issues in RFC 1808).
# - Effective Python 2.4, urllib.basejoin() *is* urlparse.urljoin(),
#   but urlparse.urljoin() is still based on RFC 1808.
In searching the archives
http://mail.python.org/pipermail/python-dev/2005-September/056152.html
Fabien Schwob:
> I'm using the module urlparse and I think I've found a bug in the
> urlparse module. When you merge a URL and a link
> like "../../../page.html" with urljoin, the new URL created keeps some
> "../" in it. Here is an example:
>
> >>> import urlparse
> >>> begin = "http://www.example.com/folder/page.html"
> >>> end = "../../../otherpage.html"
> >>> urlparse.urljoin(begin, end)
> 'http://www.example.com/../../otherpage.html'
Guido:
> You shouldn't be giving more "../" sequences than are possible. I find
> the current behavior acceptable.
(Apparently for RFC 1808 that's a valid answer; it was an implementation
choice in how to handle that case.)
While not directly relevant, postings like John J Lee's
http://mail.python.org/pipermail/python-bugs-lis
Re: [Python-Dev] Path object design
Andrew Dalke schrieb:
> >>> urlparse.urljoin("http://blah.com/", "..")
> 'http://blah.com/'
> >>> urlparse.urljoin("http://blah.com/", "../")
> 'http://blah.com/../'
> >>> urlparse.urljoin("http://blah.com/", "../..")
> 'http://blah.com/'
>
> Does the result make sense to you? Does it make
> sense that the last of these is shorter than the middle
> one? It sure doesn't to me. I thought it was obvious
> that there was an error;
That wasn't obvious at all to me. Now looking at the
examples, I agree there is an error. The middle one
is incorrect;
urlparse.urljoin("http://blah.com/", "../")
should also give 'http://blah.com/'.
>> You shouldn't be giving more "../" sequences than are possible. I find
>> the current behavior acceptable.
>
> (Apparently for RFC 1808 that's a valid answer; it was an implementation
> choice in how to handle that case.)
There is still some text left in that respect in 5.4.2 of RFC 3986.
> While not directly relevant, postings like John J Lee's
> http://mail.python.org/pipermail/python-bugs-list/2006-February/031875.html
>> The urlparse.urlparse() code should not be changed, for
>> backwards compatibility reasons.
>
> strongly suggest a desire to not change that code.
This is John J Lee's opinion, of course. I don't see a reason not to fix
such bugs, or to update the implementation to the current RFCs.
> As this is not a bug, I have added the feature request 1591035 to SF
> titled "update urlparse to RFC 3986". Nothing else appeared to exist
> on that specific topic.
Thanks. It always helps to be more specific; being less specific often
hurts. I find there is a difference between "urllib behaves
non-intuitively" and "urllib gives result A for parameters B and C,
but should give result D instead". Can you please add specific examples
to your report that demonstrate the difference between implemented
and expected behavior?
Regards,
Martin
Re: [Python-Dev] [Python-3000] Mini Path object
Mike Orr wrote:
> .abspath()
> .normpath()
> .realpath()
> .splitpath()
> .relpath()
> .relpathto()

Seeing as the whole class is about paths, having "path" in the method
names seems redundant. I'd prefer to see terser method names without
any noise characters in them.

--
Greg
Re: [Python-Dev] Importing .pyc in -O mode and vice versa
Martin v. Löwis wrote:
> That should never be better: the system will cache the directory
> blocks, also, and it will do a better job than Python will.

If that's really the case, then why do discussions of how to improve
Python startup speeds seem to focus on the number of stat calls made?

Also, caching isn't the only thing to consider. Last time I looked at
the implementation of unix file systems, they mostly seemed to do
directory lookups by linear search. Unless that's changed a lot, I have
a hard time seeing how that's going to beat Python's highly-tuned
dictionaries.

--
Greg
Re: [Python-Dev] Path object design
Me [Andrew]:
> > As this is not a bug, I have added the feature request 1591035 to SF
> > titled "update urlparse to RFC 3986". Nothing else appeared to exist
> > on that specific topic.

Martin:
> Thanks. It always helps to be more specific; being less specific often
> hurts.

So does being more specific. I wasn't trying to report a bug in
urlparse. I figured everyone knew the problems existed. The code
comments say so and various back discussions on this list say so.

All I wanted to do was point out that two seemingly similar problems -
path traversal of hierarchical structures - had two different expected
behaviors. Now I've spent entirely too much time on specifics I didn't
care about and didn't think were important.

I've also been known to do the full report and have people ignore what
I wrote because it was too long.

> I find there is a difference between "urllib behaves
> non-intuitively" and "urllib gives result A for parameters B and C,
> but should give result D instead". Can you please add specific examples
> to your report that demonstrate the difference between implemented
> and expected behavior?

No.

I consider the "../" cases to be unimportant edge cases and I would
rather people fixed the other problems highlighted in the text I copied
from 4Suite's Uri.py -- like improperly allowing a relative URL as the
base url, which I incorrectly assumed was legit - and that others have
reported on python-dev, easily found with Google.

If I only add test cases for "../" then I believe that that's all that
will be fixed.

Given the back history of this problem and lack of followup I also
believe it won't be fixed unless someone develops a brand new module,
from scratch, which will be added to some future Python version.
There's probably a compliance suite out there to use for this sort of
task. I hadn't bothered to look as I am no more proficient than others
here at Google.

Finally, I see that my report is a dup. SF search is poor. As Nick
Coghlan reported, Paul Jimenez has a replacement for urlparse.
Summarized in
http://www.python.org/dev/summary/2006-04-01_2006-04-15/
It was submitted in spring as a patch - SF# 1462525 at
http://sourceforge.net/tracker/index.php?func=detail&aid=1462525&group_id=5470&atid=305470
which I didn't find in my earlier searching.

Andrew
[EMAIL PROTECTED]
Re: [Python-Dev] idea for data-type (data-format) PEP
Travis Oliphant wrote:
> In NumPy, the data-type objects have function pointers to accomplish all
> the things NumPy does quickly.

If the datatype object is to be extracted and made a stand-alone
feature, that might need to be refactored.

Perhaps there could be a facility for traversing a datatype with a
user-supplied dispatch table?

--
Greg
Re: [Python-Dev] Feature Request: Py_NewInterpreter to create separate GIL (branch)
On Nov 4, 2006, at 3:49 AM, Martin v. Löwis wrote:
> Notice that at least the following objects are shared between
> interpreters, as they are singletons:
> - None, True, False, (), "", u""
> - strings of length 1, Unicode strings of length 1 with ord < 256
> - integers between -5 and 256
> How do you deal with the reference counters of these objects?
>
> Also, type objects (in particular exception types) are shared between
> interpreters. These are mutable objects, so you have actually
> dictionaries shared between interpreters. How would you deal with
> these?

All these should be dealt with by making them per-interpreter
singletons, not per address space. That should be simple enough,
unfortunately the margins of this email are too small to describe
how. ;) Also it'd be backwards incompatible with current extension
modules.

James
Re: [Python-Dev] Feature Request: Py_NewInterpreter to create separate GIL (branch)
On 11/5/06, James Y Knight <[EMAIL PROTECTED]> wrote:
>
> On Nov 4, 2006, at 3:49 AM, Martin v. Löwis wrote:
>
> > Notice that at least the following objects are shared between
> > interpreters, as they are singletons:
> > - None, True, False, (), "", u""
> > - strings of length 1, Unicode strings of length 1 with ord < 256
> > - integers between -5 and 256
> > How do you deal with the reference counters of these objects?
> >
> > Also, type objects (in particular exception types) are shared between
> > interpreters. These are mutable objects, so you have actually
> > dictionaries shared between interpreters. How would you deal with
> > these?
>
> All these should be dealt with by making them per-interpreter
> singletons, not per address space. That should be simple enough,
> unfortunately the margins of this email are too small to describe
> how. ;) Also it'd be backwards incompatible with current extension
> modules.

I don't know how you define simple. In order to be able to have
separate GILs you have to remove *all* sharing of objects between
interpreters. And all other data structures, too. It would probably
kill performance too, because currently obmalloc relies on the GIL.

So I don't see much point in continuing this thread.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] Feature Request: Py_NewInterpreter to create separate GIL (branch)
Guido van Rossum wrote:
> I don't know how you define simple. In order to be able to have
> separate GILs you have to remove *all* sharing of objects between
> interpreters. And all other data structures, too. It would probably
> kill performance too, because currently obmalloc relies on the GIL.
Nitpick: You have to remove all sharing of *mutable* objects. One day,
when we get "pure" GC with no refcounting, that will be a meaningful
distinction. :)
-- Talin
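The nitpick turns on the fact that, under reference counting, even an immutable object is written to whenever a reference to it is taken. The following is a minimal sketch of that effect, assuming CPython and its `sys.getrefcount`; the function name `refcount_delta` is hypothetical.

```python
import sys

def refcount_delta():
    """Return how much one extra reference changes an immutable object's refcount."""
    frozen = tuple([object()])   # an immutable tuple built at run time
    holders = []
    before = sys.getrefcount(frozen)
    holders.append(frozen)       # the tuple's contents are not mutated...
    after = sys.getrefcount(frozen)
    return after - before        # ...yet its refcount field was written

print(refcount_delta())
```

The absolute numbers reported by `sys.getrefcount` vary between CPython versions, so only the difference is used here. As long as that counter lives inside the object, sharing an "immutable" object between interpreters still means sharing writable memory.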
Re: [Python-Dev] Importing .pyc in -O mode and vice versa
Greg Ewing wrote:
>> That should never be better: the system will cache the directory
>> blocks, also, and it will do a better job than Python will.
>
> If that's really the case, then why do discussions
> of how to improve Python startup speeds seem to focus
> on the number of stat calls made?
A stat call will not only look at the directory entry, but also
look at the inode. This will require another disk access, as the
inode is at a different location on the disk.
> Also, caching isn't the only thing to consider.
> Last time I looked at the implementation of unix
> file systems, they mostly seemed to do directory
> lookups by linear search. Unless that's changed
> a lot, I have a hard time seeing how that's
> going to beat Python's highly-tuned dictionaries.
It depends on the file system you are using. An NTFS directory
lookup is a B-Tree search; NT has not been doing linear search
since its introduction 15 years ago. Linux only recently started
doing tree-based directories with the introduction of ext4.
However, Linux's in-memory directory cache (the dcache) doesn't
need to scan over the directory block structure; not sure whether
it still uses linear search.
For a small directory, the difference is likely negligible. For
a large directory, the cost of reading in the entire directory
might be higher than the savings gained from not having to
search it. Also, if we do our own directory caching, the question
is when to invalidate the cache.
Regards,
Martin
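To make the trade-off concrete, here is a minimal sketch of the kind of directory cache under discussion, with one possible answer to the invalidation question: revalidate against the directory's modification time. The class `DirCache` and its methods are hypothetical names for this illustration and are not part of any import implementation.

```python
import os

class DirCache:
    """Cache directory listings, revalidated by the directory's mtime."""

    def __init__(self):
        self._entries = {}   # path -> (mtime_ns, frozenset of names)

    def contains(self, directory, name):
        mtime = os.stat(directory).st_mtime_ns   # one stat per lookup
        cached = self._entries.get(directory)
        if cached is None or cached[0] != mtime:
            # Stale or missing: read the whole directory once.
            cached = (mtime, frozenset(os.listdir(directory)))
            self._entries[directory] = cached
        return name in cached[1]

    def invalidate(self, directory=None):
        """Drop one cached listing, or all of them if no directory is given."""
        if directory is None:
            self._entries.clear()
        else:
            self._entries.pop(directory, None)
```

This still costs one stat of the directory per lookup, but replaces a separate stat for each candidate file name with a set membership test. The mtime check is only as good as the file system's timestamp granularity, so a file created immediately after the listing was read can be missed. That is one reason an explicit `invalidate` is included, and it shows why the invalidation question has no free answer.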
Re: [Python-Dev] Path object design
Andrew Dalke wrote:
>> I find there is a difference between "urllib behaves
>> non-intuitively" and "urllib gives result A for parameters B and C,
>> but should give result D instead". Can you please add specific examples
>> to your report that demonstrate the difference between implemented
>> and expected behavior?
>
> No.
>
> I consider the "../" cases to be unimportant edge cases and
> I would rather people fixed the other problems highlighted in the
> text I copied from 4Suite's Uri.py -- like improperly allowing a
> relative URL as the base url, which I incorrectly assumed was
> legit - and that others have reported on python-dev, easily found
> with Google.
It still should be possible to come up with examples for these as
well, no? For example, if you pass a relative URI as the base URI,
what would you like to see happen?
> If I only add test cases for "../" then I believe that that's all that
> will be fixed.
That's true. Actually, it's probably not true; it will only get fixed
if some volunteer contributes a fix.
> Finally, I see that my report is a dup. SF search is poor. As
> Nick Coghlan reported, Paul Jimenez has a replacement for urlparse.
> Summarized in
> http://www.python.org/dev/summary/2006-04-01_2006-04-15/
> It was submitted in spring as a patch - SF# 1462525 at
>
> http://sourceforge.net/tracker/index.php?func=detail&aid=1462525&group_id=5470&atid=305470
> which I didn't find in my earlier searching.
So do you think this patch meets your requirements?
This topic (URL parsing) is not only inherently difficult to
implement, it is just as tedious to review. Without anybody
reviewing the contributed code, it's certain that it will never be
incorporated.
Regards,
Martin
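A report in the "result A for parameters B and C, but should give D" form can be written as a small executable case. The sketch below assumes Python 3, where `urljoin` lives in `urllib.parse` (the successor of the `urlparse` module discussed in this thread). The helper `check` and the example.com URLs are made up for illustration.

```python
from urllib.parse import urljoin

def check(base, ref, expected):
    """Return (actual, matches) for resolving ref against base."""
    actual = urljoin(base, ref)
    return actual, actual == expected

# One "../" case stated as parameters plus the expected result.
actual, ok = check("http://example.com/a/b/c", "../d", "http://example.com/a/d")
print(actual, ok)

# A relative base URI: print what the library does, so a report can state
# what it should do instead. No expected value is asserted here.
print(urljoin("a/b/c", "../d"))
```

The second call is left without an expected value on purpose: what should happen for a relative base is exactly the question that a concrete report would need to answer.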
