Let me begin by apologizing to Christian as I was too snippy in my reply, and sounded even snippier than I meant to.
Christian Stapfer wrote:
> "Scott David Daniels" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>> a "better" set implementation will win if
>> it can show better performance without
>> related down-sides.
>
> Is there ever such a thing? I always
> thought that, most of the time, there
> is no such thing as a free lunch in

If you look at the history of Python's sort, it has steadily gotten
better. The list implementation has been tweaked to produce better
performance when appending and popping. There are a number of such
cases. In fact, as Python rolls along, code keeps getting improved.
Usually the requirement is that the change not degrade current
benchmarks and that it provide a substantial improvement in at least
some practical cases.

>> As to the "either now, or eventually;" if you _must_ have performance
>> now, not in some abstract future, then it behooves you to _test_,
>> _test_, _test_!
>
> Well, I might want to test: but *first* I want
> to design, preferably in "armchair style" ... [using]
> rough complexity measures to ....

>>>>> If the documentation stated the order-of-magnitude
>>>>> behavior of those basic operations up front, then
>>>>> I (and *anyone* else who ever wanted to use those
>>>>> operations on large lists / large sets) could do
>>>>> a quick order-of-magnitude estimation of how
>>>>> a certain program design will behave, performance-wise.

I think this is where I started over-reacting. There are a number of
requests here over time by people who state that things would be so
much better for everyone if only someone (never themselves) did some
more work that they might not otherwise want to do. The people who
implement the code often do so on their own time. I find the Python
docs surprisingly good even by the standard of commercial
documentation. For work that is done gratis, it is phenomenal. I
hesitate to ask any more of it (although I have pointed out bugs in
docs as I've found them).

>> And, if the proper documentation is in place, and it
>> says "dictionary lookup is O(N)" (and you avoid using
>> it for exactly that reason), how surprised will you be
>> to discover that the O(N) is only reached if the hash
>> values of the keys are all equal?
>
> It's not that difficult to distinguish
> *average* case and *worst* case scenarios.
> It might even be a good idea to state,
> if that can easily be done, what the
> *best* case happens to be...
>
>> Oh, maybe you expect "O(N)" to really mean "\Theta(N)".
>> Then, if you are a dweeb like me, you will respond that
>> "This is not possible, a dictionary of size N must take at
>> least 'O(lg N)' to read the key, never mind processing it."
>> But, it turns out, that given a bound on the size of a
>> process, processing an address is "O(1)", not "O(lg N)".
>> Is that too practical for you, or not enough?
>
> I do not expect a developer to expend *inordinate*
> amounts of work to figure out the computational
> complexity of what he has implemented. But, as I
> wrote, he *must* have thought about the matter, and
> is thus, by definition, in a rather good position
> to provide what he can frequently simply pull from
> memory when it comes to documenting the interface
> of his module.

I talked about Big-O (worst case) or Big-Theta (average case) just to
point out that no simple characterization like "O(N)" tells enough of
the story to be practically useful. Once you decide that isn't good
enough, the burden of creating the documentation gets substantial,
especially given that you've already spent the effort to write the
code and tests for it.
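To make the dictionary example above concrete, here is a minimal
sketch of the kind of experiment I mean (the BadHash class and the
sizes are my own illustration, not anything from the docs): give
every key the same hash value, and lookups degrade from the usual
O(1) average toward the O(N) worst case.

    import timeit

    class BadHash:
        """Key type whose instances all collide in the hash table."""
        def __init__(self, value):
            self.value = value
        def __hash__(self):
            return 42                    # every key lands in one bucket
        def __eq__(self, other):
            return self.value == other.value

    def probe_time(key_type, n):
        # Build a dict of n keys, then time lookups of the last key.
        d = {key_type(i): i for i in range(n)}
        probe = key_type(n - 1)
        return timeit.timeit(lambda: d[probe], number=1000)

    for n in (100, 200, 400):
        print(n, probe_time(int, n), probe_time(BadHash, n))

The int column should stay essentially flat as n doubles, while the
BadHash column should grow roughly in proportion to n.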
In fact I'd hesitate to raise the documentation bar any higher -- I'd
hate to think someone thought, "Oh, forget it, I'll just use this
code myself."

>>>>> *Experimenting* is not necessarily as easy to
>>>>> do as you seem to believe.

No, "experimenting" is not always easy (though often it is easy
enough). However, "experimenting" puts the cost on the person who
derives the benefit, and is thus likely not to be done in a slipshod
way.

> I am not at all a lawyer type, as you seem to
> imagine. I just want to suggest that some
> (to the implementer - but not the average user)
> *easily* available information about computational
> complexity (especially for the most basic data types)
> would be a good thing to have. Nothing more.

My point is that a simple performance characterization is not good
enough to be useful, and a fully accurate characterization is an
onerous documentation burden. Something in between will be fraught
with complaints about "surprising" worst cases. Whether or not you
would approach middle-ground documentation in the spirit of "this is
enough to go on," rest assured that a number of people will complain
in a "language-lawyer-like" way about any perceived imperfections.

>> Would you mind if the quality is proportional to
>> the price you paid?

Here I was too snippy. Sorry.

>>>> You are, of course, either assuming that there's a
>>>> single implementation of Python,
>>>> or that all implementations have the same behaviour.

(Each line denied with "Of course not.") But realistically, most
users will learn the performance characteristics once, and continue
to use the trade-offs that were characteristic of that version of
Python going forward. Now maybe you behave differently, but most
programmers I know (and I explicitly include myself) do have that
tendency. Lots of Python code moves around operating systems and
implementations (and the number of implementations is growing).

>>> But it is reasonable, anyway, to ask for information
>>> about a specific implementation (that one is *forced*
>>> to use).
>> You are not _forced_ to use any implementation of Python.
> What I wanted to say is ... this: *whenever* a Python
> program is run, it will run on some specific
> implementation of Python. That much should be obvious.

If I were still in a snippy mood, I'd point out that you might look
at your original statement and see how it might be read by someone
who read what you actually wrote, rather than what you wanted to
write.

>> You are free to implement your own Python system.
> Don't be ridiculous.

If PyPy gets going, this may not be as unlikely as you think. You
will be able to (relatively easily) replace implementations of parts
of Python and re-generate a system.

> It is you (and only you) here, who is arguing that if
> perfection is not attainable (with reasonable effort),
> nothing should be done at all.

I am trying to say that the request you make is substantially larger
than you understand (or than it would appear at first blush).
Further, it is a request that asks for work from others for your (and
others') benefit, not work that you offer to pitch in and help on.

> I am as willing to live with imperfect module interface
> specification as I am willing to live with imperfect tools
> more generally. But that will not stop me from making
> *suggestions* for improvement (not arrogant demands, mind you).

Try writing something that would fit your standards for all of the
basic types, punting where you don't know details. If it is enough to
be useful, others will chip in and help fix it where it is broken.
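By "experimenting" I mean something as cheap as this sketch (my own
example, with the sizes picked out of a hat): a few lines of timeit
usually answer the only question that matters in practice, namely "is
it fast enough for my data?"

    import timeit

    setup = "data = list(range(100000)); s = set(data)"

    # Membership testing: a list scans linearly, a set hashes directly.
    list_time = timeit.timeit("99999 in data", setup=setup, number=1000)
    set_time = timeit.timeit("99999 in s", setup=setup, number=1000)

    print("list membership:", list_time)
    print("set membership: ", set_time)

If the slower of the two is still fast enough at your sizes, the
asymptotic question never even comes up.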
> I have seen *many* books on algorithms that specify
> computational complexity measures for the algorithms
> described. Any really good library of algorithms will
> at least try to reach that standard, too.

How many "really good libraries of algorithms" do you know of? Could
you name a couple that have these characterizations? Practical code
is rife with issues of "finite address space," low-order effects
swamping the higher-order terms for most practical sizes, and such.
It is this collection of nasty little details that makes
characterizing libraries frustratingly hard.

>>> I consider it the job of the implementer to ...
>> Am I to take it you provide this to all of your customers?
> ... I did not want to pose as the customer who expects nothing
> but perfection from *others* (but not himself, of course).
> I am not at all like that: you are attacking
> a figment of your imagination.

Actually, I was not attacking at all. I was trying to suggest that it
is a harder task than you imagine. I am assuming you won't take me up
on the suggestion of starting a flawed document, but I'd be happy to
be proved wrong. If not, try writing a characterization of the
performance of a dozen or so of your own programs. Does the result
seem useful in proportion to the work it took you to write it?

>>> How reasonable is it to ask me, or anyone else
>>> for that matter, to extract, experiment-wise
>>> (if it can be done at all with reasonable effort)
>>> information that properly belongs to the implementer
>>> and really should have been exposed in the
>>> documentation in the first place?
>> Not at all reasonable. How reasonable is it to ask
>> me to provide you support information for free?

OK, here I was _very_ snippy. Sorry again. I will assert that often
the experiment results in a single bit of information: "it is fast
enough." Experimenting will get you reliable answers like that, and
only when run-time is an issue will you need to go into it much
further.

--Scott David Daniels
[EMAIL PROTECTED]