Re: "no variable or argument declarations are necessary."

2005-10-07 Thread Alex Martelli
Antoon Pardon <[EMAIL PROTECTED]> wrote:
   ...
> >>   egold = 0:
> >>   while egold < 10:
> >>     if test():
> >>       ego1d = egold + 1
> >> 
> >
> > Oh come on. That is a completely contrived example,
> 
> No it is not. You may not have had any use for this
> kind of code, but unfamiliarity with certain types
> of problems doesn't make something contrived.

It's so contrived it will raise a SyntaxError due to the spurious extra
colon on the first line;-).

Or, consider, once the stray extra colon is fixed:

Helen:/tmp alex$ cat ap.py
def ap():
   egold = 0
   while egold < 10:
     if test():
       ego1d = egold + 1

Helen:/tmp alex$ pychecker ap.py
Processing ap...

Warnings...

ap.py:4: No global (test) found
ap.py:5: Local variable (ego1d) not used
Helen:/tmp alex$ 

If you're so typo-prone and averse to unittests that you consider this
kind of issue to be a serious problem, just use pychecker and get
informed about any such typo, just as above.

Incessant whining about the non-existent advantages of declarations,
rather than the simple use of tools that can diagnose such spelling
mistakes without any need for declarations, would qualify you as a troll
even if you didn't have a long history of trolling this group...

> Names do get misspelled and sometimes that misspelling is hard to spot.

It's totally trivial, of course, as shown above, and there is no need to
pervert and distort the language for the purpose, as you, troll, have
kept whining about for years.  I'm partial to pychecker -- that's what
we use at Google, and we also, incidentally, recently had the good
fortune to hire Neal Norwitz, pychecker's author; but there are several
other free tools that perform similar tasks, albeit with very different
philosophy, such as Logilab's pylint...:

Helen:/tmp alex$ pylint ap.py
No config file found, using default configuration
* Module ap
W:  2: Bad indentation. Found 3 spaces, expected 4
W:  3: Bad indentation. Found 3 spaces, expected 4
W:  4: Bad indentation. Found 5 spaces, expected 8
W:  5: Bad indentation. Found 7 spaces, expected 12
C:  0: Too short name "ap"
W:  0: Missing docstring
W:  0: Missing required attribute "__revision__"
C:  1:ap: Too short name "ap"
W:  1:ap: Missing docstring
E:  4:ap: Undefined variable 'test'
W:  5:ap: Unused variable 'ego1d'

  [rest of long critique of ap.py snipped]

Again, unused variables (typos...) get easily diagnosed without any need
for declarations.  (Similar tools, of course, apply to languages
requiring declarations, to diagnose a variable that's declared but
unused, which is a very bad code smell typical of such languages).  Of
course, pylint is about enforcing all sort of code rules, such as, by
default, indentation by multiples of 4 spaces, name length, docstrings,
and so on; while pychecker is much simpler and more narrowly aimed at
diagnosing likely mistakes and serious code smells.

But, with either tool or any of many others, there is no need at all for
declarations in order to catch typos (of course, unittests are still a
VERY good idea -- catching all typos and even coding rules violations is
NO guarantee that your code is any good, testing is A MUST).


> > It would give the 
> > programmer a false sense of security since they 'know' all their 
> > misspellings are caught by the compiler. It would not be a substitute for
> > run-time testing.
> 
> I don't think anyone with a little bit of experience will be so naive.

Heh, right.  After all, _I_, for example, cannot have even "a little bit
of experience" -- after all, I've been programming for just 30 years
(starting with my freshman year in university), and anyway all I have to
show for that is a couple of best-selling books, and a stellar career
culminating (so far) with my present job as Uber Technical Lead for
Google, Inc, right here in Silicon Valley... no doubt Google's reaching
over the Atlantic to come hire me from Italy, and the US government's
decision to grant me a visa under the O-1 category (for "Aliens with
Outstanding Skills"), were mere oversights on their part that,
obviously, I cannot have even "a little bit of experience", given that I
(like great authors such as Bruce Eckel and Robert Martin) entirely
agree with the opinion you deem "so naive"... that any automatic
catching of misspellings can never be a substitute for unit-testing!


Ah well -- my good old iBook's settings had killfiles for newsreaders,
with not many entries, but yours, Antoon, quite prominent and permanent;
unfortunately, that beautiful little iBook was stolen
(http://www.papd.org/press_releases/8_17_05_fix_macs_211.html), so I got
myself a brand new one (I would deem it incorrect to use for personal
purposes the nice 15" Powerbook that Google assigned me), and it takes
some time to reconstruct all the settings.  But, I gotta get started
sometime -- so, welcome, o troll, as the very first entry in my
brand-new killfile.

In other words: *PLONK*, troll!-)


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: "no variable or argument declarations are necessary."

2005-10-08 Thread Alex Martelli
Paul Rubin <http://[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] (Alex Martelli) writes:
> > ap.py:4: No global (test) found
> > ap.py:5: Local variable (ego1d) not used
> > Helen:/tmp alex$ 
> > 
> > If you're so typo-prone and averse to unittests that you consider this
> > kind of issue to be a serious problem, just use pychecker and get
> > informed about any such typo, just as above.
> 
> That's very helpful, but why isn't it built into Python?

Because some users will prefer to use a different approach to checking,
for example, such as pylint (much more thorough in enforcing coding
rules and checking for all sort of things) or nothing (much faster than
pychecker, which in turn is faster than pylint).  Just as for other
programming tools, such as, say, an editor, I think it's wise to avoid
excessive and premature standardization on one specific tool to the
detriment of others.  (IDLE is "bundled with" Python, but not _built
into_ it -- indeed some would claim that the bundling was too much).

Not all tools need evolve at the same speed as the core language, which
currently follows a wise policy of "major" releases (2.3, 2.4, etc)
about 18 to 24 months apart, and NO feature changes for "point" releases
(2.4.2 has exactly the same features as 2.4.1 -- it just fixes more
bugs).  Any tool which gets built into python (or, less strictly but
still problematically, is separate but bundled with it) must get on
exactly the same schedule and policy as Python itself, and that is
definitely not something that's necessarily appropriate.

If you're worried about the end-users which can't be bothered to
download tools (and, for that matter, libraries) separately from the
main language, the solution is "sumo releases" -- Enthought Python (from
Enthought) being an extreme example, but Active Python (from
ActiveState) comes with quite a few bundled add-ons, too.  I believe
that Linux has proven the validity of this general model: having the
"core" (mostly the kernel, in Linux's case; the language and standard
library, in Python's) evolve and get released at its own speed, and
having _distributions_ that bundle the core with different sets of tools
and add-ons get released on THEIR preferred schedules, independently.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Will python never intend to support private, protected and public?

2005-10-08 Thread Alex Martelli
Simon Brunning <[EMAIL PROTECTED]> wrote:

> On 9/28/05, Steven D'Aprano <[EMAIL PROTECTED]> wrote:
> > > If *real* private and protected are *enforced*, Python will be the
> > > poorer for it. See
> > > .
> >
> > That's a wonderful, if long, essay.
> 
> That's the Martellibot for you. Never use a word where a paragraph
> with explanatory footnotes will do.
> 
> Sigh. I miss him on c.l.py.

Why, thanks -- it's sweet to be missed!-)  I also had fun rereading that
little piece of mine...

Unfortunately my current presence is probably a somewhat short-lived
phenomenon (I just need a breather right now, but I'll have to plunge
back into Google work AND writing the 2nd edition of the Nutshell
soon...), but, for a short while, I'm back!


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Will python never intend to support private, protected and public?

2005-10-08 Thread Alex Martelli
Kay Schluehr <[EMAIL PROTECTED]> wrote:

> Honestly I like to use private/protect/public modifiers in C++ for the
> sake of code documentation. I like to know which attributes are
> dedicated to be known by other objects, which ones are for internal use
> only and which ones should be at least publicly accessible within a
> class hierarchy. This helps structuring code in the large and spotting
> attention. Code becomes easier accessible. But if we have Sunday or I

This advisory role is played in Python by naming conventions.  Start
attribute names with a single underscore to suggest "private but may be
easily overridden by subclasses" (roughly the equivalent of "protected";
Stroustrup is on record, in his book "The Design and Evolution of the
C++ Programming Language", as regretting the exact details whereby
"protected" became entrenched in C++, and wishing they could be
changed... I believe Python's single-underscore is somewhat better), two
underscores if you ever want to make names hard to override (I used to
like that, but as time goes by have come to like it less and less; right
now, unless I have to respect existing coding standards, I entirely
avoid the double-underscore usage, while single-underscores are OKish).
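A minimal sketch of those conventions (class and attribute names invented for illustration, shown in today's Python, where the double leading underscore triggers the name mangling that makes overriding hard):

```python
class Account:
    def __init__(self, balance):
        self.balance = balance      # public: part of the advertised interface
        self._ledger = []           # single underscore: "private", but subclasses may use it
        self.__audit = []           # double underscore: mangled to _Account__audit

class Savings(Account):
    def log(self, entry):
        # single-underscore names are trivially reachable from subclasses:
        self._ledger.append(entry)
        # self.__audit here would look up _Savings__audit, NOT _Account__audit

acct = Savings(100)
acct.log("opened")
print(acct._ledger)            # ['opened'] -- accessible; the underscore is advisory
print(acct._Account__audit)    # [] -- even mangled names remain reachable if you insist
```

Note that neither spelling *enforces* anything: both attributes stay reachable, which is exactly the advisory role described above.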


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extending Python

2005-10-08 Thread Alex Martelli
Micah Elliott <[EMAIL PROTECTED]> wrote:

> On Oct 05, Tuvas wrote:
> > I am looking for a good tutorial on how to extend python with C
> > code. I have an application built in C that I need to be able to use
> > in Python. I have searched through various sources, starting of
> > course with the Python site itself, and others, but I felt a bit
> > lacking from the Python site, it seems it was only made for those
> > who installed the source distribution, as for the other people...
> > Anyways, thanks for the help!
> 
> I have no experience with this, but I see that Alex Martelli's "Python
> In A Nutshell" has quite a few pages on the subject.

I also covered the same subject in a more tutorial (but less deep and
extended) way in articles for "Py" magazine, but I don't know if those
old issues of it are still in print.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging sorted lists/iterators/generators into one stream of values...

2005-10-08 Thread Alex Martelli
George Sakkis <[EMAIL PROTECTED]> wrote:

> "Lasse Vågsæther Karlsen" <[EMAIL PROTECTED]> wrote:
> 
> > Thanks, that looks like Mike's solution except that it uses the
> > built-in heapq module.
> 
> This make a big difference for the algorithmic complexity; replacing an
> item in a heap is much more efficient than sorting the whole list.

In the most general case, yes.  However, Python's sort ("timsort") is
preternaturally fast when sorting sequences that are mostly sorted
except for maybe one element being in the wrong place... try it (and
read the Timbot's article included in Python's sources, and the sources
themselves)...  I suspect that heapq will still be faster, but by
nowhere as much as one might think.


> Yes, it's a little inconvenient that the builtin heap doesn't take a
> comparison operation but you can easily roll your own heap by transforming
> each item to a (key,item) tuple. Now that I'm thinking about it, it might
> be a good addition to the cookbook.

I believe Python 2.5 adds a key= argument to heapq's functions...
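George's (key, item) decoration trick can be sketched as follows (a toy example in modern Python, with invented data; a tie-breaking index is included so that equal keys never force a comparison of the items themselves):

```python
import heapq

# Decorate each item with its sort key plus a tie-breaking index.
people = ["Guido", "Tim", "Raymond", "Alex"]
heap = [(len(name), i, name) for i, name in enumerate(people)]
heapq.heapify(heap)

# Popping now yields the items in key order (shortest name first).
by_length = [heapq.heappop(heap)[2] for _ in range(len(heap))]
print(by_length)   # ['Tim', 'Alex', 'Guido', 'Raymond']
```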


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Google Not Universal Panacea [was: Re: Where to find python c-sources]

2005-10-08 Thread Alex Martelli
Steve Holden <[EMAIL PROTECTED]> wrote:
   ...
>  >> Are people really too lazy to do elementary research on Google?
> 
> goes a bit too far in imputing motives to the enquirer and overlooking
> the fact that there are some very good reasons for *not* using Google.

It's a good thing you don't actually name any of those reasons, tho:-).

> we're talking male hormones here, since by and large women don't appear
> to have embraced the Python community (except perhaps individually, but
> that's no business of mine).

Anna seems to be doing fine, though.  She's currently taking a C class
at college and claims "the more I know C, the more I love Python" - and
I gather she's evangelizing (and the class is about 50/50 genderwise;-).

> Also, many regular readers didn't grow up speaking English (I was 

Yep -- I'm one example of that.  Didn't stop Google from hiring me,
though;-).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Feature Proposal: Sequence .join method

2005-10-08 Thread Alex Martelli
Michele Simionato <[EMAIL PROTECTED]> wrote:
   ...
> I have noticed a while ago that inside generators StopIteration is
> automatically trapped, i.e.
> 
> def g():
>     yield 1
>     raise StopIteration
>     yield "Never reached"
> 
> only yields 1. Not sure if this is documented behavior, however, or if
> it is an implementation accident. Anybody who knows?

It yields 1 and on the next call to .next() raises StopIteration, which
is the way an iterator says it's done -- so, it yields 1 and then it's
done.  I'm not sure what you mean by "inside generators ...
automatically trapped", or what's undocumented about that.
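A note for readers coming to this thread today: the behavior described above was accurate for the Python of 2005, but PEP 479 (the default since Python 3.7) changed it -- a StopIteration raised inside a generator body is now converted into a RuntimeError rather than silently ending iteration. A quick check:

```python
def g():
    yield 1
    raise StopIteration   # pre-PEP-479: silently ended the generator
    yield "Never reached"

it = g()
assert next(it) == 1
try:
    next(it)
    outcome = "ended silently"   # the 2005 behavior discussed in this thread
except RuntimeError:
    outcome = "RuntimeError"     # Python >= 3.7 (PEP 479)
except StopIteration:
    outcome = "ended silently"   # older Python 3.x without PEP 479 enabled
print(outcome)
```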


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python for search engine development

2005-10-08 Thread Alex Martelli
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Well, Google applies some Python in their implementation, see
> http://www-db.stanford.edu/~backrub/google.html

"Some" is correct.  As for writing a search engine in Python _only_,
hmmm -- I honestly don't know.  You could surely develop a working
implementation, but then, to make it perform well, you'd most likely
want to profile it and retune some CPU-intensive parts using pyrex, or
C.

So, if during your program development process you can find good
open-source C or C++ libraries offering a fast implementation of some of
the CPU-bound stuff you know you'll need, you would probably be better
off by wrapping those libraries (again using pyrex, or maybe SWIG, or
Boost Python for C++, ...) rather than redoing them from scratch in
Python (and probably later having to do some tuning on those parts).
One notably strong point of Python is that it "plays well with others",
and I would advise you to leverage this fact.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging sorted lists/iterators/generators into one stream of values...

2005-10-09 Thread Alex Martelli
George Sakkis <[EMAIL PROTECTED]> wrote:
   ...
> > manipulation of a heap to place an item in the right spot, but with 4-5
> > or a few more sources might not make an impact at all.
> 
> Unless you're talking about hundreds or thousands of sources, it probably
> won't. I would still go for the heap solution since IMO the resulting
> code is more readable and easier to understand.

I'm not so sure about either sentence...:

Helen:~/pynut/samp alex$ python merger.py --numstreams=10 --minlen=100
--how=S
Best time for 10 loops: 0.247116088867

Helen:~/pynut/samp alex$ python merger.py --numstreams=10 --minlen=100
--how=H
Best time for 10 loops: 0.10344004631

i.e., a heap solution may be over 4 times faster than a sort-based one
(in the following implementations).  Readability seems quite comparable
(skipping the rest of the infrastructure, which generates random sorted
streams and ensures a stream is exhausted and verifies it etc etc):

def merge_by_sort(streams):
  sources = [[s.next(), i, s.next] for i, s in enumerate(streams)]
  while sources:
    sources.sort(reverse=True)
    best_source = sources[-1]
    yield best_source[0]
    try: best_source[0] = best_source[-1]()
    except StopIteration: sources.pop()

def merge_by_heap(streams):
  sources = [[s.next(), i, s.next] for i, s in enumerate(streams)]
  heapq.heapify(sources)
  while sources:
    best_source = sources[0]
    yield best_source[0]
    try: best_source[0] = best_source[-1]()
    except StopIteration: heapq.heappop(sources)
    else: heapq.heapreplace(sources, best_source)


Hmmm, I wonder if something like merge_by_heap would be a good candidate
for itertools.  Raymond...?
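[As it happened, this wish was granted: heapq.merge() was added to the standard library in Python 2.6, and in any Python since then the whole exercise collapses to a one-liner:]

```python
import heapq

# Three already-sorted streams, merged lazily into one sorted stream.
streams = [iter([1, 4, 7]), iter([2, 5, 8]), iter([3, 6, 9])]
merged = list(heapq.merge(*streams))
print(merged)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```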


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about parsing a string

2005-10-10 Thread Alex Martelli
Nico Grubert <[EMAIL PROTECTED]> wrote:

> Hi there,
> 
> I would like to parse a string in Python.
> 
> If the string is e.g. '[url=http://www.whatever.org][/url]' I would like
> to generate this string:
> '<a href="http://www.whatever.org">http://www.whatever.org</a>'
> 
> If the string is e.g. '[url=http://www.whatever.org]My link[/url]' I 
> would like to generate this string:
> '<a href="http://www.whatever.org">My link</a>'
> 
> Any idea how I can do this? Maybe with regular expressions?

If you know the string always starts with '[url=' and ends with '[/url]'
(or, any strings not thus starting/ending are to be skipped, etc), REs
are a bit of an overkill (they'll work, but you can do it more simply).

If your actual needs are different, you'll have to express them more
explicitly.  But assuming the "given starting and ending" scenario:

_start = '[url='
_startlen = len(_start)
_end = '[/url]'
_endlen = len(_end)
def doit(s):
  if s[:_startlen] != _start: raise ValueError
  if s[-_endlen:] != _end: raise ValueError
  where_closebracket = s.index(']')
  url = s[_startlen:where_closebracket]
  txt = s[where_closebracket+1:-_endlen]
  if not txt: txt = url
  return '<a href="%s">%s</a>' % (url, txt)

I've just typed in this code without trying it out, but roughly it
should be what you want.
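[Editor's note: a self-contained re-typing of the same approach, tried out this time in modern Python, with str.partition doing the index bookkeeping; the two expected outputs come from the original question.]

```python
_start, _end = '[url=', '[/url]'

def doit(s):
    # Reject strings that lack the fixed delimiters.
    if not s.startswith(_start) or not s.endswith(_end):
        raise ValueError(s)
    # Everything between the delimiters is 'url]text' (text may be empty).
    url, _, txt = s[len(_start):-len(_end)].partition(']')
    return '<a href="%s">%s</a>' % (url, txt or url)

print(doit('[url=http://www.whatever.org][/url]'))
# <a href="http://www.whatever.org">http://www.whatever.org</a>
print(doit('[url=http://www.whatever.org]My link[/url]'))
# <a href="http://www.whatever.org">My link</a>
```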


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: What about letting x.( ... ? ... ) be equivalent to ( ... x ... )

2005-10-10 Thread Alex Martelli
al <[EMAIL PROTECTED]> wrote:

> And it solves a problem that in all object oriented languages, a method
> that processes 2 or more different classes of objects belongs to just one
> of those classes.

Your use of the word "all" in the phrase "all object oriented languages"
is erroneous.  There ARE several object-oriented languages which solve
this issue neatly and elegantly by using "multi-methods".  The most
easily accessible of those is probably still Dylan; see
 for more

I don't believe that Python will ever have multi-methods (any more than
I expect to see them in Java, C++ or C#), but that's no reason to forget
them:-).
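For the curious, the flavor of multi-methods can be sketched in a few lines of Python. This is only a toy (exact-type dispatch through a dict, no walking of base classes as a real system like Dylan's would do), and every name in it is invented for illustration:

```python
_rules = {}

def defmethod(*types):
    """Register a function under the tuple of argument classes."""
    def register(fn):
        _rules[types] = fn
        return fn
    return register

def collide(a, b):
    # Dispatch on the classes of BOTH arguments, not just the first.
    fn = _rules.get((type(a), type(b)))
    if fn is None:
        raise TypeError("no method for %r" % ((type(a), type(b)),))
    return fn(a, b)

class Asteroid: pass
class Ship: pass

@defmethod(Asteroid, Ship)
def _(a, s): return "ship destroyed"

@defmethod(Asteroid, Asteroid)
def _(a, b): return "rocks bounce"

print(collide(Asteroid(), Ship()))       # ship destroyed
print(collide(Asteroid(), Asteroid()))   # rocks bounce
```

The point is that the collision behavior belongs to the *pair* of classes, not arbitrarily to one of them.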


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: New Python book

2005-10-10 Thread Alex Martelli
hrh1818 <[EMAIL PROTECTED]> wrote:

> This book is not a new book. It is an updated version of Magnus's  2002
> Practical Python book.

Then it's probably a good book, because Practical Python sure was!


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging sorted lists/iterators/generators into one stream of values...

2005-10-10 Thread Alex Martelli
George Sakkis <[EMAIL PROTECTED]> wrote:
   ...
> > i.e., a heap solution may be over 4 times faster than a sort-based one
> > (in the following implementations).
> 
> Interesting; I thought timsort on small almost ordered lists would be
> practically as fast as the heap. Still, how is 0.10344 over 4 times faster
> than 0.247116 ?

Oops, 2.5 times (for 10 streams) -- the ratio keeps getting bigger with
the number of streams, the figure 4 I had in mind was probably from
comparisons with 50 streams or so.

timsort needs N comparisons even if the list is already ordered (just to
check it is) while heaps can do with log(N) comparisons or even less in
the best case -- that's the key issue here.

> >   sources = [[s.next(), i, s.next] for i, s in enumerate(streams)]
   ...
> Indeed, these are almost equivalent as far as readability goes; the
> previous examples in the thread were less clear. By the way, why do you
> need 'i' and enumerate above ?

Making sure that, if the first items are identical, the sorting (or
comparison for heap) does not compare the _methods_ -- no big deal if it
does, but that doesn't logically make sense.  If we had key= arguments,
that would ENSURE no such comparison, but heaps didn't support that in
2.4 and I wanted to keep the playing field level, so I used the above
old trick to ensure stable sorting without comparisons beyond a given
field (more details were explained in the 1st edition of the Cookbook,
but I hope I give the general idea).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging sorted lists/iterators/generators into one stream of values...

2005-10-10 Thread Alex Martelli
Mike C. Fletcher <[EMAIL PROTECTED]> wrote:
   ...
> One thing to keep in mind (if you care about performance) is that you
> one could use bisect, instead of sort, as the sorted list of streams is
> already in order save for the one element you are processing.  Btw, nice
> trick with reverse to reduce memory copying, when did that get 
> introduced?

Python 2.4

>  Wonder if bisect can deal with reverse-sorted elements.  

Not AFAIK.

> Anyway, should reduce the O-complexity of that part of the operation,
> though you'll still have to do a memcpy to shift the rest of the source
> list's array, and if it can't deal with reverse-sorted lists it would
> move you back to front-of-list popping.

Yeah, if you have to pop the front you're O(N) anyway, which is
basically what timsort does with nearly-ordered lists (though timsort
uses O(N) _comparisons_ while bisecting would use O(N) _moves_).


> Oh, we're still missing use of a comparison function in both versions.

Trivial, just have the functions accept a key= and use that to prepare
the auxiliary list they're using anyway.

> I'd think you'd want that functionality *available* if you're going to
> make this a general tool.  You'll also need to check for StopIteration
> on creation of sources for null sequences.

True, the functions as prepared don't accept empty streams (exactly
because they don't specialcase StopIteration on the first calls to
next).  Pretty trivial to remedy, of course.

> Finally, why  the 'i' 
> element?  It's never used AFAICS.

It's used by the list's lexicographic comparison when the first elements
of two lists being compared are equal (which can happen, of course) to
avoid comparing the last elements (which are methods... no big deal if
they get compared, but it makes no logical sense).

So, an example enhanced merge_by_sort:

 
def merge_by_sort(streams, key=None):

  if not key: key = lambda x: x
  sources = []
  for i, s in enumerate(streams):
    try: first_item = s.next()
    except StopIteration: pass
    else: sources.append([key(first_item), i, first_item, s.next])

  while sources:
    sources.sort(reverse=True)
    best_source = sources[-1]
    yield best_source[2]
    try: best_source[2] = best_source[-1]()
    except StopIteration: sources.pop()
    else: best_source[0] = key(best_source[2])


Of course, since the sort method DOES accept a key= parameter, this
could be simplified, but I'll leave it like this to make it trivial to
see how to recode the merging by heap as well (in 2.4)...


Alex

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A problem while using urllib

2005-10-11 Thread Alex Martelli
Johnny Lee <[EMAIL PROTECTED]> wrote:
   ...
>try:
>   webPage = urllib2.urlopen(url)
>except urllib2.URLError:
   ...
>webPage.close()
>return True
> 
> 
>But every time when I ran to the 70 to 75 urls (that means 70-75
> urls have been tested via this way), the program will crash and all the
> urls left will raise urllib2.URLError until the program exits. I tried
> many ways to work it out, using urllib, set a sleep(1) in the filter (I
> thought it was the massive urls crashed the program). But none works.
> BTW, if I set the url from which the program crashed to base url, the
> program will still crashed at the 70-75 url. How can I solve this
> problem? thanks for your help

Sure looks like a resource leak somewhere (probably leaving a file open
until your program hits some wall of maximum simultaneously open files),
but I can't reproduce it here (MacOSX, tried both Python 2.3.5 and
2.4.1).  What version of Python are you using, and on what platform?
Maybe a simple Python upgrade might fix your problem...


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to do *args, **kwargs properly

2005-10-11 Thread Alex Martelli
Lasse Vågsæther Karlsen <[EMAIL PROTECTED]> wrote:
   ...
> fn(1, 2, 3)
> fn(1, 2, 3, cmp=lambda x, y: y-x)
> fn(1, 2, 3, cpm=lambda x, y: y-x) # TypeError on this

I assume these are your specs.

> or is the "proper python" way simply this:
> 
> def fn(*values, **options):
>  if "cmp" in options: comparison = options["cmp"]
>  else: comparison = cmp
>  # rest of function here
> 
> and thus ignoring the wrong parameter names?

Errors should not pass silently, unless explicitly silenced.

So, I would code:

def fn(*values, **opts):
  comparison = opts.pop('cmp', cmp)
  if opts: raise ValueError, 'Unknown option(s): %s' % opts.keys()
  # rest of function here

There are some cases where ignoring extra options, or just warning about
them, may be more appropriate, but normally I would check...
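The same pop-then-check pattern in today's Python (cmp is gone in Python 3, so this sketch pops a hypothetical key option instead; raising TypeError mirrors what the interpreter itself does for unexpected keyword arguments):

```python
def fn(*values, **opts):
    key = opts.pop('key', None)   # consume each option you recognize...
    if opts:                      # ...then anything left over is a typo
        raise TypeError('Unknown option(s): %s' % ', '.join(opts))
    return sorted(values, key=key)

print(fn(3, 1, 2))                      # [1, 2, 3]
print(fn(3, 1, 2, key=lambda x: -x))    # [3, 2, 1]
try:
    fn(3, 1, 2, kye=lambda x: -x)       # misspelled option: loudly rejected
except TypeError as e:
    print(e)                            # Unknown option(s): kye
```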


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Looking for info on Python's memory allocation

2005-10-11 Thread Alex Martelli
Steven D'Aprano <[EMAIL PROTECTED]> wrote:
   ...
> > s = [k for k in iterable]
> > 
> > if I know beforehand how many items iterable would possibly yield, would
> > a construct like this be faster and "use" less memory?
> > 
> > s = [0] * len(iterable)
> > for i in xrange(len(iterable)):
> >  s[i] = iterable.next()
> 
> Faster? Maybe. Only testing can tell -- but I doubt it. But as for less

Testing, of course, is trivially easy:

Helen:/tmp alex$ python -mtimeit -s'i=xrange(7)' 'L=[x for x in i]'
10 loops, best of 3: 6.65 usec per loop

Helen:/tmp alex$ python -mtimeit -s'i=xrange(7)' 'L=list(i)' 
10 loops, best of 3: 5.26 usec per loop

So, using list() instead of that silly list comprehension does have an
easily measurable advantage.  To check the complicated option...:

Helen:/tmp alex$ python -mtimeit -s'i=xrange(7)' '
s = [0]*7
u = iter(i)
for x in xrange(7):
  s[x] = u.next()
'
1 loops, best of 3: 24.7 usec per loop

So, the "advantage" of all the complications is to slow down execution
by about four times -- net of all the juicy possibilities for confusion
and errors (there is no common Python type on which you can both call
len(...) AND the .next() method, for example -- a combination which
really makes no sense).

*SIMPLICITY* is the keyword here -- list(...) is by far the simplest, and
almost five times faster than the baroque gyrations above...


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-11 Thread Alex Martelli
Tom Anderson <[EMAIL PROTECTED]> wrote:
   ...
> Has anyone looked into using a real GC for python? I realise it would be a

If you mean mark-and-sweep, with generational twists, that's what gc
uses for cyclic garbage.

> lot more complexity in the interpreter itself, but it would be faster,
> more reliable, and would reduce the complexity of extensions.

???  It adds no complexity (it's already there), it's slower, it is, if
anything, LESS reliable than reference counting (which is way simpler!),
and (if generalized to deal with ALL garbage) it might make it almost
impossible to write some kinds of extensions (ones which need to
interface existing C libraries that don't cooperate with whatever GC
collection you choose).  Are we talking about the same thing?!
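The division of labor is easy to see from Python code: reference counting alone can never free a cycle, and the gc module's collector exists precisely to catch what refcounting misses:

```python
import gc

class Node(object):
    pass

gc.collect()             # start from a clean slate
a, b = Node(), Node()
a.peer, b.peer = b, a    # a reference cycle: each keeps the other's count at 1
del a, b                 # now unreachable, yet no refcount ever reached zero

found = gc.collect()     # the cyclic collector spots and frees the pair
print(found)             # typically 4 in CPython: the two Nodes plus their __dict__s
```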


> So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
> Fair enough - the performance gain is nice, but the extra complexity would
> be a huge pain, i imagine.

CPython currently is implemented on a strict "minimize all tricks"
strategy.  There are several other implementations of the Python
language, which may target different virtual machines -- Jython for JVM,
IronPython for MS-CLR, and (less mature) stuff for the Parrot VM, and
others yet from the pypy project.  Each implementation may use whatever
strategy is most appropriate for the VM it targets, of course -- this is
the reason behind Python's refusal to strictly specify GC semantics
(exactly WHEN some given garbage gets collected)... to allow such
multiple implementations leeway in optimizing behavior for the target VM(s).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Adding a __filename__ predefined attribute to 2.5?

2005-10-12 Thread Alex Martelli
Rune Strand <[EMAIL PROTECTED]> wrote:

> > those modules are already imported when Python gets to your code, so
> > the only "overhead" you're saving is a little typing.
> 
> I don't understand this. Could you please elaborate?  - if sys or os
> are not imported for any other causes how are they already imported?

The "other causes" are always present, since these modules include
functionality Python always needs -- a fact that is no big secret,
either.  For example: you know (I assume and hope) that Python's import
statement finds files to import along directories (and zipfiles) listed
in sys.path -- so how do you think 'sys' itself can possibly get
'imported' in the first place, since the import mechanism depends on one
of sys's attributes...?  Answer: sys is a built-in module, compiled into
the Python interpreter itself.  There are several, see
sys.builtin_module_names for a list.  'os' is a slightly different case --
it's not built-in, but gets imported anyway during startup because other
modules need it anyway.  In my build of Python 2.4.1, there are 16
built-in modules, and 9 others that aren't built-in but get imported at
startup -- check sys.modules on your version for the total number of
modules that are in memory by the time any code of yours runs.  (This is
with -S to inhibit Python from reading the site.py module, otherwise you
might get more... but never, I believe, could you get fewer).

> Maybe I'm wrong here, and accessing the filesystem and reading the
> module into memory represents no cost. My mistake, in that case.

Your mistake is due to a different case: the filesystem access and
reading have ALREADY happened (for os; for sys, as I explained, the
mechanism is different).  Therefore, an import does NO such access and
reading -- it just gets the module object from (e.g.) sys.modules['os'].
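Both halves of this are easy to check from any CPython prompt:

```python
import sys

# 'sys' has no .py file at all -- it is compiled into the interpreter:
print('sys' in sys.builtin_module_names)   # True

# A module already loaded is cached in sys.modules, so a later
# 'import os' is just a dictionary lookup, not a filesystem read:
import os
print(sys.modules['os'] is os)             # True
```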


> > wow.  that's one lousy optimization...
> 
> > here's a *shorter* piece of code, which is also readable and portable, and
> > a lot easier to type on most keyboards:
> 
> >   import os
> >  __filename__ = os.path.basename(__file__)
> 
> It may be lousy, but it requires no imports. And, as I said in the

The point is that there's no substantial advantage to "requiring no
imports".  Python, if anything, already has too many built-ins -- once
backwards compatibility can be broken (i.e., in 3.0), many of them
should be moved to standard library modules, "requiring imports" (which
are cheap operations anyway).  The desire for MORE built-ins with no
real advantage (since "requiring no imports" ISN'T a true advantage) is
definitely misplaced.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Adding a __filename__ predefined attribute to 2.5?

2005-10-13 Thread Alex Martelli
Rune Strand <[EMAIL PROTECTED]> wrote:

> Ok, Alex. I know a good explanation when I see one. Thanks!

You're welcome!  I've tried to give good (but shorter!-) explanations in
the Nutshell, too, but of course it's easier to aim a specific
explanation to a specific questioner than to try and clarify
"everything" for "everybody" (particularly because, when posting, I'm
not forced to be concise as I am when writing books or articles, so I
can aim more relentlessly for completeness and precision...).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Alex Martelli
Tom Anderson <[EMAIL PROTECTED]> wrote:

> On Tue, 11 Oct 2005, Alex Martelli wrote:
> 
> > Tom Anderson <[EMAIL PROTECTED]> wrote:
> >   ...
> >> Has anyone looked into using a real GC for python? I realise it would be a
> >
> > If you mean mark-and-sweep, with generational twists,
> 
> Yes, more or less.
> 
> > that's what gc uses for cyclic garbage.
> 
> Do you mean what python uses for cyclic garbage? If so, i hadn't realised

Yes, gc (a standard library module) gives you access to the mechanism
(to some reasonable extent).

> that. There are algorithms for extending refcounting to cyclic structures
> (i forget the details, but you sort of go round and experimentally 
> decrement an object's count and see it ends up with a negative count or
> something), so i assumed python used one of those. Mind you, those are
> probably more complex than mark-and-sweep!

Not sure about that, when you consider the "generational twists", but
maybe.


> >> lot more complexity in the interpreter itself, but it would be faster,
> >> more reliable, and would reduce the complexity of extensions.
> >
> > ???  It adds no complexity (it's already there), it's slower,
> 
> Ah. That would be why all those java, .net, LISP, smalltalk and assorted
> other VMs out there, with decades of development, hojillions of dollars
> and the serried ranks of some of the greatest figures in computer science
> behind them all use reference counting rather than garbage collection,
> then.
> 
> No, wait ...

Not everybody agrees that "practicality beats purity", which is one of
Python's principles.  A strategy based on PURE reference counting just
cannot deal with cyclic garbage -- you'd also need the kind of kludges
you refer to above, or a twin-barreled system like Python's.  A strategy
based on PURE mark-and-sweep *CAN* be complete and correct... at the
cost of horrid delays, of course, but what's such a practical
consideration to a real purist?-)

In practice, more has probably been written about garbage collection
implementations than about almost every issue in CS (apart from sorting
and searching;-).  Good techniques need to be "incremental" -- the need
to "stop the world" for unbounded amounts of time (particularly in a
paged virtual memory world...), typical of pure m&s (even with
generational twists), is simply unacceptable in all but the most "batch"
type of computations, which occupy a steadily narrowing niche.
Reference counting is intrinsically "reasonably incremental"; the
worst-case of very long singly-linked lists (such that a dec-to-0 at the
head causes a cascade of N dec-to-0's all along) is as rare in Python as
it is frequent in LISP (and other languages that go crazy with such
lists -- Haskell, which defines *strings* as singly-linked lists of
characters, being a particularly egregious example) [[admittedly, the
techniques for amortizing the cost of such worst-cases are well known in
any case, though CPython has not implemented them]].

In any case, if you like Python (which is a LANGUAGE, after all) and
don't like one implementation of it, why not use a different
implementation, which uses a different virtual machine?  Jython, for the
JVM, and IronPython, for MSCLR (presumably what you call ".net"), are
quite usable; project pypy is producing others (an implementation based
on Common LISP was one of the first practical results, over a year ago);
not to count Parrot, and other projects yet...


> > it is, if anything, LESS reliable than reference counting (which is way
> > simpler!),
> 
> Reliability is a red herring - in the absence of ill-behaved native 
> extensions, and with correct implementations, both refcounting and GC are
> perfectly reliable. And you can rely on the implementation being correct,
> since any incorrectness will be detected very quickly!

Not necessarily: tiny memory leaks in supposedly "stable" versions of
the JVM, for example, which get magnified in servers operating for
extremely long times and on very large scales, keep turning up.  So, you
can't count on subtle and complicated implementations of garbage
collection algorithms being correct, any more than you can count on that
for (for example) subtle and complicated optimizations -- corner cases
can be hidden everywhere.

There are two ways to try to make a software system reliable: make it so
simple that it obviously has no bugs, or make it so complicated that it
has no obvious bugs.  RC is definitely tilted towards the first of the
two options (and so would be mark-and-sweep in the pure form, the one
where you may need to stop everything for a LONG time once in a while),
while more sophisticated GC schemes get more and more complicated.

Re: confusion between global names and instantiated object variable names

2005-10-14 Thread Alex Martelli
wanwan <[EMAIL PROTECTED]> wrote:
   ...
> when I run my example, an error shows:
> "NameError: global name'menubar' is not defined"
> 
> I wonder why it doesn't work.  Isn't that the way to define an object
> variable?  

The code you posted should not trigger this error.  Most likely problem:
you have typed a comma where you meant to type a dot, for example
instead of self.menubar you wrote self,menubar somewhere -- it's a hard
error to spot with certain fonts.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem splitting a string

2005-10-15 Thread Alex Martelli
Steven D'Aprano <[EMAIL PROTECTED]> wrote:
   ...
> You can *almost* do that as a one-liner:

No 'almost' about it...

> L2 = [item.split('_') for item in mystr.split()]
> 
> except that gives a list like this:
> 
> [['this', 'NP'], ['is', 'VL'], ['funny', 'JJ']]
> 
> which needs flattening. 

because the flattening is easy:

[ x for word in mystr.split() for x in word.split('_') ]
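Concretely, with the for clauses ordered outermost-first (the same left-to-right order as nested for statements):

```python
mystr = 'this_NP is_VL funny_JJ'

# outer loop over the whitespace-split words comes first,
# inner loop over each word's '_'-split pieces comes second
L2 = [x for word in mystr.split() for x in word.split('_')]
assert L2 == ['this', 'NP', 'is', 'VL', 'funny', 'JJ']
```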


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem splitting a string

2005-10-15 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:
   ...
> A third alternative is to split once, then split the substrings a
> second time and stitch the results back together:
> 
> >>> sum([x.split('_') for x in mystr.split()], [])
> ['this', 'NP', 'is', 'VL', 'funny', 'JJ']
> 
> Which is probably slow. To bad extend doesn't take multiple arguments.

Using sum on lists is DEFINITELY slow -- avoid it like the plague.

If you have a list of lists LOL, DON'T use sum(LOL, []), but rather

[x for y in LOL for x in y]
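Note the clause order: the outer loop over LOL must come first, exactly as in nested for statements. itertools.chain (in the standard library since 2.3) performs the same linear-time flattening, avoiding the repeated list copying that makes sum(LOL, []) quadratic:

```python
from itertools import chain

LOL = [['this', 'NP'], ['is', 'VL'], ['funny', 'JJ']]

flat = [x for y in LOL for x in y]   # outer loop over LOL first
assert flat == list(chain(*LOL))     # chain: no quadratic re-copying
assert flat == ['this', 'NP', 'is', 'VL', 'funny', 'JJ']
```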


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some set operators

2005-10-15 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:

> Sometimes I suggest to add things to the language (like adding some set
> methods to dicts), but I've seen that I tend to forget the meaning of
> six set/frozenset operators:
> 
> s & t  s &= t
> s | t  s |= t
> s ^ t  s ^= t
> 
> My suggestion is to remove them, and keep them only as explicit
> non-operator versions (.symmetric_difference(), .update(),
> .intersection_update(), etc). But maybe now it's too much late to
> remove them... Maybe someone gentle can explain me the advantage of
> having/keeping them.

Helen:~ alex$ python2.4 -mtimeit -s's1=s2=set()' 's1&s2'
1000000 loops, best of 3: 0.929 usec per loop

Helen:~ alex$ python2.4 -mtimeit -s's1=s2=set()' 's1.intersection(s2)'
1000000 loops, best of 3: 1.28 usec per loop

Besides avoiding the need for a name look-up, and thus extracting a tiny
speed-up of about 0.35 microseconds or so, I can't think of advantages
for the infix operator form (but then, I can't think of any advantages
for type *int* having the same operators, with bitwise-logic semantics,
rather than placing them only in some library module).
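For reference, each of these operators pairs off with a named method (a quick sketch):

```python
s1, s2 = set('abcd'), set('cdef')

assert s1 & s2 == s1.intersection(s2) == set('cd')
assert s1 | s2 == s1.union(s2) == set('abcdef')
assert s1 ^ s2 == s1.symmetric_difference(s2) == set('abef')

s3 = set('abcd')
s3 &= s2                         # in-place form: intersection_update
assert s3 == set('cd')
```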

I still vaguely hope that in 3.0, where backwards incompatibilities can
be introduced, Python may shed some rarely used operators such as these
(for all types, of course).  As long as the operators are there for
ints, it makes sense to have them apply to sets as well, of course.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some set operators

2005-10-16 Thread Alex Martelli
Giovanni Bajo <[EMAIL PROTECTED]> wrote:

> Alex Martelli <[EMAIL PROTECTED]> wrote:
> 
> > I still vaguely hope that in 3.0, where backwards incompatibilities
> > can be introduced, Python may shed some rarely used operators such as
> > these (for all types, of course).
> 
> I hope there is no serious plan to drop them. There is nothing wrong in having
> such operators, and I wouldn't flag bit operations as "rarely used". They are
> very common when calling C-based API and other stuff. I know I use them very
> often. They have a clear and well-understood meaning, as they appear identical
> in other languages, including the widely-spread C and C++.

Well, C and C++ don't have unbounded-length integers, nor built-in sets,
so the equivalence is slightly iffy; and the precedence table of
operators in Python is not identical to that in C/C++.  As for frequency
of use, that's easily measured: take a few big chunks of open-source
Python code, starting with the standard library (which does a lot of
"calling C-based API and other stuff") and widespread applications such
as mailman and spambayes, and see what gives.

But the crux of our disagreement lies with your assertion that there's
nothing wrong in having mind-boggling varieties and numbers of
operators, presumably based on the fact that C/C++ has almost as many.

I contend that having a huge number of operators (and other built-ins)
goes against the grain of Python's simplicity, makes Python
substantially harder to teach, and presents no substantial advantages
when compared to the alternative of placing that functionality in a
built-in module (possibly together with other useful bit-oriented
functionality, such as counts of ones/zeros, location of first/last
one/zero bit, formatting into binary, octal and hexadecimal, etc).

As for "serious plans", it's been a while since I checked PEP 3000, but
I don't think it addresses this issue one way or another -- yet.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to get a raised exception from other thread

2005-10-17 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:

> Nevermind.  I found a better solution.  I used shared memory to create
> a keep-alive flag.  I then use the select function with a specified
> timeout, and recheck the keep-alive flag after each timeout.

Definitely a better architecture.  Anyway, one supported way for a
thread to raise an exception in a different thread is function
thread.interrupt_main(), which raises a KeyboardInterrupt in the *main*
thread (the one thread that's running at the beginning of your program).

There's also a supported, documented function to raise any given
exception in any existing thread, but it's deliberately NOT directly
exposed to Python code -- you need a few lines of  C-coded extension (or
pyrex, ctypes, etc, etc) to get at the functionality.  This small but
non-null amount of "attrition" was deliberately put there to avoid
casual overuse of a facility intended only to help in very peculiar
cases (essentially in debuggers &c, where the thread's code may be buggy
and fail to check a keep-alive flag correctly...!-).
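A minimal sketch of that keep-alive architecture, using threading.Event as the shared flag (spelled isSet()/set() in the 2.4 of this thread; Event.wait with a timeout stands in here for the select call with a timeout):

```python
import threading

stop = threading.Event()              # shared keep-alive flag

def worker():
    while not stop.is_set():          # re-check the flag after each timeout
        stop.wait(0.01)               # stands in for select() with a timeout
    # clean shutdown happens here

t = threading.Thread(target=worker)
t.start()
stop.set()                            # request termination...
t.join(2.0)
assert not t.is_alive()               # ...and the thread exits promptly
```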


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-17 Thread Alex Martelli
Christian Stapfer <[EMAIL PROTECTED]> wrote:

>  This is why we would like to have a way of (roughly)
> estimating the reasonableness of the outlines of a
> program's design in "armchair fashion" - i.e. without
> having to write any code and/or test harness.

And we would also like to consume vast amounts of chocolate, while
similarly reclining in comfortable armchairs, without getting all fat
and flabby.  Unfortunately, what we would like and what reality affords
are often pretty uncorrelated.  No matter how much theoreticians may
love big-O because it's (relatively) easy to compute, it still has two
failings which are often sufficient to rule out its sufficiency for any
"estimate [of] the reasonableness" of anything: [a] as we operate on
finite machines with finite wordsize, we may never be able to reach
anywhere even remotely close to the "asymptotic" region where big-O has
some relationship to reality; [b] in many important cases, the
theoretical worst-case is almost impossible to characterize and hardly
ever reached in real life, so big-O is of no earthly use (and much
harder to compute measures such as big-Theta should be used for just
about any practical purpose).

Consider, for example, point [b].  Quicksort's big-O is N squared,
suggesting that quicksort's no better than bubblesort or the like.  But
such a characterization is absurd.  A very naive Quicksort, picking its
pivot very systematically (e.g., always the first item), may hit its
worst case just as systematically and in cases of practical importance
(e.g., already-sorted data); but it takes just a little extra care (in
the pivot picking and a few side issues) to make the worst-case
occurrences into ones that will not occur in practice except when the
input data has been deliberately designed to damage by a clever and
determined adversary.

Designing based on worst-case occurrences hardly ever makes sense in any
field of engineering, and blind adherence to worst-case assessments can
be an unmitigated disaster, promoting inferior technology just because,
in the WORST imaginable case, the best available technology would fare
no better than the inferior one (even though in 99.9% of cases the
best technology would perform better, if you're designing based on
worst-case analyses you may not even NOTICE that -- and NEVER, *NEVER*
forget that big-O is nothing BUT "extreme-worst-case" analysis!).  Why
bother using prestressed concrete, when, should a large asteroid score a
direct hit, the costly concrete will stand up no better than cheap
bricks, or, for that matter, slightly-damp straw?  Why bother doing
(e.g.) random pivot selection in quicksort, when its big-O (i.e.,
worst-case) behavior will remain N-squared, just like naive quicksort,
or, for that matter, bubblesort?
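A sketch of the point about pivot picking: with a random pivot, already-sorted input (the classic killer for first-item pivoting) behaves just like any other input, even though the big-O worst case is formally unchanged:

```python
import random

def quicksort(seq):
    # a random pivot makes the N**2 worst case vanishingly unlikely,
    # even on already-sorted input that cripples "always first item" pivoting
    if len(seq) <= 1:
        return list(seq)
    pivot = random.choice(seq)
    return (quicksort([x for x in seq if x < pivot])
            + [x for x in seq if x == pivot]
            + quicksort([x for x in seq if x > pivot]))

data = list(range(200))               # already sorted: the naive worst case
assert quicksort(data) == data
random.shuffle(data)
assert quicksort(data) == sorted(data)
```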


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Queue question

2005-10-17 Thread Alex Martelli
Steve M <[EMAIL PROTECTED]> wrote:

> According to my "Python in a Nutshell":
> 
> q.get(block=True)
> 
> is the signature, so, as you use it above, the call will hang until
> something is on the queue. If block is false and the queue is empty,
> q.get() will raise the exception Empty.
> 
> q.get_nowait is apparently synonymous with q.get(block=False)

Yep.  Nowadays you can also have an optional timeout= argument to the
.get method, to obtain the Empty exception only after the get attempt
has waited for some time for some item to arrive.
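A sketch of that timeout= behavior (the module is named Queue, capitalized, in the 2.x of this thread; it is spelled queue below):

```python
import queue                          # "Queue" in Python 2.x

q = queue.Queue()
raised = False
try:
    q.get(block=True, timeout=0.05)   # wait up to 0.05s for an item
except queue.Empty:
    raised = True                     # no item arrived: Empty, not a hang
assert raised
```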


> q.not_empty, if it existed, I expect would be true just in case there
> was at least one item in the queue. But according to my book there is
> q.empty and q.full, which is true when the queue has the maximum
> allowed number of items (a value specified when the queue is created).

not_empty and not_full are not methods but rather instances of the
threading.Condition class, which gets waited on and notified
appropriately.  I'm not entirely sure exactly WHAT one is supposed to do
with the Condition instances in question (I'm sure there is some design
intent there, because their names indicate they're public); presumably,
like for the Lock instance named 'mutex', they can be used in subclasses
that do particularly fiendish things... but I keep planning not to cover
them in the 2nd edition of the Nutshell (though there I _will_ cover the
idea of subclassing Queue to implement queueing disciplines other than
FIFO without needing to worry about synchronization, which I had skipped
in the 1st edition).

 
> Also, I don't think you can rely on q.empty in the way you may expect.
> For example, another thread can empty the queue between the time you
> test whether q.empty is false and the time you call q.get.

Absolutely true.  "Anything can happen" right after you call q.empty(),
so the existence of that method isn't a good idea (it's somewhat of an
"attractive nuisance" -- its existence prompts some programmers who
don't read docs to try and use it, possibly producing unreliable code
which works when tested but silently breaks in real-life use).
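The reliable pattern is to attempt the get and handle the exception, rather than test emptiness first -- a sketch (module spelled Queue in the 2.x of this thread):

```python
import queue

q = queue.Queue()
q.put('job')
drained = []
while True:
    try:
        drained.append(q.get_nowait())   # just try; no q.empty() pre-check
    except queue.Empty:
        break                            # no window for another thread to race us
assert drained == ['job']
```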


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problems with properties

2005-10-17 Thread Alex Martelli
Michael Schneider <[EMAIL PROTECTED]> wrote:

> Thanks to all,  I added the object as a subclass (should this be 
> required for 2.4.1 ???)

It _IS_ required, because Python these days moves *very slowly indeed*
before changing semantics of existing code in any way that is not
backwards compatible -- we just don't want to break good working code,
and there are many millions of lines' worth of such Python code in use
around the world.  The newstyle object model cannot have identical
semantics to the legacy (AKA "classic") one, so it can't become the
default without LOTS AND LOTS of years spent with the old-style mode
being discouraged and deprecated... but STILL the default.

As usual, "should" is a harder question to answer -- one might
reasonably say that maintainers of legacy code have had PLENTY of
warning time by now, and newstyle OM "should" become the default by,
say, 2.6, rather than waiting for 3.0.  But, that's not an easy issue to
call!


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: generic xmerge ?

2005-10-17 Thread Alex Martelli
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I was reading this recipe and am wondering if there is a generic
> version of it floating around ? My list is a tuple (date, v1, v2, v3)
> and I would like it to sort on date. The documentation doesn't mention
> how the items are compared and the example only use integers.
> 
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/141934

I'm not sure what "my list is a tuple" means (list and tuple being
different types) nor what this has to do with the recipe.  Anyway...
sequences are compared lexicographically -- first items first, then
second items if the first items are equal, and so on.  So, if you have a
list X whose items are tuples and want X sorted on the tuples' first items,
X.sort() will suffice -- if the tuples never have equal first-items, or
if you're OK with second-items getting compared when the first-items are
equal.  If you want to sort on first-items ONLY, leaving the tuples in
the same order in the list when their first-items are equal:

import operator
X.sort(key=operator.itemgetter(0))
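For instance (list.sort is stable, so items with equal keys keep their original relative order):

```python
import operator

X = [(2, 'b'), (1, 'z'), (2, 'a')]
X.sort(key=operator.itemgetter(0))    # sort on first items ONLY
assert X == [(1, 'z'), (2, 'b'), (2, 'a')]   # the two 2-tuples kept their order
```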


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: generic xmerge ?

2005-10-17 Thread Alex Martelli
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> oops, sorry. I meant
> 
> l1=[(date,v1,v2,v3), ...]
> l2=[ another set of tuples ]
> 
> Thanks. so I have to concat the multiple lists first(all of them are
> sorted already) ?

You can do it either way -- simplest, and pretty fast, is to concatenate
them all and sort the result (the sort method is very good at taking
advantage of any sorting that may already be present in some parts of
the list it's sorting), but you can also try a merging approach.  E.g.:


import heapq

def merge_by_heap(*lists):
  sources = [[s.next(), i, s.next]
for i, s in enumerate(map(iter,lists))]
  heapq.heapify(sources)
  while sources:
best_source = sources[0]
yield best_source[0]
try: best_source[0] = best_source[-1]()
except StopIteration: heapq.heappop(sources)
else: heapq.heapreplace(sources, best_source)

Now, L=list(merge_by_heap(l1, l2, l3)) should work, just as well as,
say, L = sorted(l1+l2+l3).  I suspect the second approach may be faster,
as well as simpler, but it's best to _measure_ (use the timeit.py module
from the standard library) if this code is highly speed-critical for
your overall application.
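For the record, a generator along these lines was later added to the standard library as heapq.merge, in Python 2.6:

```python
import heapq

l1 = [(1, 'a'), (4, 'd')]
l2 = [(2, 'b'), (3, 'c'), (5, 'e')]
merged = list(heapq.merge(l1, l2))    # lazy k-way merge of sorted inputs
assert merged == sorted(l1 + l2)
```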


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Run process with timeout

2005-10-17 Thread Alex Martelli
Natan <[EMAIL PROTECTED]> wrote:

> Hi.
> 
> I have a python script under linux where I poll many hundreds of
> interfaces with mrtg every 5 minutes. Today I create some threads and
> use os.system(command) to run the process, but some of them just hang.
> I would like to terminate the process after 15 seconds if it doesn't
> finish, but os.system() doesn't have any timeout parameter.
> 
> Can anyone help me on what can I use to do this?

Use the subprocess module.  With a subprocess.Popen object, you can then
sleep a while, check (with .poll()) if it's finished, otherwise kill it
(use its .pid attribute).
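A sketch of that poll-then-kill loop (Popen.kill only appeared in 2.6; 2.4-era code would use os.kill on the .pid attribute, as the comment notes):

```python
import subprocess, sys, time

# spawn a child that would run for 10s, then enforce a much shorter deadline
p = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(10)'])
deadline = time.time() + 1.0
while p.poll() is None and time.time() < deadline:
    time.sleep(0.05)                  # sleep a while, then re-check
if p.poll() is None:
    p.kill()                          # 2.4-era: os.kill(p.pid, signal.SIGKILL)
p.wait()                              # reap the child
assert p.returncode is not None
```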


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: generic xmerge ?

2005-10-17 Thread Alex Martelli
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> million thanks. So the default compare funcion of heapq also do it like
> sort ?

By default, all comparisons in Python occur by the same mechanisms: by
preference, specific comparison operators such as < , <= , and so on
(corresponding to special methods __lt__, __le__, and so on) -- missing
that, the three-way comparison done by built-in function cmp
(corresponding to special method __cmp__).  For built-in sequences, in
particular (both tuples and lists), the comparisons are lexicographical.
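A few concrete cases of that lexicographic rule:

```python
assert (1, 'b') < (2, 'a')          # first items decide
assert (1, 'a') < (1, 'b')          # tie on first item: second items compared
assert [1, 2] < [1, 2, 3]           # a proper prefix sorts before the longer one
assert ('a',) < ('a', 0)
```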

Some (but not all) occasions that imply comparisons let you specify
something else than the default, for example by such mechanisms as the
cmp= optional argument to .sort (which tends to have unpleasant
performance impacts) and the key= optional argument (which tends to have
good performance impact).  heapq does not offer this feature in today's
Python (i.e., 2.4), but I believe it's planned to have it in the future
2.5 release.

But in your case the default comparisons appear to be OK, so you should
have no problem either way.


> The size of the list is not very large but has the potential of being
> run many times(web apps). So I believe second one should be faster(from
> the app perspective) as it goes into the optimized code quickly without
> all the overheads in the merge case.

Yes, the simpler solution may well perform better.  Note that:

L = list(l1)
L.extend(l2)
L.extend(l3)
L.sort()

may perform better than L = sorted(l1+l2+l3) -- if speed matters a lot,
be sure to try (and measure!) both versions.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Dynamic generation of doc-strings of dynamically generated classes

2005-10-17 Thread Alex Martelli
Mikael Olofsson <[EMAIL PROTECTED]> wrote:
   ...
> Any ideas? Am I stuck with the clumsy exec-solution, or are there other
> ways to dynamically generate doc-strings of classes?

The best way to make classes on the fly is generally to call the
metaclass with suitable parameters (just like, the best way to make
instances of any type is generally to call that type):

derived = type(base)('derived', (base,), {'__doc__': 'zipp'})
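For instance, with base a plain newstyle class (so its metaclass is just type):

```python
class Base(object):
    """original docstring"""

# calling the metaclass builds the new class, docstring included, in one step
Derived = type(Base)('Derived', (Base,), {'__doc__': 'zipp'})
assert Derived.__doc__ == 'zipp'
assert Derived.__name__ == 'Derived'
assert issubclass(Derived, Base)
```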


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reading hebrew text file

2005-10-17 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:

> I have a hebrew text file, which I want to read in python
> I don't know which encoding I need to use & how I do that

As for the "how", look to the codecs module -- but if you don't know
what codec the textfile is written in, I know of no ways to guess from
here!-)
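A sketch of the codecs-based reading, assuming (purely for illustration) that the file is utf-8; Hebrew text files may equally well be cp1255 or iso-8859-8:

```python
import codecs, os, tempfile

# write a small Hebrew sample so the read-back can be demonstrated
fd, path = tempfile.mkstemp()
os.close(fd)
f = codecs.open(path, 'w', encoding='utf-8')
f.write(u'\u05e9\u05dc\u05d5\u05dd')          # "shalom" in Hebrew letters
f.close()

f = codecs.open(path, 'r', encoding='utf-8')  # decode while reading
text = f.read()
f.close()
os.remove(path)
assert text == u'\u05e9\u05dc\u05d5\u05dd'
```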


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-17 Thread Alex Martelli
Christian Stapfer <[EMAIL PROTECTED]> wrote:

> "Alex Martelli" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
> > Christian Stapfer <[EMAIL PROTECTED]> wrote:
> >
> >>  This is why we would like to have a way of (roughly)
> >> estimating the reasonableness of the outlines of a
> >> program's design in "armchair fashion" - i.e. without
> >> having to write any code and/or test harness.
> >
> > And we would also like to consume vast amounts of chocolate, while
> > similarly reclining in comfortable armchairs,
> 
> Maybe some of my inclination towards design
> based on suitable *theories* (instead of
> self-conditioning through testing) goes back
> to the fact that I tend to think about the
> design of my programs when no computer happens
> to be near at hand to do some such experimenting,
> or self-conditioning...

Oh, I am as prone as anybody I know to do SW architecture and design in
bed when the lights are off and I'm sliding into sleep -- just about the
only case in which no computer is handy, or, rather, in which it's
generally unwise to turn the computer on (since it would interfere with
the sleep thing;-).  Back before laptops were really affordable and
usable, I used to have a long bus commute, and did a lot of design with
pen and paper; and whiteboards are a popular group-design tool at
Google, no matter how many laptops or desktops happen to be around --
whiteboards are simply more suitable for "socialization" around a draft
design's sketch, than any computer-based tool I've ever seen.

But that's *design*, and most often in pretty early stages, too -- quite
a ways from *coding*.  At that stage, one doesn't even generally commit
to a specific programming language or other for the eventual
implementation of the components one's considering!  Rough ideas of
*EXPECTED* run-times (big-Theta) for various subcomponents one is
sketching are *MUCH* more interesting and important than "asymptotic 
worst-case for amounts of input tending to infinity" (big-O) -- for
example, where I sketch-in (mentally, on paper, or on whiteboard) a
"hash table" subcomponent, I consider the *expected* (Theta) performance
(constant-time lookups), definitely NOT the big-O "linear time" lookups
which just MIGHT occur (if, say, all inputs just happened to hash to the
same value)... otherwise, I'd never use hash tables, right?-)


> > without getting all fat and flabby.
> 
> Well, thinking can be hard work. There is no need
> to suggest an image of laziness. Thought experiments
> are also quite often successful. Hardware engineers
> can design very often entire gadgets without doing
> a great deal of testing. They usually need to resort
> to testing only if they know (or feel?) not to have
> a sufficiently clear *theoretical* grasp of the
> behavior of some part of their design.

Having been a hardware designer (of integrated circuits, for Texas
Instruments, and later briefly for IBM), before switching to software, I
can resolutely deny this assertion: only an utter madman would approve a
large production run of an IC who has not been EXTENSIVELY tested, in
simulations and quite possibly in breadboards and later in limited
pre-production runs.  And any larger "gadget" USING ICs would be
similarly crazy to skimp on prototyping, simulation, and other testing
-- because, as every HW engineer KNOWS (SW ones often have to learn the
hard way), the distance between theory and practice, in practice, is
much larger than the distance between practice and theory should be in
theory;-).


> >  Unfortunately, what we would like and what reality affords
> > are often pretty uncorrelated.  No matter how much theoreticians may
> > love big-O because it's (relatively) easy to compute, it still has two
> > failings which are often sufficient to rule out its sufficiency for any
> > "estimate [of] the reasonableness" of anything: [a] as we operate on
> > finite machines with finite wordsize, we may never be able to reach
> > anywhere even remotely close to the "asymptotic" region where big-O has
> > some relationship to reality; [b] in many important cases, the
> > theoretical worst-case is almost impossible to characterize and hardly
> > ever reached in real life, so big-O is of no earthly use (and much
> > harder to compute measures such as big-Theta should be used for just
> > about any practical purpose).
> 
> But the fact remains that programmers, somewhat
> experienced with the interface a module offers,
> have a *rough*idea* of that computational complexity
> attaches to what operations of that interface.
> And having such a *rough*idea* helps them to
&

Re: Run process with timeout

2005-10-17 Thread Alex Martelli
Micah Elliott <[EMAIL PROTECTED]> wrote:

> Is there any way to enable Python's subprocess module to do (implicit?)
> group setup to ease killing of all children?  If not, is it a reasonable
> RFE?

Not as far as I know.  It might be a reasonable request in suitable
dialects of Unix-like OSes, though.  A setpgrp call (in the callback
which you can request Popen to perform, after it forks and before it
execs) might suffice... except that you can't rely on children process
not to setpgrp's themselves, can you?!


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: UI toolkits for Python

2005-10-18 Thread Alex Martelli
Paul Rubin  wrote:

> Torsten Bronger <[EMAIL PROTECTED]> writes:
> > Because everybody is capable of running a JS engine, even on
> > computers on which you don't have rights to install something.
> 
> I don't think using JS so heavily without a compelling reason is
> really in the WWW spirit.  Lots of browsers don't have JS.  And lots
> of JS is so annoying that some users like to turn it off even in
> browsers that have it.

I don't have the exact numbers, and I'm pretty certain they'd be
confidential if I did, but I believe the factors you mention (browsers
completely lacking JS, and users turning JS off), *combined*, still
allow JS-rich interfaces to run for well over 95% of visitors to our
sites.  Maybe that's the key difference between the mindset of a
mathematician and that of an engineer -- I consider reaching over 95% of
visitors to be _quite good indeed_, while you appear to disagree because
of "WWW spirit" issues.  Is making a rapidly responsive site (not
requiring roundtrips for every interaction) a "compelling reason"?  It
seems to me that it is -- and why else would one use ANY Javascript,
after all?

My one issue with the JS/AJAX mania is that I really dislike JS as a
language, particularly when you take the mixed mongrel dialect that you
do need to reach all the various browsers and releases needed to make
that 95% goal.  But, alas, there is really no alternative!-(


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: override a property

2005-10-18 Thread Alex Martelli
Robin Becker <[EMAIL PROTECTED]> wrote:

> Bruno Desthuilliers wrote:
> > Robin Becker a écrit :
> > 
> >> Is there a way to override a data property in the instance? Do I need
> >> to create another class with the property changed?
> > 
> > Do you mean attributes or properties ?
> 
> I mean property here. My aim was to create an ObserverProperty class 
> that would allow adding and subtracting of set/get observers. My current
> implementation works fine for properties on the class, but when I need
> to specialize an instance I find it's quite hard.

A property is an 'overriding descriptor', AKA 'data descriptor', meaning
it "captures" assignments ('setattr' kinds of operations), as well as
accesses ('getattr' kinds), when used in a newstyle class.  If for some
reason you need an _instance_ to bypass the override, you'll need to set
that instance's class to one which has no overriding descriptor for that
attribute name.  A better design might be to use, instead of the builtin
type 'property', a different custom descriptor type that is specifically
designed for your purpose -- e.g., one with a method that instances can
call to add or remove themselves from the set of "instances overriding
this ``property''" and a weak-key dictionary (from the weakref module)
mapping such instances to get/set (or get/set/del, if you need to
specialize "attribute deletion" too) tuples of callables.
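One possible shape for such a custom descriptor (the class and method names here are hypothetical illustrations; a full version would also define __set__ and __delete__, so as to remain an overriding descriptor):

```python
import weakref

class InstanceOverridableProperty(object):
    """Property-like descriptor whose getter can be overridden per instance."""
    def __init__(self, fget):
        self.fget = fget
        self.overrides = weakref.WeakKeyDictionary()   # instance -> getter
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self.overrides.get(obj, self.fget)(obj)
    def override_for(self, obj, fget):
        self.overrides[obj] = fget     # weak key: no lifetime leak for obj

class C(object):
    x = InstanceOverridableProperty(lambda self: 'default')

a, b = C(), C()
C.__dict__['x'].override_for(a, lambda self: 'special')
assert a.x == 'special'               # this one instance bypasses the default
assert b.x == 'default'
```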


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question on class member in python

2005-10-18 Thread Alex Martelli
Johnny Lee <[EMAIL PROTECTED]> wrote:
   ...
> Thanks for your help, maybe I should learn how to turn an attibute into
> a property first.

Easy -- in your class's body, just code:

  def getFoo(self): ...
  def setFoo(self, value): ...
  def delFoo(self): ...
  foo = property(getFoo, setFoo, delFoo, 'this is the foo')


Note that if you want subclasses to be able to customize behavior of foo
accesses by simple method overrides, you need to program some "hooks"
(an extra level of indirection, if you will).
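A sketch of that extra level of indirection (the _getFoo/_setFoo names are just illustrative): the property delegates through lambdas, so the lookup of the underlying methods happens on self at each access and subclass overrides are honored:

```python
class Base(object):
    def _getFoo(self):
        return self._foo
    def _setFoo(self, value):
        self._foo = value
    # the lambdas are the "hooks": they re-look-up _getFoo/_setFoo
    # on self each time, so a subclass override takes effect
    foo = property(lambda self: self._getFoo(),
                   lambda self, value: self._setFoo(value))

class Derived(Base):
    def _setFoo(self, value):
        self._foo = value * 2   # customized behavior, no new property needed
```

Had Base used property(_getFoo, _setFoo) directly, Derived's override would be ignored, since the property would hold the original function objects.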

Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question on class member in python

2005-10-18 Thread Alex Martelli
Johnny Lee <[EMAIL PROTECTED]> wrote:

> But I still wonder what's the difference between the A().getMember and
> A().member besides the style

Without parentheses after it, getMember is a method.  The difference
between a method object and an integer object (which is what member
itself is in your example) are many indeed, so your question is very
strange.  You cannot call an integer, you cannot divide methods, etc.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question on class member in python

2005-10-18 Thread Alex Martelli
Johnny Lee <[EMAIL PROTECTED]> wrote:

> Alex Martelli ???

Now that's a peculiar question...


> > Johnny Lee <[EMAIL PROTECTED]> wrote:
> >
> > > But I still wonder what's the difference between the A().getMember and
> > > A().member besides the style
> >
> > Without parentheses after it, getMember is a method.  The difference
> > between a method object and an integer object (which is what member
> > itself is in your example) are many indeed, so your question is very
> > strange.  You cannot call an integer, you cannot divide methods, etc.
> >
> >
> > Alex
> 
> Sorry, I didn't express myself clear to you. I mean:
> b = A().getMember()
> c = A().member
> what's the difference between b and c? If they are the same, what's the
> difference in the two way to get the value besides the style.

If getMember's body is nothing but a 'return self.member', then there is
no difference -- 'assert b is c'.

What is the difference between:

x = 2

and 

y = 2+2-2*2/2

???  Answer: in terms of final results, no difference.  On the other
hand, the second approach does a lot of obviously useless and intricate
computation, so it's a sheer waste of time and effort.

Exactly the same answer applies to your question -- obtaining the
.member attribute "indirectly", by calling a method that returns it,
does some obviously useless and moderately intricate computation, which
in some ways is a waste of (some) time and effort.  That's all!


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: UI toolkits for Python

2005-10-18 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:
   ...
> What surprises me is that marketing types will accept turning away -
> what's the current internet user base? 200 million? - 10 million
> potential customers without a complaint. Or maybe they just don't get
> told that that's what's going on.

In firms where marketing has lots of power, they may indeed well decide
to pursue those "10 millions" by demanding an expenditure of effort
that's totally out of proportion (to the detriment of the other "190
millions", of course, since there IS a finite amount of development
resources to allocate).  Maybe that's part of the explanation for the
outstanding success of some enterprises founded by engineers, led by
engineers, and staffed overwhelmingly with engineers, competing with
other firms where marketing wield power...?


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List of strings to list of floats ?

2005-10-18 Thread Alex Martelli
Madhusudan Singh <[EMAIL PROTECTED]> wrote:
   ...
> >> Say I have two lists of floats. And I wish to generate a list of floats
> >> that is a user defined function of the two lists.
> > 
> > result = [sqrt(x**2 + y**2) for x, y in zip(xs, ys)]
> 
> Works perfectly. Thanks !

If zip works and map doesn't, most likely your problem is that the two
lists have different lengths.  In this case, zip truncates to the
shorter list, while map conceptually extends the shorter list with
copies of None -- causing an error when you try to do arithmetic between
a float towards the end of the longer list, and None...!
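The truncation is easy to see concretely (in Python 2, map over multiple sequences instead pads the shorter one with None, which is what triggers the arithmetic error):

```python
xs = [1.0, 2.0, 3.0]
ys = [4.0, 5.0]            # shorter list
pairs = list(zip(xs, ys))  # zip stops at the shorter length
assert pairs == [(1.0, 4.0), (2.0, 5.0)]
```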


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: UI toolkits for Python

2005-10-18 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] (Alex Martelli) writes:
> > Mike Meyer <[EMAIL PROTECTED]> wrote:
> >> What surprises me is that marketing types will accept turning away -
> >> what's the current internet user base? 200 million? - 10 million
> >> potential customers without a complaint. Or maybe they just don't get
> >> told that that's what's going on.
> > In firms where marketing has lots of power, they may indeed well decide
> > to pursue those "10 millions" by demanding an expenditure of effort
> > that's totally out of proportion
> 
> What makes you think that the expenditure of effort is "totally out of
> proportion"? In my experience, that isn't the case - at least if you
> go into it planning on doing things that way. Retrofitting a site that
> was built without any thought but "make it work in my favorite
> browser in my favorite configuration" can be a radically different
> thing.

Why, of course -- coding a site to just one browser would be foolish
(though there exist sites that follow that strategy, it's still
despicable).  What I'm talking about is sites that are _supposed_ to be
able to support a dozen browsers, in three or four versions each, not to
mention a dozen features each of which the user "might" have chosen to
disable (for a total of 2**12 == 4096 possibilities).  Of course, the
site's poor authors cannot possibly have tested the 4096 * 12 * 3.5
possibilities, whence the "_supposed_ to be".

We ARE talking about moving from supporting 95% to supporting
(*supposedly*!) 100%, after all -- very much into the long, *LONG* tail
of obscure buggy versions of this browser or that, which SOME users
within those last centiles may have forgotten to patch/upgrade, etc.
And THAT is what makes the effort totally out of proportion (differently
from the effort to go from 60% to 95%, which, while far from negligible,
is well within sensible engineering parameters).


> > Maybe that's part of the explanation for the
> > outstanding success of some enterprises founded by engineers, led by
> > engineers, and staffed overwhelmingly with engineers, competing with
> > other firms where marketing wield power...?
> 
> You mean like google? Until recently, they're an outstanding example
> of doing things right, and providing functionality that degrades
> gracefully as the clients capabilities go down.

I'm not sure what you mean by "until recently" in this context.  AFAIK,
we've NEVER wasted our efforts by pouring them into the quixotic task of
supporting *100%* of possible browsers that may hit us, with the near
infinite number of combinations of browsers, versions and disabled
feature that this would require.  One may quibble whether the target
percentage should be, say, 93%, 95%, or 97%, and what level of
degradation can still be considered "graceful" around various axes, but
the 100% goal which you so clearly imply above would, in my personal
opinion, be simply foolish now, just as it would have been 3 years ago.


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sequence and/or pattern matching

2005-10-19 Thread Alex Martelli
Séb <[EMAIL PROTECTED]> wrote:

> Hi everyone,
> 
> I'm relatively new to python and I want to write a piece of code who do
> the following work for data mining purpose :

Essentially, if I understand correctly, you want to detect LOOPS given a
sequence of directed connections A->B.  "loop detection" and "graph"
would then be the keywords to search for, in this case.

> 2) I want to find if there are unknown sequences of connexions in my
> data and if these sequences are repeated along the file :
> 
> For example :
> 
> Computer A connects to Computer B then
> Computer B connects to Computer C then
> Computer C connects to Computer A

Does this "then" imply you're only interested in loops occurring in this
*sequence*, i.e., is order of connections important?  If the sequence of
directed connections was, say, in the different order:

B->C
A->B
C->A

would you want this detected as a loop, or not?
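For the order-insensitive reading of the question, a minimal sketch of loop detection via depth-first search over the directed connections (the edge-list input format and the find_cycle name are assumptions for illustration):

```python
def find_cycle(edges):
    """Return a list of nodes forming a directed cycle, or None."""
    graph = {}
    for a, b in edges:                     # adjacency list from A->B pairs
        graph.setdefault(a, []).append(b)
    visiting, done = set(), set()
    def dfs(node, path):
        visiting.add(node)
        path.append(node)
        for succ in graph.get(node, ()):
            if succ in visiting:           # back edge: cycle found
                return path[path.index(succ):]
            if succ not in done:
                cycle = dfs(succ, path)
                if cycle:
                    return cycle
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None
    for start in list(graph):
        if start not in done:
            cycle = dfs(start, [])
            if cycle:
                return cycle
    return None
```

With this reading, the connections B->C, A->B, C->A are detected as a loop regardless of the order in which they appear in the file.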


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: sort problem

2005-10-21 Thread Alex Martelli
Michele Petrazzo <[EMAIL PROTECTED]> wrote:

> Lasse Vågsæther Karlsen wrote:
> > How about:
> > 
> > list.sort(key=lambda x: x[3])
> > 
> > Does that work?
> 
> Yes, on my linux-test-box it work, but I my developer pc I don't have
> the 2.4 yet. I think that this is a good reason for update :)

Updating is a good idea, and will let you get even faster by avoiding
the lambda:

import operator

thelist.sort(key=operator.itemgetter(3))

However, until you can upgrade you might be happy enough with a direct
implementation of the decorate-sort-undecorate (DSU) idiom which the
new "key=" named argument to sort implements.  To wit:

aux = [ (x[3], x) for x in thelist ]
aux.sort()
thelist[:] = [ x[-1] for x in aux ]

Note that the "decoration" can include as many "columns" as you want,
transformations obtained by calling int(...) or str(...) on some of the
columns, and so on.  This applies to "key=" in 2.4 just as well as to
the (slightly slower) direct implementation in 2.3 and earlier.
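For example, a multi-column decoration that sorts case-insensitively on column 1, then numerically on column 0 (sample data invented for illustration):

```python
thelist = [('10', 'b'), ('2', 'B'), ('2', 'a')]
# decorate with (lowercased col 1, int of col 0), sort, undecorate
aux = [(x[1].lower(), int(x[0]), x) for x in thelist]
aux.sort()
thelist[:] = [x[-1] for x in aux]
assert thelist == [('2', 'a'), ('2', 'B'), ('10', 'b')]
```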


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python vs Ruby

2005-10-21 Thread Alex Martelli
Amol Vaidya <[EMAIL PROTECTED]> wrote:

> Hi. I am interested in learning a new programming language, and have been
> debating whether to learn Ruby or Python. How do these compare and contrast
> with one another, and what advantages does one language provide over the
> other? I would like to consider as many opinions as I can on this matter
> before I start studying either language in depth. Any help/comments are
> greatly appreciated. Thanks in advance for your help. 

Within the panorama of existing languages, Python and Ruby are just
about as similar as two languages independently developed are likely to
get (and then some -- there is clearly some minor mutual influence, e.g.
Ruby probably picked 'def' for function definition partly-because that's
what Python uses, and later Python may have picked 'yield' for
generators partly-because that's what Ruby uses to have a method
interact with a block it gets passed).  They address the same niches and
share most of the same strengths (and minor weaknesses).

Pragmatically, Python is more mature -- with all sort of consequences,
e.g., Python will be faster for many tasks (more time has been spent on
optimizing it), more third-party libraries and tools are available, etc,
but the Python community may be less easy to "get into" than the newer,
smaller Ruby one, for example.

Here's a tiny script showing some similarities and differences:

def f()
  i = 0
  while i < 100
    j = 923567 + i
    i += 1
  end
end

f()

comment out the 'end' statements, and add colons at the end of the def
and while statements, and this is also valid Python.  On my iBook...:

Helen:~ alex$ time ruby -w tim.rb 

real    0m5.367s
user    0m5.129s
sys     0m0.041s

while:

Helen:~ alex$ time python tim.py

real    0m1.078s
user    0m0.953s
sys     0m0.063s

Note that this is NOT the normal way to loop in either language, so do
NOT read too much into the reported times -- a 5:1 ratio is NOT normally
observed on real tasks, though it IS reasonably normal for Python to be
somewhat faster.  BTW, this is Python 2.4.1 and Ruby 1.8.2.

I'm pretty sure that the Ruby community at some point will go through
much the same exercise the Python one did in the several years spent on
the transitions 2.2 -> 2.3 -> 2.4 -- reduce the amount of change in the
language and library and focus most development effort on optimization
instead.  I know of no intrinsic reason why Ruby and Python should not
deliver just about equal performance for similar tasks, though the
current implementations may not be equivalent (yet) from this POV.

Python in recent years moved to enhance its object model (making it in
some ways closer to Ruby) and iteration abilities (ditto), while Ruby
moved to shed more and more of its Perl legacy (though in both cases
legacy issues may have slowed down progress a little bit, yet for both
languages the direction is clear).  This makes the two languages more
similar and to some extent interchangeable.

Ruby has acquired a framework ("Ruby on Rails") that makes it very
desirable to implement database-backed web sites; while Python's running
hot after it (e.g. with Django), for this specific task Rails is better
today (e.g., you can get good _books_ about Rails, but not yet about
Django).  "Ruby Gems" is a good package management system that is fully
operational today; again, Python's catching up, but it's not there yet.

For other things, Python's existing base of third-party extensions and
tools is above Ruby's.  For example, Numeric (and numarray and scipy,
etc) make Python excellent for heavy-duty number crunching, and I do not
know of Ruby equivalents in this area; Twisted is a great framework for
asynchronous network programming, and, again, it's a Python advantage;
and so on, and so forth.  However, there is a downside: for some tasks
(e.g., web-site frameworks) Python has _too many_ good 3rd party
extensions available, making it hard to choose among them!

We can proceed to compare the languages themselves.  Many differences
are minor and cosmetic.  Typically, Python is more specific, giving you
fewer cosmetic choices: e.g., you must use parentheses in defining and
calling functions, while, in Ruby, parentheses are optional (though
often recommended) for these same tasks.  Ruby's approach of having
everything be an object on the same plane, infinitely modifiable until
and unless explicitly frozen, is more regular, while Python's approach
of giving slightly special status to built-in types and making them not
modifiable dynamically is slightly less regular but may be more
pragmatic.  OTOH, Python's approach to "callable objects" (essentially
making them all fully interchangeable) is the more regular of the two.
Ruby's approach to iteration (passing the code block into a method) is
essentially equivalent to Python's (iterators and generators), with
several minor advantages and disadvantages on either side.  One could go
on for quite a long time, but to some extent these are all minor

Re: override a property

2005-10-21 Thread Alex Martelli
Robin Becker <[EMAIL PROTECTED]> wrote:
   ...
> in answer to Bengt & Bruno here is what I'm sort of playing with. Alex
> suggests class change as an answer, but that looks really clunky to me.
> I'm not sure what

Changing class is indeed 'clunky', though it might have been necessary
depending on how one interpreted your original specs.

> Alex means by
> 
> > A better design might be to use, instead of the builtin
> > type 'property', a different custom descriptor type that is specifically
> > designed for your purpose -- e.g., one with a method that instances can
> > call to add or remove themselves from the set of "instances overriding
> > this ``property''" and a weak-key dictionary (from the weakref module)
> > mapping such instances to get/set (or get/set/del, if you need to
> > specialize "attribute deletion" too) tuples of callables.
> 
> I see it's clear how to modify the behaviour of the descriptor instance,
> but is he saying I need to mess with the descriptor magic methods so they
> know what applies to each instance?

If (e.g.) __set__ needs to behave differently when applied to certain
instances rather than others, then it had better be "messed with"
(overridden) compared to property.__set__ since the latter has no such
proviso.  Of course, your architecture as sketched below (taking
advantage of the fact that property.__set__ always calls a certain
callable, and you get to control that callable) is OK too.


> ## my silly example
> class ObserverProperty(property):
>     def __init__(self, name, observers=None, validator=None):
>         self._name = name
>         self._observers = observers or []
>         self._validator = validator or (lambda x: x)
>         self._pName = '_' + name
>         property.__init__(self,
>             fset=lambda inst, value: self.__notify_fset(inst, value),
>             )

Why not just fset=self.__notify_fset ?  I fail to see the added value of
this lambda.  Anyway...:

>     def __notify_fset(self, inst, value):
>         value = self._validator(value)
>         for obs in self._observers:
>             obs(inst, self._pName, value)
>         inst.__dict__[self._pName] = value
> 
>     def add(self, obs):
>         self._observers.append(obs)

...this class only offers sets of observers *per-descriptor instance*,
not ones connected to a specific 'inst' being observed.  My point is,
you could add the latter pretty easily.


> def obs0(inst,pName,value):
>  print 'obs0', inst, pName, value
> 
> def obs1(inst,pName,value):
>  print 'obs1', inst, pName, value
> 
> class A(object):
>  x = ObserverProperty('x')
> 
> a=A()
> A.x.add(obs0)
> 
> a.x = 3
> 
> b = A()
> b.x = 4
> 
> #I wish I could get b to use obs1 instead of obs0
> #without doing the following
> class B(A):
>  x = ObserverProperty('x',observers=[obs1])
> 
> b.__class__ = B
> 
> b.x = 7

You can, if you have a way to call, say, b.x.add_per_inst(b, obs1).
Such as, adding within ObserverProperty:

  self._observers_per_inst = {}

in the init, and changing the notification method to do:

 def __notify_fset(self, inst, value):
     value = self._validator(value)
     observers = self._observers_per_inst.get(inst)
     if not observers: observers = self._observers
     for obs in observers:
         obs(inst, self._pName, value)
     inst.__dict__[self._pName] = value

and a new method add_per_inst:

 def add_per_inst(self, inst, obs):
     self._observers_per_inst.setdefault(inst, []).append(obs)

Of course, you most likely want to use weak rather than normal
references here (probably to both instances and observers), but that's a
separate issue.
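A self-contained sketch of the weak-reference variant (weak keys for the instances only; weakly referencing the observers too is left out for brevity, and the validator is dropped to keep it short):

```python
import weakref

class ObserverProperty(property):
    """Property whose sets notify observers, class-wide or per-instance."""
    def __init__(self, name, observers=None):
        self._observers = observers or []
        # weak keys: registering an instance doesn't keep it alive
        self._per_inst = weakref.WeakKeyDictionary()
        self._pName = '_' + name
        property.__init__(self,
            fget=lambda inst: inst.__dict__[self._pName],
            fset=self._notify_fset)
    def _notify_fset(self, inst, value):
        # per-instance observers, if any, take precedence
        for obs in self._per_inst.get(inst, self._observers):
            obs(inst, self._pName, value)
        inst.__dict__[self._pName] = value
    def add(self, obs):
        self._observers.append(obs)
    def add_per_inst(self, inst, obs):
        self._per_inst.setdefault(inst, []).append(obs)
```

Since accessing the property through the class returns the descriptor itself, A.x.add_per_inst(b, obs1) does what the original example wished for, with no class change.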


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about inheritance...

2005-10-22 Thread Alex Martelli
KraftDiner <[EMAIL PROTECTED]> wrote:

> Well here is a rough sketch of my code...
> This is giving my two problems.
> 
> 1) TypeError: super() argument 1 must be type, not classobj

Make your classes new-style (have Shape inherit from object) to fix
this.  You're using the legacy (old-style) object model (which remains
for backwards compatibility only).

> 2) I want to be sure the the draw code calls the inherited classes
> outline and not its own...

Call anything on self, and you'll use the inherited class.


> class Shape:

change to: class Shape(object):

>   def __init__(self):
>   pass

remove this method, no need for it.

>   def render(self):
>   print self.__class___

Use two trailing underscores, NOT three.

>   self.outline()
>   def outline(self):
>   pass

Use as the body "raise NotImplementedError" to make sure that
Shape.outline never gets accidentally called.

> 
> class Rect(Shape):
>   def __init__(self):
>   super(self.__class__, self).__init__()
>   def render(self):
>   super(self.__class__, self).draw()

You never defined a method named 'draw' -- do you mean 'render'?

>   def outline(self):
>   print 'outline' + self.__class__

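Putting all those fixes together, a sketch of the corrected hierarchy (using return values rather than print, just to keep the sketch version-neutral):

```python
class Shape(object):                  # new-style: inherit from object
    def render(self):
        # calling outline on self dispatches to the subclass's version
        return 'render %s: %s' % (self.__class__.__name__, self.outline())
    def outline(self):
        raise NotImplementedError     # ensure this base stub is never used

class Rect(Shape):
    def outline(self):
        return 'outline ' + self.__class__.__name__
```

Note that Rect needs neither __init__ nor render of its own: the inherited Shape.render already calls the subclass's outline through self.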


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python vs Ruby

2005-10-22 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:

> > Every line = more labour for the developer = more cost and time.
> > Every line = more places for bugs to exist = more cost and time.
> 
> There were studies done in the 70s that showed that programmers
> produced the same number of debugged lines of code a day no matter
> what language they used. So a language that lets you build the same
> program with fewer lines of code will let you build the program in
> less time.

Of course, these results only apply where the "complexity" (e.g., number
of operators, for example) in a single line of code is constant.  There
is no complexity advantage to wrapping up code to take fewer LINES, as
such -- e.g., in Python:

for item in sequence: blaap(item)

or

for item in sequence:
    blaap(item)

are EXACTLY as easy (or hard) to write, maintain, and document -- it's
totally irrelevant that the number of lines of code has "doubled" in the
second (more standard) layout of the code!-)

This effect is even more pronounced in languages which allow or
encourage more extreme variation in "packing" of code over lines; e.g.,
C, where

for(x=0; x<23; x++) { a=seq[x]; zap(a); blup(a); flep(a); }

and

for(x=0;
 x<23;
 x++)
  { 
a=seq[x]; 
zap(a);
blup(a);
flep(a);
  }

are both commonly used styles -- the order of magnitude difference in
lines of code is totally "illusory".


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: C extension modules in Python

2005-10-22 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:

> Hello,
> 
> I'vre written an extension module to accelarate some code i've made in
> python with numarray. Then i compiled an linke d it with swig, my
> problem is that when i make the import in my python code it gives me an
> error: ImportError: libnumarray.so: cannot open shared object file: No
> such file or directory
> 
> does anyone know why this hapens and how can i solve it?

It seems that the libnumarray.so (which your extension is probably
trying to load) is not in a directory where your system will like
loading it from.  It's hard to say more without knowing about your
system, and the way you've set things up for it in terms of loading of
dynamic libraries (which IS a very system-dependent thing).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IDE recommendation please

2005-10-22 Thread Alex Martelli
microsnot <[EMAIL PROTECTED]> wrote:

> I'm new to Python but am wondering what IDE Python developers use? I use Mac
> OS X 10.4.2. I have PythonIDE which comes with MacPython but I don't think
> that has even rudimentary "intellisense". Xcode and Eclipse don't seem to
> support Python out of the box. Suggestions for plugins for Eclipse would
> also be nice.

On the Mac, I think the XCode integration you get with PyObjC is
probably best.  I know there are plugins for Eclipse but haven't tried
any personally, so it's hard to make suggestions (I'm a dinosaur, and I
prefer to develop with GVim + a command-line tool, such as Python's own
interactive mode...).  I'm not sure if BlackAdder (simplest and fastest
to learn) and WingIDE (probably THE one most powerful Python IDE) work
on the Mac (shame on me, as a Python AND Mac enthusiast, for not having
tried them...), but they're surely worth investigating.  Ditto for
ActiveState's Komodo tool...


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python vs Ruby

2005-10-23 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:
   ...
> > Of course, these results only apply where the "complexity" (e.g., number
> > of operators, for example) in a single line of code is constant.
> 
> I'm not sure what you're trying to say here. The tests ranged over
> things from PL/I to assembler. Are you saying that those two languages
> have the same "complexity in a single line"?

Not necessarily, since PL/I, for example, is quite capable of usages at
extremes of operator density per line.  So, it doesn't even have "the
same complexity as itself", if used in widely different layout styles.

If the studies imply otherwise, then I'm reminded of the fact that both
Galileo and Newton published alleged experimental data which can be
shown to be "too good to be true" (fits the theories too well, according
to chi-square tests etc)...


> > for item in sequence: blaap(item)
> >
> > or
> >
> > for item in sequence:
> > blaap(item)
> >
> > are EXACTLY as easy (or hard) to write, maintain, and document -- it's
> > totally irrelevant that the number of lines of code has "doubled" in the
> > second (more standard) layout of the code!-)
> 
> The studies didn't deal with maintenance. They only dealt with
> documentation in so far as code was commented.
> 
> On the other hand, studies of reading comprehension have shown that
> people can read and comprehend faster if the line lengths fall within
> certain ranges. While it's a stretch to assume those studies apply to
> code, I'd personally be hesitant to assume they don't apply without
> some reseach. If they do apply, then your claims about the difficulty
> of maintaining and documenting being independent of the textual line
> lengths are wrong. And since writing code inevitable involves
> debugging it - and the studies specified debugged lines - then the
> line length could affect how hard the code is to write as well.

If time to code depends on textual line lengths, then it cannot solely
depend on number of lines at the same time.  If, as you say, the studies
"prove" that speed of delivering debugged code depends strictly on the
LOCs in the delivered code, then those studies would also be showing
that the textual length of the lines is irrelevant to that speed (since,
depending on coding styles, in most languages one can trade off
textually longer lines for fewer lines).

OTOH, the following "mental experiment" shows that the purported
deterministic connection of coding time to LOC can't really hold:

say that two programmers, Able and Baker, are given exactly the same
task to accomplish in (say) language C, and end up with exactly the same
correct source code for the resulting function;

Baker, being an honest toiling artisan, codes and debugs his code in
"expansive" style, with lots of line breaks (as lots of programming
shops practice), so, given the final code looks like:
while (foo())
  {
    bar();
    baz();
  }
(etc), he's coding 5 lines for each such loop;

Able, being able, codes and debugs extremely crammed code, so the same
final code looks, when Able is working on it, like:
while (foo()) { bar(); baz(); }
so, Able is coding 1 line for each such loop, 5 times less than Baker
(thus, by hypothesis, Able must be done 5 times faster);

when Able's done coding and debugging, he runs a "code beautifier"
utility which runs in negligible time (compared to the time it takes to
code and debug the program) and easily produces the same "expansively"
laid-out code as Baker worked with all the time.

So, Able is 5 times faster than Baker yet delivers identical final code,
based, please note, not on any substantial difference in skill, but
strictly on a trivial trick hinging on a popular and widely used kind of
code-reformatting utility.


Real-life observation suggests that working with extremely crammed code
(to minimize number of lines) and beautifying it at the end is in fact
not a sensible coding strategy and cannot deliver such huge increases in
coding (and debugging) speed.  Thus, either those studies or your
reading of them must be fatally flawed in this respect (most likely,
some "putting hands forward" footnote about coding styles and tools in
use was omitted from the summaries, or neglected in the reading).

Such misunderstandings have seriously damaged the practice of
programming (and managements of programming) in the past.  For example,
shops evaluating coders' productivity in terms of lines of code have
convinced their coders to distort their style to emit more lines of code
in order to be measured as more productive -- it's generally trivial to
do so, of course, in many cases, e.g.
for i in range(100):
    a[i] = i*i
can easily become 100 lines "a[0] = 0" and so on (easily produced by
copy and paste or editor macros, or other similarly trivial means).  At
the other extreme, some coders (particularly in languages suitable for
extreme density, such as Perl) delight in producing "one-liner"
(unreadable) ``very clever'' equivalents.

Re: How to separate directory list and file list?

2005-10-23 Thread Alex Martelli
Gonnasi <[EMAIL PROTECTED]> wrote:

> With
> >glob.glob("*")
> 
> or
> >os.listdir(cwd)
> 
> I can get a combined file list with directory list, but I just wanna a
> bare file list, no directory list. How to get it?

I see everybody's suggesting os.path.* solutions, and they're fine, but
an interesting alternative is os.walk:

__, thedirs, thefiles = os.walk('.').next()

thefiles is the list of filenames (and thedirs is the list of directory
names), each in whatever order os.listdir produced them -- call .sort()
on either if you need them alphabetical.  (I'm assigning to '__' the
path of the starting directory, meaning I intend to ignore it).
An expression that just provides the filename list is

  os.walk('.').next()[2]

although this may be a tad too obscure to recommend it!-)
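For comparison, the os.path.* approach the other answers suggest amounts to a simple filter (wrapped in a helper function here just for illustration):

```python
import os

def list_files(path):
    """Names of the plain files directly inside path, sorted."""
    return sorted(name for name in os.listdir(path)
                  if os.path.isfile(os.path.join(path, name)))
```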


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Syntax across languages

2005-10-23 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:
   ...
> - Information about the current line and file as Ruby:
> __LINE__ __FILE__
> Instead of the python version:
> inspect.stack()[0][2] inspect.stack()[0][1]

__file__ is around in Python, too, but there's no __line__ (directly).

> - identity function: "identity" as in Common Lisp (probably of little
> use in Python).

I've seen enough occurrences of "lambda x: x" in Python code with a
generally functional style that I'd love to have operator.identity (and
a few more trivial functions like that) for readability;-)

> - object cloning: obj.copy()  obj.deepcopy()

Like (say) container.length() versus len(container), I'm perfectly
comfortable relying on functions rather than methods.  It even makes it
easier to allow several alternative ways for an object to provide such
functionality (e.g. by implementing __getstate__ and maybe __setstate__
as opposed to __copy__ and maybe __deepcopy__) -- which would be
feasible even with a method, of course (Template Method DP), but IS
easier when relying on functions (and operators).

> - accessing parent method:
> super as in Ruby, instead as in Python:
> super(Class, self).meth(args)

Ruby's syntax may be better for a single-inheritance language, but
Python's, while less elegant, may be more appropriate in the presence of
multiple inheritance.

> - recursive "flatten" as in Ruby (useful)

Usage too rare to deserve a built-in method, IMHO, considering the ease
of coding the equivalent:

def flatten(x):
    if not isinstance(x, list): yield x
    else:
        for y in x:
            for item in flatten(y): yield item

What I _do_ envy Ruby's syntax, a little, is the convention of ending
methodnames with exclamation mark to indicate "modifies in-place" (and,
secondarily, question mark to indicate predicates).  The distinction
between, e.g.,
y = x.sort()
and
x.sort!()
in Ruby is much clearer, IMHO, than that between, say,
y = sorted(x)
and
x.sort()
in Python...
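In concrete terms, the distinction the naming convention has to carry in Python:

```python
x = [3, 1, 2]
y = sorted(x)       # new sorted list; x is unchanged
assert y == [1, 2, 3] and x == [3, 1, 2]

result = x.sort()   # sorts x in place...
assert x == [1, 2, 3]
assert result is None   # ...and returns None to signal mutation
```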


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Syntax across languages

2005-10-23 Thread Alex Martelli
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> just curious, how can this identity function be used ? In haskell,
> because all functions are curried, I can sort of visualize/understand
> how id is used. Not quite understand how it can be used in python.

There was a very recent example posted to this group (by Robin Becker, I
believe, looking for ways to "override property"), something like:
def __init__(self, validate=None):
    if not validate: validate = lambda x: x
    self.validate = validate
and later on, self.validate is always unconditionally called to extract
a valid value from an input argument to another method -- a nicer style,
arguably, than assigning self.validate unconditionally and then having
to test each and every time in the other method.  Such subcases of the
well-known "Null Object" design pattern, where you don't want to store
None to mean "no such object" but, to avoid repeated testing, rather
want to store an object which "reliably does nothing", are reasonably
common.  In an abstract way, they're somewhat akin to having the 'pass'
statement in the language itself;-).
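(A minimal sketch of that subcase -- the class and method names here are
made up for illustration, not Robin's actual code:)

```python
class Field(object):
    def __init__(self, validate=None):
        # Null Object subcase: store a do-nothing callable, not None,
        # so no "is it None?" test is ever needed later
        if not validate:
            validate = lambda x: x
        self.validate = validate

    def store(self, value):
        # always called unconditionally -- no per-call test
        return self.validate(value)
```

So `Field().store(23)` just passes the value through, while
`Field(validate=abs).store(-5)` applies the given validator.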


Alex


Re: Syntax across languages

2005-10-23 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:

> Alex> I've seen enough occurrences of "lambda x: x" in Python code with
> Alex> a generally functional style that I'd love to have
> Alex> operator.identity (and a few more trivial functions like that) for
> Alex> readability;-)
> 
> But, but, but [Skip gets momentarily apoplectic, then recovers...]
> "operator.identity" is way more to type than "lambda x: x".  Plus you have
> to remember to import the operator module. <0.5 wink>

But, it's way more readable, IMHO.


> Not to mention which (from "pydoc operator"):
> 
> [The operator module] exports a set of functions implemented in C
> corresponding to the intrinsic operators of Python.
> 
> Last time I checked, Python didn't have an intrinsic "identity" operator.

attrgetter and itemgetter don't exactly correspond to "intrinsic
operators", either (getitem, and the built-in getattr, are more like
that)... yet module operator exports them today, offering more readable
(and faster) alternatives to "lambda x: x[y]" and "lambda x: getattr(x,
y)".

 
> For which reason, I'd be -1 on the idea of an identity function, certainly
> in the operator module.

I'm not at all wedded to having 'identity' in the operator module (which
may arguably be the wrong placement for 'attrgetter' and 'itemgetter'
too).  But identity is a useful primitive for a functional style of
Python programming, so having it SOMEwhere would be nice, IMHO.


Alex


Re: calling a dylib in python on OS X

2005-10-23 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:

> Hi
> 
> Is there something similar to python's windll for calling DLLs on win32
> but meant for calling dylib's on OS X?

ctypes should work just fine on Macs, just as well as on Windows and
Linux machines as well as many other Unix dialects.  Try
http://starship.python.net/crew/theller/ctypes/ ...


Alex


Re: Microsoft Hatred FAQ

2005-10-23 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:
...
> David claimed that everyone had a right to do whatever they wanted
> with their property. This is simply false throughout most of the
> civilized world - zoning laws control what kinds of business you can

Incidentally, the perfectly good rationale for this universal existence
of limitations to "doing whatever you want with your property" is known
in economics as *externalities*.  Transactions that appear to involve
just one or two parties, and be entirely voluntary between them, may in
fact produce all sort of beneficial or detrimental effects on further
parties who have not necessarily agreed to that.  For example, I may
"own" a certain lot of land, but if on that lot I place a siren blaring
and a huge flashing red sign, the energy of the sound waves and light
will inevitably also affect other nearby places, which I do _not_ "own"
(either they're commons, or owned by somebody else), imposing an
externality on owners and/or users of those nearby places.

Of course, while some externalities are entirely obvious (it's hard to
argue against such sirens and flashing lights being otherwise), many
others are subtler and more debatable, so one reasonable society might
acknowledge a certain class of externality and try to regulate it while
another might prefer not to do so.  But the general concept of society
as a whole placing limitations on private owners' uses of the property,
based on externalities certain uses might impose on unwilling parties,
is as solid as a rock, both practically and theoretically -- however
much anarchists or extreme libertarians might wish otherwise.


Alex


Re: Syntax across languages

2005-10-23 Thread Alex Martelli
Tom Anderson <[EMAIL PROTECTED]> wrote:
   ...
> What would approximate FP equality even mean? How approximate?

In APL, it meant "to within [a certain quad-global whose name I don't
recall] in terms of relative distance", i.e., if I recall correctly,
"a=b" meant something like "abs(a-b)/(abs(a)+abs(b)) < quadEpsilon" or
thereabouts.  Not too different from Numeric.allclose, except the latter
is "richer" (it takes both absolute and relative "epsilons", and also
implies an "and-reduce" if the objects being compared are arrays).


Alex


Re: how to count and extract images

2005-10-23 Thread Alex Martelli
Joe <[EMAIL PROTECTED]> wrote:

> I'm trying to get the location of the image using
> 
> start = s.find('
> stop = s.find('">Save File', start)
> fileName = s[start:stop]
> 
> and then construct the url with the filename to download the image,
> which works fine since every image has the Save File link and I can
> count the number of images easily.  The problem is when there is more
> than one image: I try using a while loop to download the files; it works
> fine for the first one but always matches the same one.  How can I count
> and tell the loop to skip the first one if it has been downloaded and go
> to the next one, and if the next one is downloaded go to the next one,
> and so on.

Pass the index from where the search must start as the second argument
to the s.find method -- you're already doing that for the second call,
so it should be pretty obvious it will also work for the first one, no?


Alex


Re: Tricky Areas in Python

2005-10-23 Thread Alex Martelli
PyPK <[EMAIL PROTECTED]> wrote:

> What possible tricky areas/questions could be asked in Python based
> Technical Interviews?

I like to present code that seems like it should work, but has some kind
of relatively subtle problem, either of correctness in some corner case,
or of performance, etc -- and I ask them what they would say if they
were to code-review that code, or how they would help a student who came
to them with that code and complaints about it not working, &c.

This tells me whether they have real-world Python experience, and how
deep, or whether they've carefully studied the appropriate areas of
"Python in a Nutshell" and the Cookbook (and I'm biased enough to think
that the second kind of preparation is almost as good as the first
kind...;-).

Not sure whether you think this counts as "tricky"... they're typically
problems that do come up in the real world, from (e.g.):
for string_piece in lots_of_pieces:
bigstring += string_piece
(a typical performance-trap) to
for item in somelist:
if isbad(item):
somelist.remove(item)
(with issues of BOTH correctness and performance), to
class Sic:
def getFoo(self): ...
def setFoo(self): ...
foo = property(getFoo, setFoo)
to
class Base(object):
def getFoo(self): ...
def setFoo(self): ...
foo = property(getFoo, setFoo)

class Derived(Base):
def getFoo(self): ...

and so on, and so forth.  If a candidate makes short work of a couple of
these, and I've been asked to focus my part of the interview solely on
Python coding, I may branch out into more advanced stuff such as asking
for an example use case for a closure, a custom descriptor, or an import
hook, for example -- those are the cases in which I'm trying to decide
if, on a scale of 1 to 5, the candidate's Python competence is about 4
or well over 4 (I would not consider having no idea of why one might
want to code a custom descriptor to be at all "disqualifying" -- it
would just mean I'd rate the candidate 4 out of five, instead of 4.5 or
more, for Python coding competence).


Alex


Re: Tricky Areas in Python

2005-10-23 Thread Alex Martelli
Andrew Durdin <[EMAIL PROTECTED]> wrote:

> On 10/24/05, Alex Martelli <[EMAIL PROTECTED]> wrote:
> > I may branch out into more advanced stuff such as asking
> > for an example use case for a closure, a custom descriptor, or an import
> > hook, for example
> 
> Isn't that approaching things from the wrong angle? You're asking them
> to synthesise a problem for a given solution, rather than analyse a
> problem to determine an appropriate solution. Asking questions like
> these tests memory more than competence -- for example, if you ask me
> of a use case for a closure, the only answer I could give would be to
> remember a problem I'd solved in the past using one.

And why do you think that would be wrong?  If you've used closures, you
know what you've used them for, and (I would hope) why.  If you've never
used them, you're welcome to answer "I have no idea why anybody would
wanna use THAT crazy thing for" (I always give points for honesty;-), or
else try to bluff your way through (sorry, no points for chutzpah!-).

I don't know of any issue that could be solved ONLY by a closure (we
didn't have closures in 1.5.2 yet we made out excellently well
anyhow;-), after all.  The point is, does the candidate really
understand closures (ideally by practical experience)?  Within the
limited confines of a less-than-an-hour interview (which is what we
normally use -- several interviewers, but no more than about 45 minutes
each, with different focus for each interviewer) I believe that asking
for use cases is a perfectly good way to gauge if a candidate fully
understands (ideally by experience) a certain language feature.

It's not just Python, btw.  When I'm asked to focus on C++ skills, I
will similarly ask, e.g., what a use case would be for virtual
inheritance, say.  How ELSE would you gauge, within that very limited
time-span, a candidate's grasp of some advanced language feechur?-)


Alex


Re: High Order Messages in Python

2005-10-23 Thread Alex Martelli
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> could someone enlighten me what is the advantage of block over named
> function ?
> 
> One thing that I can see a difference may be lexical scope ?

"Yes, but" -- according to the latest Ruby book, the "mixed lexical
scope" of blocks is a highly controversial notion in the Ruby community;
so I wouldn't necessarily count it as an _advantage_ of Ruby blocks...


Alex


Re: High Order Messages in Python

2005-10-24 Thread Alex Martelli
Kent Johnson <[EMAIL PROTECTED]> wrote:
   ...
> For example to open a file and read from it uses two closures, one to wrap
> a block with the file open/close, one to iterate lines (from the pickaxe
> book):
> 
> File.open("testfile") do |file|
>   file.each_line { |line| puts line }
> end

Good example -- Ruby blocks are used both for iteration, and for
non-iterative wrapping of a single action with "try-finally" semantics.

Python's generators, up to 2.4, don't really support the non-iterative
part of this well, and have other limitations (they can get values "out"
with yield, but can't easily or naturally get results back "in"...).  In
the forthcoming 2.5, Python generators will be enriched with enough
functionality for these purposes, and a new statement "with" to clearly
indicate the non-iterative case (while "for" indicates iteration).

So, in Python 2.5, the above snippet may become, in Python:

with opening("testfile") as my_file:
for line in my_file:
print line,

The fact that 2.5 will acquire extra functionality to "round out"
generators &c to the power of Ruby blocks may be taken as a clear
indication that, right now (Ruby 1.8.* vs Python 2.4.*), Ruby's blocks
are somewhat more powerful.  As for the differences in style that will
remain when Python 2.5 is born, we'll be back to "personal taste" level
issues, it appears to me.


Alex


Re: Tricky Areas in Python

2005-10-24 Thread Alex Martelli
Steven D'Aprano <[EMAIL PROTECTED]> wrote:
   ...
> my hard-won ignorance, and admit that I don't see the 
> problem with the property examples:
> 
> > class Sic:
> > def getFoo(self): ...
> > def setFoo(self): ...
> > foo = property(getFoo, setFoo)

Sorry for skipping the 2nd argument to setFoo, that was accidental in my
post.  The problem here is: class Sic is "classic" ("legacy",
"old-style") so property won't really work for it (the setter will NOT
trigger when you assign to s.foo and s is an instance of Sic).

> > to
> > class Base(object):
> > def getFoo(self): ...
> > def setFoo(self): ...
> > foo = property(getFoo, setFoo)
> > 
> > class Derived(Base):
> > def getFoo(self): ...
> 
> Unless the answer is "Why are you using setters and 
> getters anyway? This isn't Java you know."

Nope, that's not a problem -- presumably the "..." bodies DO something
useful, and they do get nicely dressed in attribute syntax.  The
problem, as others have indicated, is that overriding doesn't work as
one might expect -- the solution, in Python 2.4 and earlier, is to use
one extra level of indirection:
def __getFoo(self): return self.getFoo()
def getFoo(self): ...
foo = property(__getFoo)
so the name lookup for 'getFoo' on self happens when you access s.foo
(for s being an instance of this here-sketched class) and overriding
works just as expected.  This can be seen as the simplest possible use
case for the "Template Method" Design Pattern, btw;-)
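(Put together, the indirection trick looks like this -- names follow the
sketch above, method bodies are made up for illustration:)

```python
class Base(object):
    def __getFoo(self):
        # 'getFoo' is looked up on self at ACCESS time,
        # so subclass overrides are honored
        return self.getFoo()
    def getFoo(self):
        return 'base'
    foo = property(__getFoo)

class Derived(Base):
    def getFoo(self):
        return 'derived'
```

Now `Derived().foo` calls the override, as one would expect.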


Alex


Re: Python vs Ruby

2005-10-24 Thread Alex Martelli
Michele Simionato <[EMAIL PROTECTED]> wrote:

> Alex Martelli wrote:
> > ... remember Pascal's "Lettres Provinciales",
> > and the famous apology about "I am sorry that this letter is so long,
> > but I did not have the time to write a shorter one"!-)
> 
> This observation applies to code too. I usually spend most of my time
> in making short programs
> that would have been long. This means:

Absolutely true.

> cutting off non-essential features (and you can discover that a feature
> is non essential only after having implemented it)

This one is difficult if you have RELEASED the program with the feature
you now want to remove, sigh.  You end up with lots of "deprecated"s...
somebody at Euro OSCON was saying that this was why they had dropped
Java, many years ago -- each time they upgraded their Java SDK they
found out that half their code used now-deprecated features.

Still, I agree that (once in a while) accepting backwards
incompatibility by removing features IS a good thing (and I look
forward a lot to Python 3.0!-).  But -- the "dream" solution would be
to work closely with customers from the start, XP-style, so features go
into the code in descending order of urgency and importance and it's
hardly ever necessary to remove them.

> and/or
> 
> rethinking the problem to a superior level of abstraction (only
> possible after you have implemented
> the lower level of abstraction).

Yep, this one is truly crucial.

But if I had to nominate ONE use case for "making code smaller" it would
be: "Once, And Only Once" (aka "Don't Repeat Yourself").  Scan your code
ceaselessly and mercilessly looking for duplications and refactor just as
mercilessly when you find them, "abstracting them up" into functions,
base classes, etc...


Alex


Re: Python vs Ruby

2005-10-24 Thread Alex Martelli
Jorge Godoy <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] (Alex Martelli) writes:
> 
> > forward a lot to Python 3.0!-).  But -- the "dream" solution would be
> > to work closely with customers from the start, XP-style, so features go
> > into the code in descending order of urgency and importance and it's
> > hardly ever necessary to remove them.
> 
> We do that often with two of our customers here.  After the first changes,
> they asked for more.  And them some other and when it finally ended, the
> project was like we had suggested, but instead of doing this directly, the
> client wanted to waste more money... :-(  Even if we earnt more money, I'd
> rather have the first proposal accepted instead of wasting time working on
> what they called "essential features". 

The customer is part of the team; if any player in the team is not
performing well, the whole team's performance will suffer -- that's
hardly surprising.  You may want to focus more on _teaching_ the
customer to best play his part in the feature-selection game, in the
future... not easy, but important.


> > But if I had to nominate ONE use case for "making code smaller" it would
> > be: "Once, And Only Once" (aka "Don't Repeat Yourself").  Scan your code
> > ceaselessly and mercilessly looking for duplications and refactor just as
> > mercilessly when you find them, "abstracting them up" into functions,
> > base classes, etc...
> 
> And I'd second that.  Code can be drastically reduced this way and even
> better: it can be made more generic, more useful and robustness is improved.

I'll second all of your observations on this!-)


Alex


Re: Tricky Areas in Python

2005-10-24 Thread Alex Martelli
beza1e1 <[EMAIL PROTECTED]> wrote:

> let me try.
> 
> 1) ''.join(lots_of_pieces)

Yep.

> 2) This doesn't even work, if something is removed, the list is too
> short. So:
> [x for x in somelist if not isbad(x)]
> well, list comprehension is Python 2.4 and 2.3 is the standard in many
> OSes, so it is possibly not the most portable solution

No, LC goes back a long way -- I think it was in 2.0 already, 2.1 for
sure.  If you have to support (say) Python 1.5.2, the simplest (not
fastest) solution is to loop on a COPY of the list, rather than the very
list you're also modifying.
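(That loop-on-a-copy approach, spelled out -- the predicate here is a
stand-in for illustration:)

```python
somelist = [1, -2, 3, -4]
isbad = lambda x: x < 0   # made-up predicate, for illustration only

# iterate over a COPY (somelist[:]) while mutating the original,
# so removals never confuse the iteration
for item in somelist[:]:
    if isbad(item):
        somelist.remove(item)
# somelist is now [1, 3]
```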

> I had to look up the syntax, because i never use it in my code, yet.
> 
> 3+4) I never used property - had to look it up. So i learned something
> :)

Good!-)


Alex


Re: Tricky Areas in Python

2005-10-25 Thread Alex Martelli
Fredrik Lundh <[EMAIL PROTECTED]> wrote:

> Alex Martelli wrote:
> 
> >> > class Sic:
> >> > def getFoo(self): ...
> >> > def setFoo(self): ...
> >> > foo = property(getFoo, setFoo)
> >
> > Sorry for skipping the 2nd argument to setFoo, that was accidental in my
> > post.  The problem here is: class Sic is "classic" ("legacy",
> > "old-style") so property won't really work for it (the setter will NOT
> > trigger when you assign to s.foo and s is an instance of Sic).
> 
> what's slightly confusing is that the getter works, at least until you attempt
> to use the setter:

Oh yes, that IS definitely contributing to the confusion -- which is
part of why I think it makes sense to claim this is a "tricky area".

> (a "setter isn't part of a new-style object hierarchy" exception would have
> been nice, I think...)

Agreed.  Alas, a bit too late now, I fear (until 3.0 when old-style goes
away) -- somebody might be (unwisely but "it-sort-of-works" style)
relying on this behavior.  Hmmm, maybe a WARNING could be given...


Alex


Re: Tricky Areas in Python

2005-10-25 Thread Alex Martelli
Tim Roberts <[EMAIL PROTECTED]> wrote:

> "PyPK" <[EMAIL PROTECTED]> wrote:
> >
> >What possible tricky areas/questions could be asked in Python based
> >Technical Interviews?
> 
> What's the point of asking "tricky" questions?  Aren't you really more
> interested in what applications they have worked on and whether they were
> successful?

Understanding exactly what contribution they did, to the applications
they worked on, may be even more important.  After all, we've all seen
people who worked on app "X" and made a NEGATIVE net contribution to
"X"'s success, compensated by the far better efforts of others on the
team.  But it's unlikely that the candidate will actually say "I worked
on X, but my grasp of the language was so feeble that in fact I made it
_worse_ than it would have been without me in the team";-).

 
> I don't know.  I'm not sure I need a job with a company that insists on
> playing "Programming Jeopardy" during the interview.

I understand how some "quiz-like" questions may rub you the wrong way.
For questions such that knowing the answers is of little or no use on
the job I might even concur -- e.g., "On what exact date was Python 1.0
released" (even if the candidate claims to have been actively part of
that release, he might perfectly well misremember the date!-).

Others are iffier, e.g., "what are all the optional arguments to
built-in function 'open'" -- some people easily memorize those even if
they use them once in a blue moon, others look them up in 2 seconds
every time they need to with a "help(open)" on the interactive
interpreter prompt, so while the knowledge sure doesn't hurt it's not
very important.  That's even truer for other functions and classes with
many more, and more obscure, optional arguments.

If I asked such a question in the course of a programming task, I would
consider "hmmm, there's an optional argument for that, I don't remember
it but in real life I'd look it up in a jiffy" perfectly acceptable (if
the candidate appears to have no idea that there IS an optional argument
for a certain purpose, that's slightly less good, but no disaster).

But -- being presented with a flawed solution and asked to help debug it
(by identifying the "tricky issue" on which it fails) appears to me to
be perfectly acceptable.  It IS the kind of task you face all the time
IRL, and checks how well you know the language and libraries...


Alex


Re: Set an environment variable

2005-10-26 Thread Alex Martelli
Grant Edwards <[EMAIL PROTECTED]> wrote:

> On 2005-10-24, Eric Brunel <[EMAIL PROTECTED]> wrote:
> 
> >> The only think you can export an environment variable to is a
> >> child process
> >
> > Well, you know that, and I know that too. From my experience,
> > many people don't...
> 
> True.  Using Unix for 20+ years probably warps one's perception
> of what's obvious and what isn't.

This specific issue is identical in Windows, isn't it?  I do not know
any OS which does have the concept of "environment variable" yet lets
such variables be ``exported'' to anything but a child process.
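(A sketch of the one direction that DOES work, identical on Unix and
Windows -- sys.executable is used just to make the example
self-contained:)

```python
import os
import subprocess
import sys

os.environ['MY_FLAG'] = '1'   # visible to this process and its children only

# the child process inherits the parent's environment...
child = subprocess.Popen(
    [sys.executable, '-c',
     "import os, sys; sys.stdout.write(os.environ['MY_FLAG'])"],
    stdout=subprocess.PIPE)
output = child.communicate()[0]
# ...but nothing a child sets can ever flow back up to the parent
```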


Alex


Re: namespace dictionaries ok?

2005-10-26 Thread Alex Martelli
Ron Adam <[EMAIL PROTECTED]> wrote:
   ...
>  class namespace(dict):
>  def __getattr__(self, name):
>  return self.__getitem__(name)
   ...
> Any thoughts?  Any better way to do this?

If any of the keys (which become attributes through this trick) is named
'update', 'keys', 'get' (and so on), you're toast; it really looks like
a nasty, hard-to-find bug just waiting to happen.  If you're really
adamant on going this perilous way, you might try overriding
__getattribute__ rather than __getattr__ (the latter is called only when
an attribute is not found "in the normal way").

If you think about it, you're asking for incompatible things: by saying
that a namespace X IS-A dict, you imply that X.update (&c) is a bound
method of X; at the same time, you also want X.update to mean just the
same thing as X['update'].  Something's gotta give...!-)
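(A quick demonstration of the trap -- the keys are made up for
illustration:)

```python
class namespace(dict):
    def __getattr__(self, name):
        return self.__getitem__(name)

ns = namespace(color='red', update='oops')
# ns.color works: normal attribute lookup fails, so __getattr__ kicks in.
# ns.update does NOT return 'oops': dict.update is found by normal
# lookup first, so __getattr__ is never even consulted.
```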


Alex


Re: How to replace all None values with the string "Null" in a dictionary

2005-10-27 Thread Alex Martelli
Bengt Richter <[EMAIL PROTECTED]> wrote:
   ...
> Which is probably more efficient than one-liner updating the dict with
> 
> mydict.update((k,'Null') for k,v in mydict.items() if v is None)

...which in turn is probably better than

_auxd = {None: "Null"}
newd = dict((k, _auxd.get(c, c)) for k, c in mydict.items())

[which might be a nice idea if you wanted to do _several_
substitutions;-)...]
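(For instance, with a couple of substitutions at once -- note this
assumes all the values are hashable, since they're used as lookup keys:)

```python
mydict = {'a': None, 'b': 1, 'c': ''}
subst = {None: 'Null', '': 'Empty'}    # value -> replacement
newd = dict((k, subst.get(v, v)) for k, v in mydict.items())
# newd == {'a': 'Null', 'b': 1, 'c': 'Empty'}
```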


Alex


Re: Double replace or single re.sub?

2005-10-27 Thread Alex Martelli
Iain King <[EMAIL PROTECTED]> wrote:

> I have some code that converts html into xhtml.  For example, convert
> all <i> tags into <em>.  Right now I need to do two string.replace calls
> for every tag:
> 
> html = html.replace('<i>','<em>')
> html = html.replace('</i>','</em>')
> 
> I can change this to a single call to re.sub:
> 
> html = re.sub('<([/]*)i>', r'<\1em>', html)
> 
> Would this be a quicker/better way of doing it?

*MEASURE*!

Helen:~/Desktop alex$ python -m timeit -s'import re; h="<i>aap</i>"' \
> 'h.replace("<i>", "<em>").replace("</i>", "</em>")'
10 loops, best of 3: 4.41 usec per loop

Helen:~/Desktop alex$ python -m timeit -s'import re; h="<i>aap</i>"' \
> 're.sub("<([/]*)i>", r"<\1em>", h)'
1 loops, best of 3: 52.9 usec per loop
Helen:~/Desktop alex$ 

timeit.py is your friend, remember this...!


Alex


Re: Generic utility class for passing data

2005-10-28 Thread Alex Martelli
Gordon Airporte <[EMAIL PROTECTED]> wrote:

> I'm wondering if this is might be bad practice. Sometimes when I need to

I hope not, 'cuz I suggested that years ago on the Cookbook (under the
name of Bunch) with several successive refinements.

> class Dummy:
>   pass
> 
> Then when I need to pass some related data, Python lets me do this:
> 
> prefill = Dummy()
> prefill.foreground = 'blue'  #"foreground" is made up on the fly
> prefill.background = 'red'
> prefill.pattern = mypattern
> return prefill

Sure, but you can do even better:

class Dummy(object):
def __init__(self, **kwds): self.__dict__ = kwds

prefill = Dummy(foreground='blue', background='red', pattern=mypattern)
return prefill


Alex


Re: How do I sort these?

2005-10-28 Thread Alex Martelli
Paul Rubin  wrote:

> "KraftDiner" <[EMAIL PROTECTED]> writes:
> > In C++ you can specify a comparision method, how can I do this with
> > python...
> 
> Yes, see the docs.  Just pass a comparison func to the sort method.

Or, better, pass a key-extraction function, that's much handier and
faster (it automates the "decorate-sort-undecorate", DSU, idiom).
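(For example, sorting records by one field with a key-extraction
function, available in Python 2.4 and later -- sample data made up:)

```python
records = [('wilma', 30), ('fred', 35), ('pebbles', 1)]
# key= extracts the comparison key ONCE per item (the DSU idiom),
# rather than calling a comparison function O(n log n) times
records.sort(key=lambda rec: rec[1])
# records is now ordered by the second field: pebbles, wilma, fred
```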


Alex


Re: Newbie question: string replace

2005-10-28 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:

> "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> writes:
> 
> > So how to overwrite the config file directly in script.py instead of
> > running script.py with two params?
> 
> Don't overwrite the file directly. Save a copy, then rename it. That

See module fileinput in the standard library, it does this for you
automatically when used in the right way -- much better than rolling
your own and having to debug it and maintain it forever!
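(Roughly like this -- the file name and substitution are made up for
illustration, and sys.stdout.write is used so the sketch is
version-neutral:)

```python
import fileinput
import sys

# set up a small config file to edit (illustrative content)
open('script.cfg', 'w').write('value = old_value\n')

# with inplace=1, fileinput renames the original to a backup and
# redirects stdout into the replacement file while the loop runs
for line in fileinput.input('script.cfg', inplace=1):
    sys.stdout.write(line.replace('old_value', 'new_value'))
fileinput.close()   # finalize the rewrite
```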


Alex


Re: Newbie question: string replace

2005-10-28 Thread Alex Martelli
Steven D'Aprano <[EMAIL PROTECTED]> wrote:

> On Fri, 28 Oct 2005 12:27:36 -0700, [EMAIL PROTECTED] wrote:
> 
> > hm...Is there a way to get rid of the newline in "print"?
> 
> Yes, by using another language *wink*

Ending the print statement with a comma also works;-)

 
> Or, instead of using print, use sys.stdout.write().

Yep, generally better than 'print'.


Alex


Re: Scanning a file

2005-10-28 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:
   ...
> Except if you can't read the file into memory because it's to large,
> there's a pretty good chance you won't be able to mmap it either.  To
> deal with huge files, the only option is to read the file in in
> chunks, count the occurences in each chunk, and then do some fiddling
> to deal with the pattern landing on a boundary.

That's the kind of things generators are for...:

def byblocks(f, blocksize, overlap):
    block = f.read(blocksize)
    yield block
    while block:
        chunk = f.read(blocksize - overlap)
        if not chunk:
            break
        block = block[-overlap:] + chunk
        yield block

Now, to look for a substring of length N in an open binary file f:

f = open(whatever, 'rb')
count = 0
for block in byblocks(f, 1024*1024, len(subst)-1):
count += block.count(subst)
f.close()

not much "fiddling" needed, as you can see, and what little "fiddling"
is needed is entirely encompassed by the generator...


Alex


Re: Opaque documentation

2005-10-28 Thread Alex Martelli
Mike Meyer <[EMAIL PROTECTED]> wrote:

> "Ben Sizer" <[EMAIL PROTECTED]> writes:
> > Documentation is often a problem with Python and its libraries, sadly.
> > The same almost certainly goes for most open source projects.
> 
> You over-specified the last clause.  It should say "most software
> projects."

You over-specified the last clause.  It should say "most projects."


Alex


Re: Why doesn't this work? :)

2005-10-28 Thread Alex Martelli
Jeremy Moles <[EMAIL PROTECTED]> wrote:

> Am I misunderstanding something fundamental about the builtin __*
> functions? Can they not be "static?"

They can and must be static when their specification say they are (e.g.,
__new__) and they cannot and must not be static when their specification
says otherwise (just about all of them).


Alex


Re: suggestions between these two books

2005-10-28 Thread Alex Martelli
Micah Elliott <[EMAIL PROTECTED]> wrote:

> On Oct 26, John Salerno wrote:
> > Hi all. I'm fairly new to programming and I thought I'd like to try
> > Python. I'm trying to decide between these two books:
> > 
> > Learning Python (O'Reilly)
> > Beginning Python: From Novice to Professional (APress)
> 
> Consider first reading the tutorial.  If you prefer to read from paper
> there is a PDF version
> .

...but both of the quoted books have added value.

Well, I don't actually KNOW that about the APress one, since my good
friend Magnus Hetland didn't think of sending me a review copy (hint,
hint, Magnus, if you want any more recommendations;-), but its
predecessor "Practical Python" was good indeed.

 
> There is also the "Python in a Nutshell" book which only covers Python
> 2.2 but has a very concise language intro, and will become an
> invaluable reference.  I wish I had started with this book; then I
> wouldn't have needed to buy some of the others.

Why, thanks!  I'm working on a new edition to cover 2.3 and 2.4 (and
perhaps 2.5 by the time I'll be done, as progress is being quite slow --
as uber technical lead at Google, I'm pretty busy these days!-), but I
do agree that the current edition is still quite useful.


Alex


Re: pickling class instances with __slots__

2005-10-28 Thread Alex Martelli
Alex <[EMAIL PROTECTED]> wrote:
   ...
> I have a series of new classes with child-parent relationship and each
> has unique __slots__. They don't have __dict__ . I need to be able to
> pickle and unpickle them. As far as I could understand, I need to
> provide __getstate__  and  __setstate__ methods for each class. Is

Right.

> there a universally accepted code for each method? If so, what is it?
> If there is no standard, what works?

Lots of things work, the simplest is something like:

>>> class wehaveslots(object):
...   __slots__ = 'a', 'b', 'c'
...   def __getstate__(self): return self.a, self.b, self.c
...   def __setstate__(self, tup): self.a, self.b, self.c = tup

(plus presumably other methods, but those don't matter for pickle).
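(Fleshed out into a full pickle round-trip -- field names hypothetical,
any protocol works:)

```python
import pickle

class WithSlots(object):
    __slots__ = 'a', 'b', 'c'
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c
    def __getstate__(self):
        # state as a plain tuple, since there is no __dict__ to pickle
        return self.a, self.b, self.c
    def __setstate__(self, tup):
        self.a, self.b, self.c = tup

w = pickle.loads(pickle.dumps(WithSlots(1, 2, 3)))
# w.a, w.b, w.c == 1, 2, 3
```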


Alex



Re: Expanding Python as a macro language

2005-10-29 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:
   ...
> But the problem is that in Linux you can't even send a keystroke to
> a running GUI application!

Actually, if the app is running under X11 you may try to fake out a
keystroke event (with low level calls, but ctypes might let you use it
from Python).  Of course, the app WILL be told that the keystroke is
fake, through a special flag if it cares to check for it, for security
reasons; but if the app doesn't specifically defend itself in this way,
the faked keystroke will be handled just like a real one.

See, for example, http://xmacro.sourceforge.net/ -- I guess that
xmacroplay could pretty easily be adapted, or maybe even used as is with
an os.popen.


> I want to find a solution in Linux, with the help of experts
> (if they don't use only Windows...)  for two reasons:
> - the reduced availability in Windows of "free" or "open" applications
> - the more severe security problems in Windows.
> Concerning the second point, you can correctly argue that this is,
> at least partly, due to the wider market share of Windows but IMHO
> Linux is more robust in this field, and ...at the present times the
> situation is like that!

Don't neglect MacOSX -- it's quite secure, there are many open and free
applications, AND it has a decent architecture for the kind of tasks you
want to do (mostly intended for Apple's own Applescript language, but
all the interfaces are open and easily available to Python, which is
quite well supported on the Mac).

It also has an excellent Italian newsgroup, it.comp.macintosh -- quite
high volume (as it discusses ANYthing Apple, from iPod shuffles to
golden oldies to rumors about new servers &c, with a lot of volume on
the audio and video applications that macs do so well) but worth it.


However, all the specific use cases you describe are best handled by
either fully emulating or directly integrating with a browser; faking
keystrokes is definitely too low-level an approach.  Python is very good
at dealing with the web (it IS, apparently, the favourite language of
Tim Berners-Lee -- he came to give a keynote at a Python conference),
including recording and replaying cookies and anything else you may need
to make a "special purpose browser" for automation purposes. Twisted is
an asynchronous framework for very well-performing, lightweight clients
and servers -- or, Python's standard library can suffice if you're not
in a terrible hurry;-).

Alternatively, Firefox and other Mozilla Foundation apps are designed to
be automated via XPCOM, essentially a cross-platform equivalent of
Microsoft's good old COM, and there are Python interfaces to it (some
future Firefox version might perhaps integrate a Python engine, just
like it integrates a Javascript engine today, but I wouldn't hold my
breath waiting;-).


Alex


Re: py.log using decorators for DRY

2005-10-29 Thread Alex Martelli
yoda <[EMAIL PROTECTED]> wrote:

> I'm using py.log for logging and I find that I end up having the following
> pattern emerge within my code (influenced by
> http://agiletesting.blogspot.com/2005/06/keyword-based-logging-with-py-lib
> rary.html):
> 
> def foo(**kwargs):
> log.foo(kwargs)
> #body form
> 
> This led me to believe that I could simplify that pattern with the
> following idiom :
> 
> 
> def logit (fn):
> '''
> decorator to enable logging of all tagged methods
> '''
> def decorator (**kwargs):
> # call a method named fn.func_name on log with kwargs
> #should be something like: log.func_name (kwargs)
> 
> return decorator

Assuming the attributes of object 'log' don't change at runtime (i.e.,
you're OK with early binding), I'd code:

def logit(fn):
method = getattr(log, fn.func_name)
def callit(**kwargs): return method(kwargs)
return callit

If you need to do late binding instead, you can move the getattr to
inside the body of callit.
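A self-contained sketch of that early-binding version, in Python 3 syntax
(`fn.func_name` becomes `fn.__name__`; the `FakeLog` class is a stand-in I
invented for the py.log object, and unlike the fragment in the question this
version also invokes the decorated body):

```python
calls = []

class FakeLog:
    # stand-in for the py.log object: any attribute records a call
    def __getattr__(self, name):
        return lambda kwargs: calls.append((name, kwargs))

log = FakeLog()

def logit(fn):
    method = getattr(log, fn.__name__)   # early binding: one lookup, at decoration time
    def callit(**kwargs):
        method(kwargs)                   # log first ...
        return fn(**kwargs)              # ... then run the decorated body
    return callit

@logit
def foo(**kwargs):
    return sorted(kwargs)

print(foo(x=1, y=2))  # → ['x', 'y']
print(calls)          # → [('foo', {'x': 1, 'y': 2})]
```

For late binding instead, move the `getattr` inside `callit`, as Alex notes.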


Alex
 


Re: Scanning a file

2005-10-29 Thread Alex Martelli
Bengt Richter <[EMAIL PROTECTED]> wrote:
   ...
> >>>while block:
> >>>block = block[-overlap:] + f.read(blocksize-overlap)
> >>>if block: yield block
   ...
> I was thinking this was an example a la Alex's previous discussion
> of interviewee code challenges ;-)
> 
> What struck me was
> 
>  >>> gen = byblocks(StringIO.StringIO('no'),1024,len('end?')-1)
>  >>> [gen.next() for i in xrange(10)]
>  ['no', 'no', 'no', 'no', 'no', 'no', 'no', 'no', 'no', 'no']

Heh, OK, I should get back into the habit of adding a "warning: untested
code" when I post code (particularly when it's late and I'm
jetlagged;-).  The code I posted will never exit, since block always
keeps the last overlap bytes; it needs to be changed into something like
(warning -- untested code!-)

if overlap > 0:
    while True:
        next = f.read(blocksize - overlap)
        if not next: break
        block = block[-overlap:] + next
        yield block
else:
    while True:
        next = f.read(blocksize)
        if not next: break
        yield next

(the if/else is needed to handle requests for overlaps <= 0, if desired;
I think it's clearer to split the cases rather than to test inside the
loop's body).
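Here is a runnable version of the corrected generator (Python 3 syntax, with
the initial read -- implied by the surrounding thread -- made explicit, and
`io.BytesIO` standing in for a real file):

```python
import io

def byblocks(f, blocksize, overlap):
    # yield overlapping blocks so a pattern straddling a block
    # boundary (up to `overlap` bytes long minus one) is not missed
    block = f.read(blocksize)
    if not block:
        return
    yield block
    if overlap > 0:
        while True:
            chunk = f.read(blocksize - overlap)
            if not chunk:
                break
            block = block[-overlap:] + chunk
            yield block
    else:
        while True:
            chunk = f.read(blocksize)
            if not chunk:
                break
            yield chunk

blocks = list(byblocks(io.BytesIO(b'abcdefgh'), 4, 2))
print(blocks)  # → [b'abcd', b'cdef', b'efgh']
```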


Alex



Re: Scanning a file

2005-10-29 Thread Alex Martelli
Tim Roberts <[EMAIL PROTECTED]> wrote:
   ...
> >> print file("filename", "rb").read().count("\x00\x00\x01\x00")
> >
> >Funny you should say that, because I can't stand unnecessary one-liners.
> >
> >In any case, you are assuming that Python will automagically close the
> >file when you are done.
> 
> Nonsense.  This behavior is deterministic.  At the end of that line, the
> anonymous file object goes out of scope, the object is deleted, and the file
> is closed.

In today's implementations of Classic Python, yes.  In other equally
valid implementations of the language, such as Jython, IronPython, or,
for all we know, some future implementation of Classic, that may well
not be the case.  Many, quite reasonably, dislike relying on a specific
implementation's peculiarities, and prefer to write code that relies
only on what the _language_ specs guarantee.


Alex


Re: extracting numbers from a file, excluding fixed words

2005-10-29 Thread Alex Martelli
dawenliu <[EMAIL PROTECTED]> wrote:

> Hi, I have a file with this content:
> xxx xx x xxx
> 1
> 0
> 0
> 0
> 1
> 1
> 0
> (many more 1's and 0's to follow)
> y yy yyy yy y yyy
> 
> The x's and y's are FIXED and known words which I will ignore, such as
> "This is the start of the file" and "This is the end of the file".  The
> digits 1 and 0 have UNKNOWN length.  I want to extract the digits and
> store them in a file.  Any suggestions will be appreciated.

[[warning, untested code...]]

infile = open('infile.txt')
oufile = open('oufile.txt', 'w')
for line in infile:
if line.strip().isdigit(): oufile.write(line)
oufile.close()
infile.close()


Alex


Re: Recursive generators and backtracking search

2005-10-29 Thread Alex Martelli
Talin <[EMAIL PROTECTED]> wrote:

> even simpler - for examle, the idea of being able to return the output
> of one generator directly from another instead of having to iterate
> through all of the results and then re-yield them has already been
> discussed in this forum.

I missed those discussions, having been away from the group for awhile.
To me, the simplification of changing, e.g.,

for x in whatever_other_iterable: yield x

into (say)

yield from whatever_other_iterable

is minute and not worth changing the syntax (even though something like
'yield from' would mean no keywords would need to be added).
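(As a historical footnote: exactly this syntax was in fact added later, in
Python 3.3, via PEP 380.  A minimal illustration of the two spellings:)

```python
def inner():
    yield 1
    yield 2

def spelled_out():
    # the pre-3.3 spelling Alex describes
    for x in inner():
        yield x

def delegating():
    # the 3.3+ spelling (PEP 380)
    yield from inner()

print(list(spelled_out()), list(delegating()))  # → [1, 2] [1, 2]
```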


Alex


Re: lambda functions within list comprehensions

2005-10-29 Thread Alex Martelli
Max Rybinsky <[EMAIL PROTECTED]> wrote:
   ...
> >>> funcs = [lambda n: x * y / n for x, y in a]
   ...
> It seems, all functions have x and y set to 9.
> What's wrong with it? Is it a bug?

It's known as *late binding*: names x and y are looked up when the
lambda's body is executing, and at that time they're both set to the
value 9.  You appear to have expected *early binding*, with the names
being somehow looked up at the time the lambda keyword executed, but
that's just not Python semantics (and would interfere with many other
cases where late binding is exactly what one wants).

You've already indicated what's probably the best solution -- a factory
function instead of the lambda.  There are other ways to request early
binding, and since you appear to value compactness over clarity the most
compact way is probably:

funcs = [lambda n, x=x, y=y: x*y/n for x, y in a]

it's not perfect, because the resulting functions can take up to 3
arguments, so that if you called funcs[1](2,3) you'd get an unwanted
result rather than a TypeError exception.  If you're keen on getting the
exception in such cases, you can use a lambda factory in the same role
as the much clearer and more readable factory function you had (which I
keep thinking is the _sensible_ solution)...:

funcs = [ (lambda x,y: lambda n: x*y/n)(x,y) for x,y in a ]
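A runnable sketch of all three variants side by side (Python 3 syntax; I use
floor division `//` so the results print as ints, and the sample data is just
illustrative):

```python
a = [(2, 3), (4, 9)]

# late binding: every lambda sees the final x=4, y=9
late = [lambda n: x * y // n for x, y in a]
# default arguments force early binding of each iteration's x, y
early = [lambda n, x=x, y=y: x * y // n for x, y in a]
# a lambda factory: each inner lambda closes over its own x, y
factory = [(lambda x, y: lambda n: x * y // n)(x, y) for x, y in a]

print([f(6) for f in late])     # → [6, 6]
print([f(6) for f in early])    # → [1, 6]
print([f(6) for f in factory])  # → [1, 6]
```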


Alex


Re: lambda functions within list comprehensions

2005-10-29 Thread Alex Martelli
Max Rybinsky <[EMAIL PROTECTED]> wrote:

> Thank you for explanation, Alex.
> It appears that almost every beginner to Python gets in trouble with
> this ...feature. :)

Almost every beginner to Python gets in trouble by expecting "do what
I'm thinking of RIGHT NOW"-binding, which no language offers: in other
words, such beginners sometimes expect late binding where Python binds
early, and, vice versa, they at other times expect early binding where
Python binds late.  Not ALWAYS, mind you -- what they expect depends on
what would appear to them to be most convenient for their immediate
needs on each separate occasion.  Some other languages try to follow
beginners and offer "do what I mean" semantics -- when using such
languages, one ends up in a battle of wit against the compiler's guesses
about one's intentions.  Python instead offers extremely simple rules,
such as: any name is looked up each and every time it's evaluated (and
at no other times); evaluation of function headers happens completely at
the time the 'def' or 'lambda' evaluates, while evaluation of function
bodies happens completely at the time the function is _called_.  By
learning and applying such simple rules there can be no surprise about
what is evaluated (and, in particular, looked up) when.  E.g., consider
the difference between the following two functions:

def early(x=whatever()):
   ...

def late():
   x=whatever()
   ...

In 'early', the call to whatever() is part of the function's header, and
therefore happens at the time the 'def' statement executes -- and thus
name 'whatever' means whatever it means at THAT time (if at that time
it's not bound to anything, the 'def' statement fails with an
exception).

In 'late', the call to whatever() is part of the function's body, and
therefore happens each time the function is called -- and thus name
'whatever' means whatever it means at THAT time (if at that time it's
not bound to anything, the call fails with an exception).
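A runnable version of that distinction (Python 3 syntax; `whatever` here is a
toy function I made up that just counts its own calls):

```python
log = []

def whatever():
    log.append('called')
    return len(log)

def early(x=whatever()):   # whatever() runs once, when 'def' executes
    return x

def late():
    x = whatever()         # whatever() runs on every call
    return x

print(log)                 # → ['called']   (from the def statement alone)
print(early(), early())    # → 1 1          (the default never changes)
print(late(), late())      # → 2 3          (a fresh call each time)
```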


Alex


Re: Scanning a file

2005-10-29 Thread Alex Martelli
Paul Watson <[EMAIL PROTECTED]> wrote:

> "Alex Martelli" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
> 
> > In today's implementations of Classic Python, yes.  In other equally
> > valid implementations of the language, such as Jython, IronPython, or,
> > for all we know, some future implementation of Classic, that may well
> > not be the case.  Many, quite reasonably, dislike relying on a specific
> > implementation's peculiarities, and prefer to write code that relies
> > only on what the _language_ specs guarantee.
> 
> How could I identify when Python code does not close files and depends on
> the runtime to take care of this?  I want to know that the code will work
> well under other Python implementations and future implementations which may
> not have this provided. 

Then you should use try/finally (to have your code run correctly in all
of today's implementations; Python 2.5 will have a 'with' statement to
offer nicer syntax sugar for that, but it will be a while before all the
implementations get around to adding it).
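Both idioms sketched concretely (modern syntax; the temp file and the trivial
`process` function are just scaffolding for the example):

```python
import os
import tempfile

def process(f):
    return f.read()

fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, 'w') as out:
    out.write('hello')

# try/finally: guaranteed close on every conforming implementation
f = open(path)
try:
    data = process(f)
finally:
    f.close()

# the 'with' statement (added in Python 2.5) gives the same guarantee
with open(path) as f:
    data2 = process(f)

print(data, data2, f.closed)  # → hello hello True
os.remove(path)
```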

If you're trying to test your code to ensure it explicitly closes all
files, you could (from within your tests) rebind built-ins 'file' and
'open' to be a class wrapping the real thing, and adding a flag to
remember if the file is open; at __del__ time it would warn if the file
had not been explicitly closed.  E.g. (untested code):

import __builtin__
import warnings

_f = __builtin__.file
class testing_file(_f):
    def __init__(self, *a, **k):
        _f.__init__(self, *a, **k)
        self._opened = True
    def close(self):
        _f.close(self)
        self._opened = False
    def __del__(self):
        if self._opened:
            warnings.warn(...)
            self.close()

__builtin__.file = __builtin__.open = testing_file



Alex


Re: Scanning a file

2005-10-29 Thread Alex Martelli
Steven D'Aprano <[EMAIL PROTECTED]> wrote:
   ...
> I should also point out that for really serious work, the idiom:
> 
> f = file("parrot")
> handle(f)
> f.close()
> 
> is insufficiently robust for production level code. That was a detail I
> didn't think I needed to drop on the original newbie poster, but depending
> on how paranoid you are, or how many exceptions you want to insulate the
> user from, something like this might be needed:
> 
> try:
> f = file("parrot")
> try:
> handle(f)
> finally:
> try:
> f.close()
> except:
> print "The file could not be closed; see your sys admin."
> except:
> print "The file could not be opened."

The inner try/finally is fine, but both the try/except are total, utter,
unmitigated disasters: they will hide a lot of information about
problems, let the program continue in a totally erroneous state, give
mistaken messages if handle(f) causes any kind of error totally
unrelated to opening the file (or if the user hits control-C during a
lengthy run of handle(f)), emit messages that can erroneously end up in
the redirected stdout of your program... VERY, VERY bad things.

Don't ever catch and ``handle'' exceptions in such ways.  In particular,
each time you're thinking of writing a bare 'except:' clause, think
again, and you'll most likely find a much better approach.
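One such better approach, sketched concretely: catch only the specific,
expected exception, and use try/finally for cleanup so every other error
propagates untouched (the function name and temp-file scaffolding are mine):

```python
import os
import tempfile

def count_zero_bytes(path):
    try:
        f = open(path, 'rb')
    except OSError as e:      # the one failure we expect and can report sensibly
        raise SystemExit('cannot open %s: %s' % (path, e))
    try:
        return f.read().count(b'\x00')
    finally:
        f.close()             # runs even if read() raises; nothing is hidden

fd, path = tempfile.mkstemp()
os.write(fd, b'a\x00b\x00')
os.close(fd)
print(count_zero_bytes(path))  # → 2
os.remove(path)
```

A KeyboardInterrupt or a bug inside the read still surfaces with a full
traceback, instead of being masked by a misleading "could not be opened"
message.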


Alex


Re: Expanding Python as a macro language

2005-10-29 Thread Alex Martelli
<[EMAIL PROTECTED]> wrote:
...
> For the other Alex observations (about Mac OsX and my examples of
> automation centered on web automation) I have a PC, and the fact that
> Python is very good at dealing with the web, doesn't help too much
> in this case...

All of your sensible use cases were about the Web; the fact that you
have a PC is irrelevant to this observation.

> In any case a macro language like AutoIt is a general purpose
> application.

"General purpose" is, I believe, an overbid here -- if what you're doing
is simulating keystrokes, you're limited to the purposes that can in
fact be achieved through keystroke-simulation.  And I've pointed you to
an application that can run under Linux to simulate keystrokes (which,
of course, applications are able to detect as being simulated, if they
want to implement very strong security on this plane).  What, exactly,
is stopping you from using Python to drive that application, if the
simulation of keystrokes is the pinnacle of your heart's desire?

If what you're whining about is that Linux applications can be built to
be secure (rejecting fake keystrokes, among other things), then stick
with Windows and its endless flood of malware.  If you want security,
don't complain about the existence of security.


Alex


Re: Automatic binding of **kwargs to variables

2005-10-29 Thread Alex Martelli
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
   ...
> def foo(**kwargs):
> expected_form1_kwargs = ["arg1", "arg2"]
> 
> for name in expected_form1_kwargs:
> if name not in kwargs:
> kwargs[name]=None
> 
> for name in kwargs:
> if name in kwargs and name not in expected_form1_kwargs:
> raise ValueError, "Unrecognized keyword: " + name
> 
> print kwargs

I find this style of coding repulsive when compared to:

def foo(arg1=None, arg2=None):
print dict(arg1=arg1, arg2=arg2)

I don't understand what added value all of those extra, contorted lines
are supposed to bring to the party.
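For comparison, the two styles side by side in Python 3 syntax (the kwargs
version is lightly condensed with `setdefault` but behaves the same):

```python
def foo_kwargs(**kwargs):
    expected = ['arg1', 'arg2']
    for name in expected:
        kwargs.setdefault(name, None)
    for name in kwargs:
        if name not in expected:
            raise ValueError('Unrecognized keyword: ' + name)
    return kwargs

def foo_plain(arg1=None, arg2=None):
    return dict(arg1=arg1, arg2=arg2)

print(foo_plain(arg1=1))   # → {'arg1': 1, 'arg2': None}
print(foo_kwargs(arg1=1))  # → {'arg1': 1, 'arg2': None}
# foo_plain(bogus=1) raises TypeError automatically -- no manual checks needed
```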


Alex


Re: Scanning a file

2005-10-30 Thread Alex Martelli
Steven D'Aprano <[EMAIL PROTECTED]> wrote:
   ...
> > Don't ever catch and ``handle'' exceptions in such ways.  In particular,
> > each time you're thinking of writing a bare 'except:' clause, think
> > again, and you'll most likely find a much better approach.
> 
> What would you -- or anyone else -- recommend as a better approach?

That depends on your application, and what you're trying to accomplish
at this point.


> Is there a canonical list somewhere that states every possible exception
> from a file open or close?

No.  But if you get a totally unexpected exception, something that shows
the world has gone crazy and most likely any further action you perform
would run the risk of damaging the user's persistent data since the
machine appears to be careening wildly out of control... WHY would you
want to perform any further action?  Crashing and burning (ideally
leaving as detailed a core-dump as feasible for later post-mortem)
appears to be preferable.  (Detailed information for post-mortem
purposes is best dumped in a sys.excepthook handler, since wild
unexpected exceptions may occur anywhere and it's impractical to pepper
your application code with bare except clauses for such purposes).
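A minimal sketch of such a sys.excepthook handler (the log-file location is
illustrative, and a real handler would presumably dump much more state):

```python
import os
import sys
import tempfile
import traceback

LOGFILE = os.path.join(tempfile.gettempdir(), 'crash.log')  # hypothetical location

def dump_and_die(exc_type, exc, tb):
    # append full post-mortem detail, then defer to the default hook
    with open(LOGFILE, 'a') as log:
        traceback.print_exception(exc_type, exc, tb, file=log)
    sys.__excepthook__(exc_type, exc, tb)

sys.excepthook = dump_and_die  # install once, at program startup
```

Because the hook fires for any otherwise-uncaught exception, the application
code itself needs no bare `except:` clauses at all.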

Obviously, if your program is so life-crucial that it cannot be missing
for a long period of time, you will have separately set up a "hot spare"
system, ready to take over at the behest of a separate monitor program
as soon as your program develops problems of such magnitude (a
"heartbeat" system helps with monitoring).  You do need redundant
hardware for that, since the root cause of unexpected problems may well
be in a hardware fault -- the disk has crashed, a memory chip just
melted, the CPU's on strike, locusts...!  Not stuff any program can do
much about in the short term, except by switching to a different
machine.


Alex


