Steven D'Aprano <[EMAIL PROTECTED]> wrote: > > image = Image( ) > > Now you have an "image" object. What is it? > > Answer: it isn't an image at all, not in the plain English sense. (Or if > it is, it is an arbitrary "default image" picked by the class designer.)
No doubt (presumably some kind of as-yet blank one). > > image.read( "myfile.jpg" ) > > And now, at long last, the image object actually is an image. So why make > this a two step process? Whatever the Image() initialization does, why > can't it be done automatically when you read the file? "Two-step construct" ("2SC") is a reasonably well-known and widely useful idiom, and it can serve several kinds of purposes, many but not all of which are tied to persistence. For example, to some extent it's used in Entity Enterprise Javabeans (EJBs): with CMP, instead of having to create a separate object each time you need a new record from the database, the container can juggle a pool of objects and reuse each of them to hold different records at different times, so the number of beans needed is the number of different records you need to have _simultaneously_. The "thread pool" concept can be implemented in a similar way, although this is less common. Another field where 2SC is often used is GUIs; in that case the main motivation is "impedence matching" between a GUI toolkit (often a cross-platform one) and a given platform's underlying toolkit. Python offers good opportunities to implement 2SC "under the covers" thanks to the split between __new__ and __init__: indeed one can get tricky, since __new__ need not necessarily perform the first step of 2SC (it might well return a "scrubbed" object from the pool rather than the new one). Unpickling may normally use __setstate__ instead of __init__ (after __new__, anyway) -- that's more flexible and often easier to arrange than going through getinitargs (which doesn't work for newstyle classes anyway) or getnewargs (which doesn't work for classic ones). Of course, the way OOP is normally taught, 2SC sounds like a heresy, but "out in the real world" it does have some advantages (even though single-step construction remains the normal approach in most cases). > But if the class has no natural default state, then it makes no sense to > create an "empty object" with no data, a "non-image image" so to speak. Hmmm, it might, actually; that's what __new__ normally does for instances of mutable classes, so that __init__ or __setstate__ or other different methods yet, as appropriate, can then make the object "nonempty" in different ways. > In other words, if you find yourself writing methods like this: > > class Klass: > def foo(self): > if self.data is None: > raise KlassError("Can't foo an uninitialized Klass object.") > else: > # do something > > then you are just doing pointless make-work to fit a convention that > doesn't make sense for your class. It does appear to be a code smell, yes. But protocol semantics may often require such constraints as "it does not make sense to call x.A unless x.B has been previously called" or viceversa "it's forbidden to call x.A if x.B has already been called", and such constraints are generally implemented through something like the above idiom. Consider a file object, for example: after you call f.close() any other method call must raise an error. The simplest, most natural way to implement this is very similar to what you just coded, using a self.closed flag -- a case of "two-step destruction", separating termination ("close") from finalization (destruction proper). Reusable objects, that can be born "scrubbed", loaded with some data and used for a while, then "scrubbed" again (with the data getting persisted off and the object surviving and going into a reuse-pool), typically need a flag to know if they're scrubbed or not (or, some other data member may play double duty, e.g. by being None iff an object is scrubbed). All normal methods must then raise if called on a scrubbed object; code for that purpose may be injected by decorators or a custom metaclass to reduce the boilerplate and code smells. Note that I'm not defending the OP's contention -- I've seen no reason in his post making 2SC/2SD desirable. I'm just addressing the wider issue... one I probably wouldn't even cover in "OOP 101", but hold for a later course, e.g. one on the architecture of persistence frameworks. Alex -- http://mail.python.org/mailman/listinfo/python-list