On 2009-10-31 16:53 PM, Peng Yu wrote:
On Sat, Oct 31, 2009 at 4:14 PM, Robert Kern<robert.k...@gmail.com> wrote:
On 2009-10-31 15:31 PM, Peng Yu wrote:
The original problem comes from the maintenance of the package. When A
and B are large classes, it is better to put them in separate files
under the directory 'test' than put them in the file 'test.py'. The
interface 'test.A' is used by end users. However, there will be a
problem if 'import test' is used for developers, because both A and B
are imported, which creates a dependence between A and B. For example,
during the modification of B (not finished), 'import A' would not
work. This means that modifications of A and B are not independent,
which causes a lot of problems when maintaining the package.
To be frank, that development process is going to cause you a lot of
problems well beyond these import entanglements. Developers should have
their own workspace! They shouldn't push things into production until the
system is working. Checking something into source control shouldn't
automatically deploy things into production.
I don't quite agree with your opinion. But please don't take it too personally.
Even in the developer's work space, it is possible to change multiple
classes simultaneously. So the import entanglement problem still
exists.
But it's a problem that should have different consequences than you are
claiming. Having users prevented from using A because developers are modifying
their copy of B in production is a problem that needs to be solved by changing
your development process. If you don't change your development process, you will
run into the same problems without import entanglements.
Now as to import entanglements in the developer's workspace, it is true that
they can cause issues from time to time, but they are much, much smaller in
practice. I can just go in and comment out the offending import temporarily
while I finish working on the other part until I'm ready to address both of them
together. Then when I'm finished and things are working again, I can check my
code into source control. It's just not a big deal.
Naming the file differently from the class is a solution, but it is
a little bit annoying.
I'm wondering how people handle this situation when they have to
separate a module into multiple modules.
Even if we organize things along the lines of "one class per module", we use
different capitalization conventions for modules and classes. In part, this
helps solve your problem, but it mostly saves the developer thought-cycles
from having to figure out which you are referring to when reading the code.
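To make the ambiguity concrete, here is a minimal sketch of what happens when a module and the class it defines share a name and the package re-exports the class. The package name clash_pkg is hypothetical and is built in a temporary directory purely for illustration.

```python
import importlib
import inspect
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "clash_pkg"
pkg.mkdir()
(pkg / "A.py").write_text("class A:\n    pass\n")
# The package re-exports the class under the same name as its module.
(pkg / "__init__.py").write_text("from clash_pkg.A import A\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import clash_pkg

# The attribute on the package is now the class, not the submodule ...
attribute_is_class = inspect.isclass(clash_pkg.A)
# ... while the submodule is still registered under the same dotted name.
submodule_is_module = inspect.ismodule(sys.modules["clash_pkg.A"])

print(attribute_is_class, submodule_is_module)
```

With a lowercase module name (a.py defining class A), clash_pkg.a and clash_pkg.A would stay distinct and a reader would not have to work out which one is meant.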
I know that multiple classes or functions are typically defined in one
file (i.e. module in python). However, I feel this makes the code not
easy to read. Therefore, I insist on one class or function per file
(i.e. module in python).
One function per file is a little extreme. I am sympathetic to "one class per
module", but functions *should* be too short to warrant a module to themselves.
When one class per module is strictly enforced, there will be no need
to have different capitalization conventions for modules and classes.
Developers should be able to tell whether it is a class or a module
from the context.
Given enough brain-time, yes, but you can make your code easier to read by using
different conventions for different things. Developer brain-time is expensive!
As much as possible, it should be spent on solving problems, not comprehension.
In my question, module A and B exist just for the sake of
implementation. Even if I have module A and B, I don't want the user
to feel the existence of module A and B. I want them to feel exactly as if
class A and B are defined in module 'test' instead of feeling that two
modules A and B are in package 'test'. I know that module names should
be in lowercase, in general. However, it is OK to have the module
name capitalized in this case since the end users don't see them.
In C++, what I am asking can be easily implemented, because the
namespace and the directory hierarchy are not bound to each other.
However, in Python the binding between the namespace and the directory
hierarchy makes this difficult to implement. I don't know if it is
impossible, but I'd hope there is a way to do so.
I'm not sure that C++ is a lot better. I still have to know the file hierarchy
in order to #include the right files. Yes, the namespaces get merged when you go
to reference things in the code, but those #includes are intimately tied to the
file hierarchy.
In C++, you can often #include one file that #includes everything else because
linking won't bring in the symbols you don't actually use. Oddly enough, we
don't have that luxury because we are in a dynamic language. Python imports have
runtime consequences because there is no compile or link step. You can't think
of import statements as #include statements; you need to use different patterns.
Of course, to really take advantage of that feature in C++ requires some careful
coding and use of patterns like pimpl. That often negates any readability benefits.
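The remark above that Python imports have runtime consequences can be sketched as follows. The package name eager_pkg is hypothetical, and b.py raises an exception only to stand in for a module that is mid-edit. Because the package's __init__.py imports both modules eagerly, even a request for A alone runs b.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "eager_pkg"
pkg.mkdir()

files = {
    # The package eagerly imports everything it contains.
    "__init__.py": "from eager_pkg.a import A\nfrom eager_pkg.b import B\n",
    "a.py": "class A:\n    pass\n",
    # Stand-in for a half-finished module: its body fails when executed.
    "b.py": "raise RuntimeError('b.py is mid-edit')\n",
}
for name, source in files.items():
    (pkg / name).write_text(source)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Asking only for A still executes __init__.py first, which executes b.py.
try:
    from eager_pkg.a import A
except RuntimeError as exc:
    outcome = f"import failed: {exc}"
else:
    outcome = "import succeeded"

print(outcome)
```

Nothing comparable to a linker drops the unused B here: the import statement executes the module body, so a broken b.py takes a.py down with it.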
You could probably hack something (and people have), but it makes your code
harder to understand because it is non-standard.
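As one illustration of the kind of hack meant here, the sketch below defers each submodule import until the corresponding name is first looked up on the package. It relies on the module-level __getattr__ hook from PEP 562, which requires Python 3.7 or later and so postdates this thread; the package name lazy_pkg and the _EXPORTS table are hypothetical. It is not a recommendation, only a demonstration that the effect is achievable.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "lazy_pkg"
pkg.mkdir()

init_source = textwrap.dedent('''
    import importlib

    _EXPORTS = {"A": "lazy_pkg.a", "B": "lazy_pkg.b"}

    def __getattr__(name):
        # Called only when normal attribute lookup on the package fails.
        if name in _EXPORTS:
            return getattr(importlib.import_module(_EXPORTS[name]), name)
        raise AttributeError(name)
''')
(pkg / "__init__.py").write_text(init_source)
(pkg / "a.py").write_text("class A:\n    pass\n")
# Stand-in for a half-finished module: its body fails when executed.
(pkg / "b.py").write_text("raise RuntimeError('b.py is mid-edit')\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import lazy_pkg

# Looking up A imports only lazy_pkg.a; the broken b.py is never touched.
A = lazy_pkg.A
b_loaded = "lazy_pkg.b" in sys.modules

# Only an explicit request for B runs b.py and surfaces its failure.
try:
    lazy_pkg.B
except RuntimeError:
    b_failed_on_access = True
else:
    b_failed_on_access = False

print(A.__name__, b_loaded, b_failed_on_access)
```

The cost is the one named above: a reader who opens __init__.py no longer sees ordinary import statements and has to understand the hook to know where A and B come from.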
Personally, I like to keep my __init__.py files empty such that I can import
exactly what I need from the package. This allows me to import exactly the
module that I need. In large packages with extension modules that can be
expensive to load, this is useful. We usually augment this with an api.py
that exposes the convenient "public API" of the package, the A and B classes
in your case.
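A minimal sketch of that layout, assuming a hypothetical package demo_pkg with lowercase modules a.py and b.py (the names are illustrative, not taken from any real project): __init__.py stays empty and api.py collects the public classes.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()

files = {
    "__init__.py": "",  # deliberately empty
    "a.py": "class A:\n    pass\n",
    "b.py": "class B:\n    pass\n",
    # The convenience module that exposes the public API in one place.
    "api.py": "from demo_pkg.a import A\nfrom demo_pkg.b import B\n",
}
for name, source in files.items():
    (pkg / name).write_text(source)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A developer imports only the module being worked on; b.py is not loaded.
from demo_pkg.a import A
loaded_b_early = "demo_pkg.b" in sys.modules

# An end user goes through api.py and gets both classes at once.
from demo_pkg import api
loaded_b_late = "demo_pkg.b" in sys.modules

print(loaded_b_early, loaded_b_late, api.A is A)
```

End users write "from demo_pkg.api import A, B", while a developer in the middle of changing b.py can still import demo_pkg.a on its own.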
I looked at the python library, and quite a few __init__.py files
are not empty. In fact, they are quite long. I agree with you that
'__init__.py' should not be long. But I'm wondering why the __init__.py
files in the python library are quite long.
For the most part, it's just not an issue. If you are seeing serious problems,
this may just be exposing deeper issues with your code and your process that
will come to bite you in other contexts sooner or later.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
--
http://mail.python.org/mailman/listinfo/python-list