>> One example is performing a series of transformations on a collection of
>> data, with the intent of finding an element of that collection that
>> satisfies a particular criterion. If you separate out the individual
>> transformations, you need to understand generators or you will waste
>> space and perform many unnecessary calculations. If you only ever do a
>> single transformation with a clear conceptual meaning, you could create
>> a "master transformation function," but what if you have a large number
>> of potential permutations of that function?
>
> I'm sorry, that is far too abstract for me. Do you have a *concrete*
> example, even a trivial one?

How about a hypothetical log analyzer that parses a log file aggregated
from multiple event sources with disparate record structures? You will
need to perform a series of transformations on the data to convert record
elements from text to specific formats, and your function for identifying
"alarm" records also depends on record structure (and possibly on system
state, if you imagine an intrusion detection system). Of course, you could
go to a lot of trouble and spread the dispatch and alarm detection over
6-7 statements. However, take the description "for each log record you
receive, convert text elements to native data types based on the value of
the first three fields of the record, then trigger an alert if that record
meets defined requirements". Assuming you have already constructed maps
from record values to conversion functions for record elements, and a map
from record types to alert criteria functions, that seems like a one-liner
to me.

>> What if you are composing
>> three or four functions, each of which is conditional on the data? If
>> you extract things from a statement and assign them somewhat arbitrary
>> names, you've just traded horizontal bloat for vertical bloat (with a
>> net increase in volume), while forcing a reader to scan back and forth
>> to different statements to understand what is happening.
>
> First off, vertical bloat is easier to cope with than horizontal bloat,
> at least for people used to reading left-to-right rather than vertically.
> There are few anti-patterns worse than horizontal scrolling, especially
> for text.

I agree that if a line goes into the horizontal scroll buffer, you have a
problem. Of course, I often rail against the parenthesized
function-taking-arguments expression structure because it forces you to
read inside out and right to left, and I'd prefer not to conflate the two
issues here.
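
To make the log analyzer concrete, here is a minimal sketch of what I have
in mind. Everything in it is made up for illustration: the pipe-delimited
record layout, the CONVERTERS and ALERT_CRITERIA maps, and the parse and
alarms helpers are assumptions, not code from a real analyzer. It leans on
generators so that records are only converted until the first alarm turns
up.

```python
from datetime import datetime

# Hypothetical map: first three fields of a record -> converters for the
# remaining fields of that record type.
CONVERTERS = {
    ("fw", "1", "conn"): (datetime.fromisoformat, str, int),
    ("auth", "2", "login"): (datetime.fromisoformat, str, str),
}

# Hypothetical map: record type -> predicate that decides whether to alert.
ALERT_CRITERIA = {
    ("fw", "1", "conn"): lambda rec: rec[2] == 22,        # connection to port 22
    ("auth", "2", "login"): lambda rec: rec[2] == "FAIL",  # failed login
}


def parse(lines):
    """Lazily yield (record_type, converted_fields) for each log line."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        key = tuple(fields[:3])
        yield key, tuple(f(x) for f, x in zip(CONVERTERS[key], fields[3:]))


def alarms(lines):
    """Lazily yield the converted records that meet their alert criterion."""
    return (rec for key, rec in parse(lines) if ALERT_CRITERIA[key](rec))


log = [
    "fw|1|conn|2011-07-01T10:00:00|10.0.0.5|80",
    "auth|2|login|2011-07-01T10:00:05|alice|FAIL",
    "fw|1|conn|2011-07-01T10:00:09|10.0.0.9|22",
]

# Only the first two lines are parsed; the third is never touched.
first_alarm = next(alarms(log), None)
print(first_alarm)
```

With the maps built up front, the alarm detection itself is the single
generator expression in alarms, and it reads left to right: for each
parsed record, keep it if its type's criterion says so.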

My assertion is that, given an expression structure that reads naturally
regardless, horizontal bloat is better than larger vertical bloat, in
particular when the vertical bloat does not fall along clean semantic
boundaries.

> Secondly, the human brain can only deal with a limited number of tokens
> at any one time. It's easier to remember large numbers when they are
> broken up into chunks:
>
> 824-791-259-401 versus 824791259401
>
> (three tokens, versus twelve)
>
> Likewise for reading code. Chunking code into multiple lines instead of
> one long expression, and temporary variables, make things easier to
> understand, not harder.

This is true when the tokens are an abstraction. I read some of the
research on chunking; basically, it came down to people being able to
remember multiple numbers efficiently in an auditory fashion using
phonemes. Words versus random letter combinations have the same effect,
only with visual images (which is why I think Paul Graham is full of shit
with regard to his "shorter is better than descriptive" mantra in old
essays). This doesn't really apply if storing the elements in batches
doesn't provide a more efficient representation. Of course, if you can get
your statements to read like sensible English sentences, there is
definitely a reduction in cognitive load.

> And thirdly, you're not "forcing" the reader to scan back and forth -- or
> at least if you are, then you've chosen your functions badly. Functions
> should have descriptive names and should perform a meaningful task, not
> just an arbitrary collection of code.

This is why I quoted Einstein. I support breaking compound logical
statements down to simple statements, then combining those simple
statements. The problem arises when your compound statement still looks
like "A B C D E F G H I J K L M N", and portions of that compound
statement don't have a lot of meaning outside the larger statement.
You could say X = A B C D E, Y = F G H I J, Z = K L M N, then say X Y Z,
but now you've created bloat and forced the reader to backtrack.

>
> When you read:
>
>     x = range(3, len(sequence), 5)
>
> you're not forced to scan back and forth between that line and the code
> for range and len to understand it, because range and len are good
> abstractions that make sensible functions.
>
> There is a lot of badly written code in existence. Don't blame poor
> execution of design principles on the design principle itself.

I like to be fair and even-handed, and I recognize that tool and language
creators don't control users. At the same time, it is a fundamental truth
that people are much more inclined to laziness and ignorance than to their
complements. Any exceptional design will recognize this and make doing the
right thing the intuitive, expedient choice. From this perspective, I feel
morally obligated to lay some blame at the feet of the language or tool
creator when a person misuses their creation in a way easily predicted
given his or her nature.

> [...]
>> Also, because of Sphinx, it is very common in the Python community to weave
>> documents and code together in a way that is convenient for authors but
>> irritating for readers.
>
> I don't know about "very common". I suspect, given the general paucity of
> documentation in the average software package, it is more like "very
> rare, but memorable when it does happen".

Well, as a textbook example, the Pocoo projects tend to do this.
FormAlchemy is another offender. This is just off the top of my head.

>> I personally would prefer not to have to scroll
>> past 100 lines of a tutorial with examples, tests and what not in order
>> to go from one function to another.
>
> Agreed. Docstrings should use a minimal number of examples and tests.
> Tutorials and extensive tests should be extracted into external documents.

I don't even mind having tutorials and doctests in the same file; I just
don't like them to be interleaved.
I really like module-level docstrings, and class docstrings are sometimes
useful; it is function docstrings that I usually find irritating.

--
http://mail.python.org/mailman/listinfo/python-list