<[{SYNOPSIS: Many good answers. I am satisfied and we can move on.}]>
Steven,

I appreciate the many useful suggestions. Many of them are what I already do. Some are in tension with other considerations.

Yes, it can be shorter and more efficient to not keep saying module.this.that.something versus something like:

    from module.this.that import something as myname

Of course you do that with care, as you want to avoid pulling too many things into collisions in one namespace. Longer, more descriptive names are encouraged. Based on reading quite a bit of code lately, I do see how common it is to try to shorten names while not polluting the namespace, as in the nearly universal:

    import numpy as np, pandas as pd

The places I like to wrap lines tend to be, in reality, the places python tolerates it. If you use a function that lets you set many options, it is nice to see the options one per line. Since the entire argument list is in parentheses, that works. Ditto for creating lists, sets and dictionaries with MANY items at once.

There are cases where it may make sense to have a long line connected by "and" or "or", given how python does short-circuiting while returning the last operand it evaluated instead of an actual True/False. For example, you may want to take the first available queue that is not empty with something like this:

    Using = A or B or C or ... or Z
    Handling = Using.pop()

Sure, that could be rewritten into multiple lines.

I won't get sucked into a PERL discussion except to say that some people love to write something so obscure they won't recognize it even a day later. PERL makes that very easy. I have done that myself a few times as I was an early user. Python may claim to be straightforward but I can easily see ways to fool people in python too with dunder methods or function closures or decorators or ...

All in all, I think my question has been answered. I will add one more concept.
I recently wrote some code and ran into error messages on lines I was trying to keep short:

    A = "text"
    A += "more text"
    A += object
    A += ...

At one point, I decided to use a formatted string instead:

    A = f"...{...}...{...}..."

Between curly braces I could insert variables holding various strings. As long as those names were not long, and with some overhead, the line of code was of reasonable size even if it expanded to much more.

-----Original Message-----
From: Tutor <tutor-bounces+avigross=verizon....@python.org> On Behalf Of Steven D'Aprano
Sent: Thursday, December 13, 2018 7:27 PM
To: tutor@python.org
Subject: Re: [Tutor] Long Lines techniques

On Thu, Dec 13, 2018 at 12:36:27PM -0500, Avi Gross wrote:

> Simple question:
>
> When lines get long, what points does splitting them make sense and
> what methods are preferred?

Good question!

First, some background:

Long lines are a potential code smell: a possible sign of excessively terse code. A long line may be a sign that you're doing too much in one line.

https://martinfowler.com/bliki/CodeSmell.html
http://wiki.c2.com/?CodeSmell
https://blog.codinghorror.com/code-smells/

Related:

https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/

Note that merely splitting a logical line over two or more physical lines may still be a code-smell. Sure, your eyes don't get as tired reading fifteen lines of 50 characters each, compared to a single 750 character line, but there's just as much processing going on in what is essentially a single operation.

Long lines are harder to read: your eyes have to scan across a long line, and beyond 60 or 70 characters, it becomes physically more difficult to scan across the line, and the error rate increases. [Citation required.]

But short lines don't include enough information, so the traditional compromise is 80 characters, the character width of the old-school green-screen terminals. The Python standard library uses 79 characters.
(The odd number is to allow for scripts which count the newline at the end of the line as one of the 80.)

https://www.python.org/dev/peps/pep-0008/

Okay, so we have a style-guide that sets a maximum line length, whether it is 72 or 79 or 90 or 100 characters. What do you do when a line exceeds that length?

The only firm rule is that you must treat each case on its own merits. There is no one right or wrong answer. Every long line of code is different, and the solution will depend on the line itself. There is no getting away from human judgement.

(1) Long names. Do you really need to call the variable "number_of_characters" when "numchars" or even "n" would do?

The same applies to long function names: "get_data_from_database" is probably redundant, "get_data" will probably do.

Especially watch out for long dotted names that you use over and over again. Unlike static languages like Java, each dot represents a runtime lookup. Long names like:

    package.subpackage.module.object.method

require four lookups. Look for opportunities to make an alias for a long name and avoid long chains of dots:

    for item in sequence:
        do_something_with(package.subpackage.module.object.method(arg, item))

can be refactored to:

    method = package.subpackage.module.object.method
    for item in sequence:
        do_something_with(method(arg, item))

and is both easier to read and more efficient. A double win!

(2) Temporary constants: sometimes it is good enough to just introduce a simple named constant used once. The cognitive load is low if it is defined immediately before it is used. Instead of the long line:

    raise ValueError("expected a list, string, dict or None, but instead got '%s'" % type(value).__name__)

I write:

    errmsg = "expected a list, string, dict or None, but instead got '%s'"
    raise ValueError(errmsg % type(value).__name__)

(3) Code refactoring. Maybe that long line is a sign that you need to add a method or function? Especially if you are using that line, or similar, in multiple places.
But refactoring is justified even if you use the line *once* if it is complicated enough.

Likewise, sometimes it is helpful to factor out separate sub-expressions onto their own lines, using their own variables, rather than doing everything in a single, complicated, expression.

Psychologists, educators and linguists call this "chunking", and it is often very helpful for simplifying complicated ideas, sentences and expressions. The lack of chunks is why long Perl one-liners are so impenetrable.

(4) Split the long logical line over multiple physical lines. This does nothing to reduce the inherent complexity of the line, but if that's fairly low to start with, it is often helpful.

Python gives us two ways to split a logical line over multiple physical lines: a backslash at the end of the line, and brackets of any sort.

The preferred way is to use round brackets for grouping:

    result = (some very long expression
              which can be split over
              many lines)

This is especially useful with function calls:

    result = function(first_argument,
                      second_argument,
                      third_argument,
                      fourth_argument)

If you are building a list or dict literal, there is no need for the parentheses, as square and curly brackets have the same effect. That's especially useful with two-dimensional nested lists:

    data = [[row, one, with, many, items],
            [row, two, with, many, items],
            [row, three, with, many, items]]

For long strings, I like to use *implicit string concatenation*. String literals which are separated by nothing except whitespace are concatenated at compile-time. So I can write a long string like this:

    long_string = ("this is a very long string which doesn't"
                   " fit on a single line but isn't appropriate"
                   " for a triple-quoted string")

Notice that I split the string at word breaks, and move the space to the beginning of the physical line rather than the end. I find that I'm less likely to forget the space if I put it at the start of the line rather than the end.
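As a minimal sketch of the implicit concatenation described above (the variable names one_line and missing_space are only illustrative and are not from the original post), adjacent string literals inside parentheses become a single string, and a forgotten separator space silently glues two words together:

```python
# Illustrative sketch only; one_line and missing_space are hypothetical names.

# Adjacent string literals inside the parentheses are joined into one string.
long_string = ("this is a very long string which doesn't"
               " fit on a single line but isn't appropriate"
               " for a triple-quoted string")

# The same text written as a single literal, for comparison.
one_line = "this is a very long string which doesn't fit on a single line but isn't appropriate for a triple-quoted string"

# Forgetting the space at a break merges two words without any warning.
missing_space = ("first part"
                 "second part")

print(long_string == one_line)
print(missing_space)
```

Putting the space at the start of each continuation line, as suggested, makes a missing one easier to spot because the literals line up in a column.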
Not preferred, but allowed for backwards compatibility and still very occasionally useful, is to end the line with a bare backslash. I find it helpful in conjunction with triple quoted strings:

    text = """\
    body of the string is aligned
    including the first line
    """

but otherwise the backslash is problematic and error-prone. It must be *immediately* followed by a newline; if you accidentally add a space after the backslash it won't work.

And finally:

(5) It's just a style guide, not a law of physics.

As Douglas Bader once said, "Rules are for the guidance of the wise and the obedience of fools."

See also Raymond Hettinger's talk "Beyond PEP 8":

https://twitter.com/raymondh/status/589849947408703488
https://medium.com/@drb/pep-8-beautiful-code-and-the-tyranny-of-guidelines-f96499f5ac17

Better to go two or three characters beyond the maximum length than to make the code ugly.

[...]

> There are places you can break lines as in a comprehension such as this set
> comprehension:
>
> letter_set = { letter
>                for word in (left_list + right_list)
>                for letter in word }
>
> The above is an example where I know I can break because the {} is holding
> it together. I know I can break at each "for" or "if" but can I break at
> random places?

Not quite random: you can't break in the middle of a word, but you can break between words.

[...]

> I will stop here with saying that unlike many languages, parentheses must be
> used with care in python as they may create a tuple or even generator
> expression.

But not by accident. You can't create a generator expression by accident by wrapping an arbitrary expression in round brackets, or turn an expression into a tuple.

Remember, it isn't the parentheses which make tuples, it's the commas. Except for the empty tuple special case, (), the parens are ALWAYS just there to either group the tuple so as to avoid ambiguity, or to visually emphasize that it is a tuple even if the interpreter doesn't need the hint.
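The point about commas can be checked directly. The following is a small sketch, with names invented purely for illustration, showing that parentheses alone only group, that the comma is what builds a tuple, and that a generator expression needs an explicit "for" clause:

```python
# Illustrative sketch only; all names here are hypothetical.

grouped = (1 + 2)      # parentheses only group: this is the int 3
single = 1,            # the trailing comma makes a one-item tuple
also_single = (1,)     # the same tuple, with parentheses for emphasis
pair = 1, 2            # commas make the tuple; no parentheses needed
empty = ()             # the one special case: the empty tuple

# A generator expression needs the "for" clause, not just round brackets.
squares = (n * n for n in range(4))
values = list(squares)

print(type(grouped).__name__)
print(single == also_single)
print(pair)
print(empty)
print(values)
```

So wrapping a long expression in round brackets to split it over several lines does not change its type unless a comma or a "for" clause is also added.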
-- 
Steve
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor