Albert-Jan Roskam wrote:

> From: Tutor <tutor-bounces+sjeik_appie=hotmail....@python.org> on behalf
> of Peter Otten <__pete...@web.de>
> Sent: Friday, August 24, 2018 3:55 PM
> To: tutor@python.org
>
> <snip>
>> The following reshuffle of your code seems to work:
>>
>> print('\r\n** Table of contents\r\n')
>> pattern = '/Title \((.+?)\).+?/Page ([0-9]+)(?:\s+/Count ([0-9]+))?'
>>
>> def process(triples, limit=None, indent=0):
>>     for index, (title, page, count) in enumerate(triples, 1):
>>         title = indent * 4 * ' ' + title
>>         print(title.ljust(79, ".") + page.zfill(2))
>>         if count:
>>             process(triples, limit=int(count), indent=indent+1)
>>         if limit is not None and limit == index:
>>             break
>>
>> process(iter(re.findall(pattern, toc, re.DOTALL)))
>
> Hi Peter, Cameron,
>
> Thanks for your replies! The code above indeed works as intended, but: I
> don't really understand *why*. I would assign a name to the following line
> "if limit is not None and limit == index", what would be the most
> descriptive name? I often use "is_*" names for boolean variables. Would
> "is_deepest_nesting_level" be a good name?

No, it's not necessarily the deepest level. Every subsection eventually ends
at this point, so you might call it

reached_end_of_current_section

Or just 'limit' ;)

The None is only there for the outermost level where no /Count is provided.
In this case the loop simply runs until the iterator is exhausted.

If you find it easier to understand, you can calculate the outer count aka
limit as the number of matches minus the sum of the counts:

def process(triples, section_length, indent=0):
    for index, (title, page, count) in enumerate(triples, 1):
        title = indent * 4 * ' ' + title
        print(title.ljust(79, ".") + page.zfill(2))
        if count:
            process(triples, section_length=int(count), indent=indent+1)
        if section_length == index:
            break

triples = re.findall(pattern, toc, re.DOTALL)
toplevel_section_length = (
    len(triples) - sum(int(c or 0) for t, p, c in triples)
)
process(iter(triples), toplevel_section_length)

Just for fun, here's one last variant that does away with the break -- and
thus the naming issue -- completely:

def process(triples, limit=None, indent=0):
    for title, page, count in itertools.islice(triples, limit):
        title = indent * 4 * ' ' + title
        print(title.ljust(79, ".") + page.zfill(2))
        if count:
            process(triples, limit=int(count), indent=indent+1)

Note that islice(items, None) does the right thing:

>>> from itertools import islice
>>> list(islice("abc", None))
['a', 'b', 'c']

> Also, I don't understand why iter() is required here, and why finditer()
> is not an alternative.

finditer() would actually work -- I didn't use it because I wanted to make
as few changes as possible to your code. What does not work is a list like
the result of findall(). This is because the inner for loops (i.e. the ones
in the nested calls of process) are supposed to continue the iteration
instead of restarting it. A simple example to illustrate the difference:

>>> s = "abcdefg"
>>> for k in range(3):
...     print("===", k, "===")
...     for i, v in enumerate(s):
...         print(v)
...         if i == 2: break
...
=== 0 ===
a
b
c
=== 1 ===
a
b
c
=== 2 ===
a
b
c
>>> s = iter("abcdefg")
>>> for k in range(3):
...     print("===", k, "===")
...     for i, v in enumerate(s):
...         print(v)
...         if i == 2: break
...
=== 0 ===
a
b
c
=== 1 ===
d
e
f
=== 2 ===
g
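
To try the shared-iterator idea without the original PDF data, here is a
small self-contained sketch of the islice variant. Several things in it are
assumptions made for illustration only: the toc string is made-up sample
data in the same /Title, /Page, /Count style, the extra "lines" parameter
collects the output in a list instead of printing it so the result is easy
to inspect, and the width of 40 replaces the 79 used above.

```python
import itertools
import re

# Made-up sample data for illustration; not taken from the original thread.
toc = r"""
[/Title (Introduction) /Page 1 /Count 2 /OUT pdfmark
[/Title (Background) /Page 2 /OUT pdfmark
[/Title (Scope) /Page 3 /OUT pdfmark
[/Title (Methods) /Page 5 /Count 1 /OUT pdfmark
[/Title (Sampling) /Page 6 /OUT pdfmark
[/Title (Results) /Page 9 /OUT pdfmark
"""

# Same pattern as above, written as a raw string.
pattern = r'/Title \((.+?)\).+?/Page ([0-9]+)(?:\s+/Count ([0-9]+))?'


def process(triples, limit=None, indent=0, lines=None):
    """Collect one formatted line per entry, recursing into subsections.

    `lines` is a hypothetical helper parameter added for this sketch.
    """
    if lines is None:
        lines = []
    for title, page, count in itertools.islice(triples, limit):
        lines.append((indent * 4 * ' ' + title).ljust(40, '.') + page.zfill(2))
        if count:
            # The nested call takes its entries from the SAME iterator,
            # so the outer loop resumes after the subsection.
            process(triples, limit=int(count), indent=indent + 1, lines=lines)
    return lines


triples = re.findall(pattern, toc, re.DOTALL)

# Number of top-level entries: all matches minus the sum of the counts.
toplevel_section_length = len(triples) - sum(int(c or 0) for _, _, c in triples)

# findall() returns a list, so wrap it in iter() to get a single shared iterator.
result = process(iter(triples))

# finditer() yields match objects; groups(default='') turns each one into the
# same kind of triple, and the generator expression is already an iterator.
via_finditer = process(
    m.groups(default='') for m in re.finditer(pattern, toc, re.DOTALL)
)

print('\n'.join(result))
```

Background and Scope come out indented under Introduction, and Sampling
under Methods. With this sample data, passing the plain list (process(triples)
without iter()) makes every nested call start again at Introduction, so the
recursion never ends and Python raises RecursionError.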