From: Tutor <tutor-bounces+sjeik_appie=hotmail....@python.org> on behalf of Peter Otten <__pete...@web.de>
Sent: Monday, August 27, 2018 6:43 PM
To: tutor@python.org
Subject: Re: [Tutor] need help generating table of contents

Albert-Jan Roskam wrote:
>
> From: Tutor <tutor-bounces+sjeik_appie=hotmail....@python.org> on behalf
> of Peter Otten <__pete...@web.de>
> Sent: Friday, August 24, 2018 3:55 PM
> To: tutor@python.org
> <snip>
>> The following reshuffle of your code seems to work:
>>
>> print('\r\n** Table of contents\r\n')
>> pattern = '/Title \((.+?)\).+?/Page ([0-9]+)(?:\s+/Count ([0-9]+))?'
>>
>> def process(triples, limit=None, indent=0):
>>     for index, (title, page, count) in enumerate(triples, 1):
>>         title = indent * 4 * ' ' + title
>>         print(title.ljust(79, ".") + page.zfill(2))
>>         if count:
>>             process(triples, limit=int(count), indent=indent+1)
>>         if limit is not None and limit == index:
>>             break
>>
>> process(iter(re.findall(pattern, toc, re.DOTALL)))
>
> Hi Peter, Cameron,
>
> Thanks for your replies! The code above indeed works as intended, but: I
> don't really understand *why*. I would assign a name to the following line
> "if limit is not None and limit == index", what would be the most
> descriptive name? I often use "is_*" names for boolean variables. Would
> "is_deepest_nesting_level" be a good name?

> No, it's not necessarily the deepest level. Every subsection eventually ends
> at this point; so you might call it reached_end_of_current_section
>
> Or just 'limit' ;)

LOL. Ok, now I get it :-)

> The None is only there for the outermost level where no /Count is provided.
> In this case the loop is exhausted.
>
> If you find it is easier to understand you can calculate the outer count aka
> limit as the number of matches - sum of counts:
>
<snip useful info>

>> Also, I don't understand why iter() is required here, and why finditer()
>> is not an alternative.

> finditer() would actually work -- I didn't use it because I wanted to make
> as few changes as possible to your code. What does not work is a list like
> the result of findall(). This is because the inner for loops (i.e. the ones
> in the nested calls of process) are supposed to continue the iteration
> instead of restarting it. A simple example to illustrate the difference:

Ah, the triples cannot be unpacked inside the "for" line of the loop. This works:

def process(triples, limit=None, indent=0):
    for index, triple in enumerate(triples, 1):
        title, page, count = triple.groups()  # unpack it here
        title = indent * 4 * ' ' + title
        print(title.ljust(79, ".") + page.zfill(2))
        if count:
            process(triples, limit=int(count), indent=indent+1)
        if limit is not None and limit == index:
            break

process(re.finditer(pattern, toc, re.DOTALL))

If I don't do this, I get this error:

  File "Q:/toc/toc.py", line 64, in <module>
    process(re.finditer(pattern, toc, re.DOTALL))
  File "Q:/Ctoc/toc.py", line 56, in process
    for index, (title, page, count) in enumerate(triples, 1):
TypeError: '_sre.SRE_Match' object is not iterable

Process finished with exit code 1

Thanks again, Peter! Very insightful!

Albert-Jan
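
P.S. The "simple example" of the list-versus-iterator difference got snipped from the quote. Here is a minimal sketch of my own (not Peter's original example) of what I understand the point to be: a list restarts in every for loop, while an iterator continues where the previous loop stopped. The take_two() helper, the toc_entries() name and the sample toc text are made up for illustration. The pattern is the one from the thread, written as a raw string, and I yield the entries instead of printing them directly so the result is easy to inspect.

```python
import re


def take_two(seq):
    """Collect items from seq, stopping after the second one."""
    taken = []
    for index, item in enumerate(seq, 1):
        taken.append(item)
        if index == 2:
            break
    return taken


letters = ["a", "b", "c", "d"]

# A list restarts from the beginning in every for loop.
list_first = take_two(letters)
list_second = take_two(letters)

# An iterator remembers its position, so the second loop continues.
shared = iter(letters)
iter_first = take_two(shared)
iter_second = take_two(shared)

print(list_first, list_second)  # ['a', 'b'] ['a', 'b']
print(iter_first, iter_second)  # ['a', 'b'] ['c', 'd']

# Hypothetical bookmark text, made up for this sketch (not the real PDF data).
toc = (
    "/Title (Introduction) /Page 1 /Count 2\n"
    "/Title (Background) /Page 2\n"
    "/Title (Scope) /Page 3\n"
    "/Title (Methods) /Page 4\n"
)

# Same pattern as in the thread, written as a raw string.
pattern = r'/Title \((.+?)\).+?/Page ([0-9]+)(?:\s+/Count ([0-9]+))?'


def toc_entries(matches, limit=None, indent=0):
    """Yield (indent, title, page); nested calls share the same iterator."""
    for index, match in enumerate(matches, 1):
        # groups() gives None for count when /Count is absent.
        title, page, count = match.groups()
        yield indent, title, page
        if count:
            # The nested call continues on the very same iterator.
            yield from toc_entries(matches, limit=int(count), indent=indent + 1)
        if limit is not None and limit == index:
            break


entries = list(toc_entries(re.finditer(pattern, toc, re.DOTALL)))
for indent, title, page in entries:
    print((indent * 4 * ' ' + title).ljust(79, ".") + page.zfill(2))
```

With the list, the second loop sees "a" and "b" again. That is why passing the plain findall() result to process() cannot work: every nested call would start over at the first title.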