Hi Peter,

I'll try to comment on the code below to check whether I understood it correctly or am missing some major parts. My comments sit just below each piece of code, so that you can read the code first and my understanding afterwards.
Peter Otten <__pete...@web.de> wrote:
[]
> $ cat parse_column_tree.py
> import csv
>
> def column_index(row):
>     for result, cell in enumerate(row, 0):
>         if cell:
>             return result
>     raise ValueError

Here you get the depth of the first node in this row.

> class Node:
>     def __init__(self, name, level):
>         self.name = name
>         self.level = level
>         self.children = []
>
>     def append(self, child):
>         self.children.append(child)
>
>     def __str__(self):
>         return "\%s{%s}" % (self.level, self.name)

Up to here everything is fine: this essentially defines the basic methods of the node object. A node is uniquely identified by its name and its level. One could add that two nodes with the same name cannot sit on the same level, but that is cosmetic. The important part is that 'name' could instead be 'attributes', holding a dictionary, which would allow storing more information on each node.

>     def show(self):
>         yield [self.name]

Here I'm lost in translation! Why use yield in the first place? What is this snippet used for?

>         for i, child in enumerate(self.children):
>             lastchild = i == len(self.children)-1
>             first = True
>             for c in child.show():
>                 if first:
>                     yield ["\---> " if lastchild else "+---> "] + c
>                     first = False
>                 else:
>                     yield ["      " if lastchild else "|     "] + c

Here I understand more: essentially 'yield' returns a string that is used further down in the show(root) function. Yet I doubt that I grasp the true meaning of the code. These 'show' functions seem to involve lots of iterations that I'm not quite able to trace. Here you loop over the children, and main() loops as well...

>     def show2(self):
>         yield str(self)
>         for child in self.children:
>             yield from child.show2()

OK, this requires some explanation as well. Kinda lost again. What I can naively deduce is that it is a generator that returns the string defined by the node's __str__, and that it does so for the whole tree.
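To check my reading of 'yield' and 'yield from', here is a minimal sketch on a toy tree. TinyNode and walk() are names I made up for illustration, they are not part of your script; walk() only mimics the shape of show2().

```python
class TinyNode:
    """Hypothetical stand-in for Node: just a name and a list of children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def walk(self):
        # Calling walk() runs nothing yet; it returns a generator object.
        # Each 'yield' hands one value to whoever iterates and pauses here.
        yield self.name
        for child in self.children:
            # Delegate to the child's generator and pass on all its values.
            yield from child.walk()


tree = TinyNode("a", [TinyNode("b", [TinyNode("c")]), TinyNode("d")])
names = list(tree.walk())
print(names)
```

If I read it right, this is why the module-level show2() still needs a for loop: the method only produces the lines one at a time, and something has to iterate over the generator to consume and print them.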
> def show(root):
>     for row in root.show():
>         print("".join(row))
>
> def show2(root):
>     for line in root.show2():
>         print(line)

Here we implement the functions that print a node, but I'm not sure I understand why I have to iterate here if main() iterates over the nodes again.

>
> def read_tree(rows, levelnames):
>     root = Node("#ROOT", "#ROOT")
>     old_level = 0
>     stack = [root]
>     for i, row in enumerate(rows, 1):

I'm not quite sure I understand what the stack is for. As of now it is a list whose only element is root.

>         new_level = column_index(row)
>         node = Node(row[new_level], levelnames[new_level])

Here you create the node from the current row, together with its level.

>         if new_level == old_level:
>             stack[-1].append(node)

I'm not sure I understand this. Why the end of the list and not the beginning?

>         elif new_level > old_level:
>             if new_level - old_level != 1:
>                 raise ValueError

Here you reject a node that sits more than one level below the previous one.

>             stack.append(stack[-1].children[-1])

Here I get a crash: IndexError: list index out of range!

>             stack[-1].append(node)
>             old_level = new_level
>         else:
>             while new_level < old_level:
>                 stack.pop(-1)
>                 old_level -= 1
>             stack[-1].append(node)

Why do I need to pop something from the stack? Are you saying that if the current row has a depth (new_level) smaller than the previous one (old_level), I decrement old_level by one (even though the jump may be bigger) and pop something from the stack?

>     return root

Once filled, the tree is returned. I thought the tree would be the stack, but instead it is root... nice surprise.

>
> def main():
[strip arg parsing]
>     with open(args.infile) as f:
>         rows = csv.reader(f)
>         levelnames = next(rows) # skip header
>         tree = read_tree(rows, levelnames)

This fills the tree with the data from the csv.

>
>     show_tree = show2 if args.latex else show
>     for node in tree.children:
>         show_tree(node)
>         print("")

It's nice to define show_tree as a function of the argument.
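To picture what the stack might be doing, I tried a minimal sketch with made-up data. build_paths() and the sample rows are hypothetical names of mine; this is not your read_tree(), and it keeps plain names on the stack instead of Node objects.

```python
def build_paths(items):
    """Return the full path of each name in a list of (level, name) pairs.

    Hypothetical helper, for illustration only. After trimming, the stack
    holds the ancestors of the current row, outermost first; the row is
    then pushed so that a deeper row can find it as its parent.
    """
    stack = []
    paths = []
    for level, name in items:
        if level > len(stack):
            raise ValueError("row is nested more than one level deeper")
        # Going back up (or staying level): drop everything at this depth
        # and below, however many levels that is.
        del stack[level:]
        stack.append(name)
        paths.append("/".join(stack))
    return paths


rows = [(0, "a"), (1, "b"), (2, "c"), (1, "d"), (0, "e")]
paths = build_paths(rows)
print(paths)
```

Seen this way, the end of the list is simply the innermost open parent, and popping is how one climbs back up when a row is shallower than the previous one. In this sketch a first row that starts deeper than level 0 raises ValueError; in read_tree() the same input would reach stack[-1].children[-1] on an empty list, which might be where my IndexError comes from.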
The for loop is now more than clear: it traverses each top-level node of the tree.

As I said earlier in the thread, there's a lot of food for thought here for a newbie, but it is better to go through this sort of exercise than through dumb tutorials which don't teach you much.

Al