So I am trying to build a binary tree hierarchy given numerical elements serving for its leaves (last level of the tree to build). From the leaves I want to randomly create a name for the higher level of the hierarchy and assign it to the children elements. For example: if the elements inputted are `0,1,2,3` then I would like to create firstly 4 elements (say by random giving them a label composed of a letter and a number) then for the second level (iteration) I assign each of 0,1 to a random name label (e.g. `b1`) and `2,3` to another label (`b2`) then for the third level I assign a parent label to each of `b1` and `b2` as `c1`.
An illustation of the example is the following tree: [image: tree_exp.PNG] For this I use numpy's `array_split()` to get the chunks of arrays based on the iteration needs. for example to get the first iteration arrays I use `np.array_split(input, (input.size // k))` where `k` is an even number. In order to assign a parent node to the children the array range should enclose the children's. For example to assign the parent node with label `a1` to children `b1` and `b2` with range respectively [0,1] and [2,3], the parent should have the range [0,3]. All is fine until a certain iteration (k=4) returns parent with range [0,8] which is overlapping to children ranges and therefore cannot be their parent. My question is how to evenly partition such arrays in a binary way and create such binary tree so that to obtain for k=4 the first range to be [0,7] instead of [0,8]? My code is the following: #!/usr/bin/python # -*- coding: utf-8 -*- import string import random import numpy as np def generate_numbers_list_until_number(stop_number): if str(stop_number).isnumeric(): return np.arange(stop_number) else: raise TypeError('Input should be a number!') def generate_node_label(): return random.choice(string.ascii_lowercase) \ + str(random.randint(0, 10)) def main(): data = generate_numbers_list_until_number(100) k = 1 hierarchies = [] cells_arrays = np.array_split(data, data.size // k) print cells_arrays used_node_hierarchy_name = [] node_hierarchy_name = [generate_node_label() for _ in range(0, len(cells_arrays))] used_node_hierarchy_name.extend(node_hierarchy_name) while len(node_hierarchy_name) > 1: k = k * 2 # bug here in the following line cells_arrays = list(map(lambda x: [x[0], x[-1]], np.array_split(data, data.size // k))) print cells_arrays node_hierarchy_name = [] # node hierarchy names should not be redundant in another level for _ in range(0, len(cells_arrays)): node_name = generate_node_label() while node_name in used_node_hierarchy_name: node_name = generate_node_label() node_hierarchy_name.append(node_name) used_node_hierarchy_name.extend(node_hierarchy_name) print used_node_hierarchy_name hierarchies.append(list(zip(node_hierarchy_name, cells_arrays))) -- https://mail.python.org/mailman/listinfo/python-list