Dear R-Users,

I would like to extract a branch (sub-tree) from an existing tree (dendrogram) which fulfils the following conditions:
- it includes a specified leaf;
- it has the minimum number of leaves, but at least a specified number n.

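To make the two conditions concrete, here is a minimal sketch in base R only. It is an illustration of the intended behaviour, not the code from GitHub: the name smallest_branch and its argument order are hypothetical, and it assumes the tree comes from hclust and that leaf labels are unique.

```r
# Hypothetical helper (NOT subtree.nc from GitHub): return the smallest
# branch of an hclust tree that contains `leaf` and has at least `n` leaves.
smallest_branch <- function(hc, leaf, n) {
  node <- as.dendrogram(hc)
  if (!leaf %in% labels(node)) stop("leaf not found in tree")
  if (attr(node, "members") < n) stop("tree has fewer than n leaves")
  while (!is.leaf(node)) {
    # find the child branch that contains the requested leaf
    child <- NULL
    for (i in seq_along(node)) {
      if (leaf %in% labels(node[[i]])) {
        child <- node[[i]]
        break
      }
    }
    # stop descending once the next branch would be too small
    if (attr(child, "members") < n) break
    node <- child
  }
  node
}

hc <- hclust(dist(iris[, -5]), method = "ward.D")
hc$labels <- paste0("L", seq_len(nrow(iris)))
br <- smallest_branch(hc, "L1", 20)
attr(br, "members")  # number of leaves in the extracted branch
plot(br)
```

The walk starts at the root and follows the child that holds the leaf until the next step down would leave fewer than n leaves. The returned branch can therefore hold more than n leaves, since branch sizes are fixed by the merges in the tree.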
In other words, I want to extract the (at least) n leaves most similar to a given leaf.

Does anyone know of a package that has this functionality? I have some working code, but it is a quick hack and not very robust. Before investing more time in it, I would like to check whether such functionality already exists.

I looked through the Cluster Task View and also explored the dendextend and ape packages (and a few more), but I did not spot such functionality.
https://cran.r-project.org/web/views/Cluster.html

My current code is on GitHub (see link below). An example would look like this:

data(iris)
irisClust = iris[,-5]
d = dist(irisClust, method = "euclidean")
x = hclust(d, method="ward.D")
x$labels = paste0("L", 1:nrow(irisClust))
# 1 = Must contain leaf 1;
# 20 = Must cover at least 20 leaves;
tmp = subtree.nc(1, 20, x);
plot(tmp)

The function subtree.nc (and its dependencies count.nodes, subtree.nn and order.tree) are in the specified file on GitHub; the code is a little too long for this post. All functions in the file are independent of other files/modules.

# GitHub:
https://github.com/discoleo/PeptideClassifier/blob/main/R/Helper.Tree.R

There are also a few pre-computed moderate-size trees on GitHub (for more realistic exploration):
https://github.com/discoleo/PeptideClassifier/tree/main/inst/examples

Many thanks in advance for any useful pointers.

Sincerely,

Leonard

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.