Dear Bert,

Thank you very much for your suggestion.

1. Some Clarifications
Let me first add some clarification. Finding the n closest items/leaves does 
not seem like an unusual idea, so I assumed that someone had already 
implemented it.

2. Package TreeTools
I may have skimmed it as well (I peeked at a number of packages).

I have now taken another thorough look at it:

2.a) Function Subtree requires the node at which to extract the subtree.
I do not know this node; the function may still be useful, but only to extract 
the subtree once the node is known.
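
For illustration, the missing node can be located by walking down from the 
root. The following is a minimal base-R sketch on a stats dendrogram (not 
TreeTools, and not the subtree.nc code mentioned in this thread); the name 
subtree.min is hypothetical and labels are assumed to be unique. At each step 
it follows the branch that contains the requested leaf and stops as soon as 
that branch would hold fewer than n leaves.

```r
# Hypothetical helper (illustration only): smallest branch of a dendrogram
# that contains the leaf `label` and has at least `n` leaves.
subtree.min = function(dend, label, n) {
  stopifnot(label %in% labels(dend), attr(dend, "members") >= n)
  node = dend
  while (!is.leaf(node)) {
    # index of the child branch that contains the requested leaf
    hit = which(vapply(seq_along(node),
      function(i) label %in% labels(node[[i]]), logical(1)))
    child = node[[hit]]
    # stop if descending further would leave fewer than n leaves
    if (attr(child, "members") < n) break
    node = child
  }
  node
}

hc = hclust(dist(iris[, -5], method = "euclidean"), method = "ward.D")
hc$labels = paste0("L", seq_len(nrow(iris)))
sub = subtree.min(as.dendrogram(hc), "L1", 20)
attr(sub, "members")  # number of leaves in the extracted branch
```

The result is an ordinary dendrogram, so plot(sub) should work on it. Mapping 
the node found this way to the node index expected by a TreeTools function is 
not shown here.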

2.b) Other functions
Subsplit and the various versions of split don't seem to be useful for my 
problems. Also, Decompose & CollapseNode seem to do something else.

I did look carefully at some of the functions, as I need various other 
features as well, some of which are not yet implemented in my code. I 
focused the first message on a single feature.

In the meantime, I have fixed a bug in my own extraction code. But I am still 
looking for implementations in other packages.

However, searching for specific functionality has become a nightmare, as the 
variations in function names within and between packages are almost endless. 
And the words subtree and split are far too common in these packages.

Sincerely,

Leonard
________________________________
From: Bert Gunter <bgunter.4...@gmail.com>
Sent: Wednesday, August 20, 2025 12:52 AM
To: Leo Mada <leo.m...@syonic.eu>
Subject: Re: [R] Extracting a Sub-Tree from a Dendrogram (based on some 
criteria)

I gave your exact specification above prefixed with "In R", viz.,
"In R, I would like to extract a branch (sub-tree) from an existing tree 
(dendrogram) which fulfils the following conditions:
- it includes a specified leaf;
- it has a minimum number of leaves, but more than a specified number n; "
to an internet search engine, and got back a number of possibly useful hits, 
mostly, it seemed, from the TreeTools package, but not all.
If you haven't already tried this, you might wish to do so. If you have, 
apologies for telling you something you already know.

Cheers,
Bert


On Tue, Aug 19, 2025 at 2:07 PM Leo Mada via R-help 
<r-help@r-project.org> wrote:
Dear R-Users,

I would like to extract a branch (sub-tree) from an existing tree (dendrogram) 
which fulfils the following conditions:
- it includes a specified leaf;
- it has a minimum number of leaves, but more than a specified number n;

In other words, I want to extract the n most similar leaves to a given leaf.

Does anyone know some package that has this functionality?

I have some working code, but it is a quick hack and not very robust. Before 
investing more time in it, maybe there is already such functionality.

I looked through the Cluster TaskView and also explored the dendextend and ape 
packages (and a few more); but I did not spot such functionality.
https://cran.r-project.org/web/views/Cluster.html

My current code is on GitHub (see link below).

An example would look like this:

data(iris)

irisClust = iris[,-5]
d = dist(irisClust, method = "euclidean")
x = hclust(d, method="ward.D")
x$labels = paste0("L", 1:nrow(irisClust))

# 1  = Must contain leaf 1;
# 20 = Must cover at least 20 leaves;
tmp = subtree.nc(1, 20, x);
plot(tmp)

The function subtree.nc (and its dependencies count.nodes, subtree.nn and 
order.tree) is in the file on GitHub linked below; the code is a little too 
long for this post. All functions in the file are independent of other 
files/modules.

# GitHub:
https://github.com/discoleo/PeptideClassifier/blob/main/R/Helper.Tree.R
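
Independently of the file above, the same criterion can be sketched directly 
on the hclust object via its merge matrix. This is only an illustration, 
assuming that the indices of the observations are wanted rather than a 
plottable tree; the name cluster.members is hypothetical and is not one of the 
functions in the GitHub file.

```r
# Hypothetical helper (illustration only): observation indices of the
# smallest cluster in an hclust tree that contains `leaf` (an integer
# index) and has at least `n` members.
cluster.members = function(hc, leaf, n) {
  m = hc$merge
  # members[[i]] holds the observations joined at merge step i
  members = vector("list", nrow(m))
  for (i in seq_len(nrow(m))) {
    ids = integer(0)
    for (k in m[i, ]) {
      # negative entries are single observations,
      # positive entries refer to an earlier merge step
      ids = c(ids, if (k < 0) -k else members[[k]])
    }
    members[[i]] = ids
    # clusters containing `leaf` are nested and appear in increasing size,
    # so the first one that is large enough is the smallest such cluster
    if (leaf %in% ids && length(ids) >= n) return(sort(ids))
  }
  stop("no cluster with at least n members contains this leaf")
}

x = hclust(dist(iris[, -5], method = "euclidean"), method = "ward.D")
idx = cluster.members(x, leaf = 1, n = 20)
length(idx)  # size of the cluster that was found
```

The returned indices can be used to subset the original data, e.g. 
iris[idx, ]. Note that the first merge containing the leaf always has at 
least 2 members, so n = 1 does not return the leaf on its own.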

There are a few pre-computed moderate-size trees also on GitHub (for more 
realistic exploration):
https://github.com/discoleo/PeptideClassifier/tree/main/inst/examples

Many thanks in advance for any useful pointers.

Sincerely,

Leonard


______________________________________________
R-help@r-project.org mailing list -- To 
UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
