Tejesh,
If your sources are telling you that AsterixDB has memtables and SSTables,
they are probably generative AI - which likes to make things up based on
what it has read - in this case, probably too many Google and RocksDB
papers. We don't have those. Each of our LSM components is a proper
instance of the index type it is an LSM-ified variant of. For R-trees,
each component is a proper R-tree. When we merge components, we do
Hilbert-based sorting and create the upper non-leaf R-tree structure
during the merge. Saving space has nothing to do with it - it's all
about efficiently merging N smaller R-trees into one good bigger one -
and eliminating any deleted entries ("anti-matter") along the way so
that the resulting component doesn't retain entries that are no longer
current. What both sources are saying is wrong, in different ways.
Please have a look at the attached paper - give it a careful read - then
have a look at the code base - that should hopefully help!
Cheers,
Mike
On 3/26/26 11:19 AM, Tejesh Sakhamuri wrote:
Hi AsterixDB community,
I am Tejesh, and I am currently working on supporting Top-k nearest
neighbor queries for GSoC 2026. During my research on how R-tree memtables
are converted to SSTables, I encountered conflicting information and would
like to clarify which approach aligns with the AsterixDB architecture.
The two sources I found state the following:
Source 1: When R-trees are converted from RAM to SSTables on disk, only
the leaf nodes are extracted, and the internal index entries are discarded
to save disk space. These leaf nodes are then sorted by Hilbert values and
stored sequentially to minimize random I/O, effectively destroying the
original R-tree structure and flattening it out. Whenever the R-trees are
accessed, the disk reads them sequentially.
Source 2: When R-trees are converted, the index entries are preserved to
reduce the search space, despite a slight increase in storage and random
I/O. The R-tree structure remains intact, with index entries pointing to
child node page IDs. These nodes are then grouped and sorted by Hilbert
values to reduce MBR overlaps. Here, disk reading may not be sequential; it
could jump between pages on disk.
Could you please clarify which of these descriptions accurately reflects
how AsterixDB handles this conversion?
I have studied and understood secondary indexing, B-trees, R-trees,
LSM-ified R-trees, page handling, and developed a few high-level ideas to
solve the problem, mainly by using a global priority queue. Now I am
navigating the codebase to see how the logic and data flow of these
data structures are actually implemented, and familiarizing myself with
the operators so that I can implement my idea in Hyracks. I would
greatly appreciate a reply.
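A rough sketch of the global-priority-queue idea I have in mind (the class and method names are made up for illustration, not actual Hyracks operators): each LSM component yields candidates in increasing distance order, and a single heap merges those streams to produce the globally next-nearest result.

```java
import java.util.*;

// Illustrative sketch of top-k NN over multiple LSM components using one
// global priority queue. Assumes each component can produce candidates in
// increasing distance order (e.g., via incremental NN on its R-tree).
class GlobalKnnSketch {
    record Candidate(long key, double dist) {}

    static List<Candidate> topK(List<Iterator<Candidate>> perComponent, int k) {
        // Heap entry pairing a candidate with its source iterator, so we can
        // pull that component's next candidate after popping this one.
        record Head(Candidate c, Iterator<Candidate> src) {}
        PriorityQueue<Head> pq = new PriorityQueue<>(
            Comparator.comparingDouble((Head h) -> h.c().dist()));
        for (Iterator<Candidate> it : perComponent) {
            if (it.hasNext()) pq.add(new Head(it.next(), it));
        }
        List<Candidate> result = new ArrayList<>();
        while (!pq.isEmpty() && result.size() < k) {
            Head h = pq.poll();
            result.add(h.c());                 // globally next-nearest
            if (h.src().hasNext()) pq.add(new Head(h.src().next(), h.src()));
        }
        return result;
        // A full solution would also reconcile anti-matter across components
        // before emitting a candidate, which this sketch omits.
    }
}
```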
Best regards,
Tejesh