erichkeane wrote:

> Ah sorry I forgot to reply to your review.
>
> > We might take a look at `SetVector` and see if there are any learnings we
> > can take from that. It has a set and a vector together, but might have
> > some interesting things.
>
> I did take a look at this actually, but it looked equivalent to what I was
> doing before, particularly with respect to storing the expensive keys
> (`DynTypedNode`) twice. It just uses `SmallVector<T, ...>` and `DenseSet<T>`
> underneath, whereas I just happened to use `SmallDenseSet<T, ...>` instead
> of `DenseSet<T>`.
>
> > Finally... I wonder if we were to just store this as a sorted-vector of
> > `DynTypedNode` + `insert order`, then do a copy & sort on access of
> > `view`.
>
> I don't think that works. You don't want to do a ton of work on every
> `view()` -- it's assumed to be cheap, and called often. Also, maintaining
> sort order gets expensive the more nodes you add; you don't want to do it on
> every insertion.
I see. Yes -- the problem is that I don't have a good idea here of the uses;
I did a first run here. If `view` is the 'frequent' operation, then my
suggestion isn't a good one at least :)

> Furthermore, not only are elements without memoization data not comparable,
> but they still need to appear in the `view` _in the same order as they were
> inserted_. This means either they need to be in the same vector as the
> comparable elements (which makes it impossible to sort the comparable
> portion), or we need to create a façade over two completely disjoint
> containers.
>
> The implementation in this PR addresses all of those constraints without
> any drawbacks, I think. There's less code compared to writing a façade, and
> performance-wise it feels just about optimal.
>
> > Can we do some benchmarking to figure out which things are done the 'most
> > often' with what values of `N`? It might better guide this.
>
> Re: benchmarking -- note that I did a general benchmark both with the test
> case that was provided in the associated issue and with an expensive TU
> that we had internally. It had great performance in both cases. Regarding
> N -- I assume N refers to the total container sizes (per the other
> comment)? The thing is, I don't get the impression there's much room for
> improvement here. The memory usage was just high by a constant (10x)
> factor. Some ~5x of that almost certainly came from the data type itself
> (storing `DynTypedNode` in the cache, which is 40 bytes, instead of
> `DynTypedNode*`, which is 8 bytes). Adding laziness on top of that has
> shrunk the difference so much that the performance plots are already very
> close to coinciding.

Ok, I think I'm ok with all that then. BUT as I said, I'm not the best person
to be reviewing this; I've pinged Aaron and hope he'll come along.
https://github.com/llvm/llvm-project/pull/129934
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits