Thanks for the write up Mick. I think its is a great evaluation of 15452. A
few notes below:

* CI links for 15452 might be burried and I may need to link the most
recent run (I’ve been using CircleCI since it’s what I’m familiar with —
happy to have runs on ASf hardware as well).

* 15452 is configurable by YAML property as you mentioned. I’m traveling
but I’m 99% sure I made it a hot property as well (personally I believe you
shouldn’t have to restart a node to disable something like this, so if I
forgot to make it hot I will before merge). I’m traveling without my laptop
so I will double check next week (if someone has time to before then it’s
much appreciated)

 * I think “safe-ish” is a fair description of 15452. The new code is
fairly isolated but it’s in a very critical path with several call paths
leading to it. I think the testing we’ve done (both manual and automated)
as well as finding and fixing several bugs with those tests is what gives
me the confidence. And as you mentioned we have the fail safe of turning it
off otherwise.

Jordan

On Fri, Feb 14, 2025 at 12:42 Mick Semb Wever <m...@apache.org> wrote:

> Solid write up Jon!
>
> Hoping the committers and PMC members are keeping in mind this (very)
> recent thread:
> https://lists.apache.org/thread/h38g6q9d8h1q92h6qzs5tqdxpn2vmnyy
>
> This thread needs to also be about evaluating the risk these commits
> are to a patch version.
> I'm +1 and here's my thinking over it…
>
> Are the performance benefits clear ?
> Yes (thank you Jon for doing a solid job at demonstrating just how
> awesome this will be).
>
> Do we want this in 5.0 ?
> That's the dumb question :) of course.
>
> Does it fix a bug or a regression, or is an operational improvement ?
> Kinda, Branimir did mention a performance regression when the chunk
> cache was disabled.
>
> How are the patches with unforeseen risks ?
> These patches are medium sized and low-level, though isolated to (and
> creating a better isolation of) the compaction (scan) reader.  15452
> looks safe-ish.  I can't speak to 20092, it would be nice to hear
> Branimir and Sylvain chime in.  Are we confident that if any bugs do
> arise in user's 5.0 production deployments we will be quick to provide
> fixes and releases?  I was thinking it's worth putting in a system
> property that can disable the scan reader, so if anything happens
> folks can just restart the node with the system property to return to
> pre-patch behaviour, but that's already there! :)
>
> How well tested have they been ?
> We have one (or a few) reports of it running in production, that's
> super positive.  Reports of lots of manual testing, super positive
> too.  But there's no CI results available for either patch.  (Even the
> one committed to trunk doesn't have pre-commit CI results available.)
> I think CI results attached to the ticket are a must – I'm working on
> adding them.
>

Reply via email to