Re: [pool] Resilience against factory outages (POOL-407)

Gary Gregory Fri, 31 May 2024 11:53:24 -0700

Hi Phil,

Thank you for the note. I'll try to take a look soon.

The new code causes the build to fail, apparently because not all of it is
covered by unit tests.

Gary


On Fri, May 31, 2024, 2:29 PM Phil Steitz <[email protected]> wrote:

> I just committed a first attempt at providing the above, intended as a fix
> for POOL-407 and a lot of similar issues reported over the years.  The
> scenario in POOL-407 is common when resource providers (like databases) go
> down:
>
> 1. makeObject requests start to fail and threads line up waiting on the
> deque.
> 2. The provider comes back up, so makeObject calls would succeed again, but
> the clients, the pool and the factory are all ignorant of this fact, so no
> clients get served.
>
> What I just committed puts the resilience responsibility on the factory,
> having it monitor itself.  That responsibility could arguably be put
> instead on the pool.
>
> To use the feature as is, you need to create a ResilientPooledObjectFactory
> wrapping a PooledObjectFactory, configure it, attach it to its pool and
> start its monitor.  The formerly disabled GenericObjectPool (GOP) test,
> testLivenessOnTransientFactoryFailure, shows how to do it.  The setup is a
> little awkward.  I would appreciate feedback on the following options for
> how to improve it (or any other comments on the code):
>
> 0) Roll it back and come up with something better
> 1) Leave as is
> 2) Add a GOP config that results in its factory being wrapped automatically
> in an RPOF (ResilientPooledObjectFactory).
> 3) Move the functionality into the pool
>
> The other thing that needs to be designed is how to make the proactive make
> attempt strategy configurable.  It is hard-coded now in the RPOF runChecks
> and the Adder inner class.  The initial implementation is primitive:
> Monitor the makeObject log.  Any failure triggers start of an Adder that
> tries addObject with configurable delay and (hard-coded) max failures.
> Once the circular log becomes filled with successes, turn the adder off.
>
> Also, RPOF spawns a monitoring thread and, when it detects a transient
> failure, an adder thread.  Careful review - and improvement - of the
> management of these threads would be appreciated.  I tried to make sure,
> and added tests to confirm, that closing the pool kills these threads.
>
> Phil
>
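
To make the "monitor the makeObject log" strategy described above concrete,
here is a minimal, self-contained sketch. It is not the RPOF code: the class
name MakeLog and its methods are hypothetical, and it only models the
decision logic (any failure turns the adder on; a log filled with successes
turns it off), not the adder thread itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a circular make log; not the RPOF implementation. */
final class MakeLog {
    private final int capacity;
    private final Deque<Boolean> outcomes = new ArrayDeque<>();
    private boolean adderRunning;

    MakeLog(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records one make outcome, evicting the oldest entry when the log is full. */
    synchronized void record(boolean success) {
        if (outcomes.size() == capacity) {
            outcomes.removeFirst();
        }
        outcomes.addLast(success);
        if (!success) {
            adderRunning = true;   // any failure triggers the adder
        } else if (isFullOfSuccesses()) {
            adderRunning = false;  // log is filled with successes: stop adding
        }
    }

    /** True while the (imaginary) adder should keep trying addObject. */
    synchronized boolean isAdderRunning() {
        return adderRunning;
    }

    private boolean isFullOfSuccesses() {
        return outcomes.size() == capacity && !outcomes.contains(Boolean.FALSE);
    }
}
```

With a log size of 3, one failure keeps the adder on until three consecutive
successes have pushed that failure out of the log.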

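On the thread-management point in the quoted note, one common way to make
sure a monitoring thread dies with its owner is to run it on a scheduler that
is shut down in close(). The sketch below assumes nothing about how RPOF does
it; MonitorSketch and its methods are hypothetical names, and the "check" is
just a counter standing in for the real work.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical monitor whose thread is stopped by close(); not RPOF code. */
final class MonitorSketch implements AutoCloseable {
    // Daemon thread, so a forgotten close() cannot keep the JVM alive.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "monitor-sketch");
                t.setDaemon(true);
                return t;
            });
    private final AtomicInteger checks = new AtomicInteger();

    /** Starts the periodic check; the counter stands in for real monitoring. */
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(
                checks::incrementAndGet, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    int checksRun() {
        return checks.get();
    }

    /** Cancels pending checks and interrupts a running one. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    /** Waits up to the timeout for the monitor thread to finish. */
    boolean awaitStopped(long timeoutMillis) {
        try {
            return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A pool's close() could call close() on such a monitor, and a test can then
assert that awaitStopped returns true within a short timeout.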