Hi Dmitri,

Thanks for bringing that up!

I am not an expert with CDI, based my recent experience with CDI, while CDI
does help making development
simpler under many cases, it does also make things more complicated under
some scenarios. Especially when
injecting a request scoped object into application scoped class, it does
make things harder to reason about when
debugging, and it also seems we currently lack efficient ways of testing to
catch potential regressions. For those
cases, i think explicitly passing the objects as parameters is actually
more clear and robust.

Furthermore, I think this could make things easier for new developers for
OSS polaris.

Also, regarding the metastore object, i don't think i fully understand the
proposal, are we proposing to make the
MetastoreManager now request scoped? Currently, the metastore manager is
created per realm, not per
RealmContext, which means two requests with the same realmId can use the
same metastore manager. If we are turning
this class into a request scope, I am not sure if that would be a good
idea, especially when we are dealing with Entity Caching
optimizations, where caching per realm instead of per request probably
makes more sense.

Best Regards,
Yun



On Fri, Jun 6, 2025 at 2:47 PM Yufei Gu <flyrain...@gmail.com> wrote:

> Agreed with Alex. I think passing a realm context is a cleaner solution
> than injecting a request scope realm context. CDI provides a lot of
> benefits, but it doesn't mean that we should alway use it.
>
> In general, injecting a request-scoped bean into an application-scoped bean
> is perfectly valid in CDI, but it does come with a few gotchas:
>
>    1. Lifecycle surprises. An @ApplicationScoped method can run in contexts
>    where no request scope is active (e.g., async tasks, scheduled jobs). In
>    those cases the injected proxy can’t resolve a real instance, leading to
>    ContextNotActiveException or, worse, stale data.
>    2. Debugging pain, When failures depend on the presence or absence of a
>    request context, stack traces alone rarely tell the full story. Tracking
>    down intermittent bugs becomes tricky.
>    3. Unnecessary complexity, which is a contradiction of using injection.
>    Injection is supposed to make a simpler and cleaner code base. If you
>    really only need a bit of request data, passing it as a method parameter
>    keeps the code simpler, avoids proxy lookups, and makes unit tests
> cleaner.
>
> As a rule of thumb, we may inject a narrower scope only when we must.
>
> Yufei
>
>
> On Fri, Jun 6, 2025 at 11:15 AM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > Thanks, Alex! Very good points!
> >
> > Re: metrics, I did a quick POC by adding `@Inject RealmContext rc`
> > to QuarkusValueExpressionResolver and it appears to work. That
> > is, QuarkusValueExpressionResolver can get the realm ID from the injected
> > RealmContext as opposed to the API parameter.
> >
> > With that, we could probably change the interpretation of the
> > "realmIdentifier" expression to be the injected RealmContext, remove
> > explicit realm parameters from the generated API classes and move the
> > `@MeterTag(key="realm_id",expression="realmIdentifier")` annotation to
> the
> > method level.
> >
> > WDYT?
> >
> > Thanks,
> > Dmitri.
> >
> > On Fri, Jun 6, 2025 at 12:41 PM Alex Dutra <alex.du...@dremio.com.invalid
> >
> > wrote:
> >
> > > Hi Dmitri,
> > >
> > > Thanks for bringing up this important topic.
> > >
> > > I generally agree with your refactoring proposal, but with 2 caveats:
> > >
> > > 1) For an application-scoped bean exposing methods to retrieve
> > realm-scoped
> > > components (e.g., the "factory" beans that expose methods like
> > > "getOrCreateXYZ") I think it's actually OK to receive the RealmContext
> > as a
> > > method parameter, as opposed to having the RealmContext injected. This
> is
> > > because the injection, while possible, would only work if the request
> > > context is active at the moment where a method on that bean is invoked.
> > > This is hard to enforce at compile time and also hard to reason about
> > imho.
> > > IOW I prefer to see this:
> > >
> > > @ApplicationScoped public class XYZFactory {
> > >     public XYZ getOrCreateXYZ(RealmContext ctx) { return
> > > doSomeStuff(ctx.getRealmIdentifier()); }
> > > }
> > >
> > > Rather than this:
> > >
> > > @ApplicationScoped public class XYZFactory {
> > >     @Inject RealmContext ctx;
> > >     public XYZ getOrCreateXYZ() { return
> > > doSomeStuff(ctx.getRealmIdentifier()); }
> > > }
> > >
> > > 2) We do need a RealmContext parameter in the API classes generated
> from
> > > Mustache templates [1], because the RealmContext parameter is annotated
> > as
> > > follows:
> > >
> > > @Context @MeterTag(key="realm_id",expression="realmIdentifier")
> > > RealmContext realmContext
> > >
> > > This is so that we can produce "realm_id" tags for API metrics. Thus we
> > > cannot remove it (unless there is another way to produce such tags, but
> > I'm
> > > not aware of any).
> > >
> > > But we could remove it from the Mustache templates for *service*
> classes
> > > [2]. This would effectively remove RealmContext as a method parameter
> > from
> > > virtually all components down the call stack, thus forcing them to
> inject
> > > RealmContext wherever necessary.
> > >
> > > Thanks,
> > >
> > > Alex
> > >
> > > [1]:
> > >
> > >
> >
> https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/api.mustache#L96
> > > [2]:
> > >
> > >
> >
> https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/apiService.mustache#L39
> > >
> > > On Fri, Jun 6, 2025 at 3:49 AM Dmitri Bourlatchkov <di...@apache.org>
> > > wrote:
> > >
> > > > An example "skew" in current code: IcebergCatalogAdapter gets
> > > RealmContext
> > > > injected as a field, but also as a method parameters in
> > > createNamespace().
> > > >
> > > > On Thu, Jun 5, 2025 at 8:35 PM Dmitri Bourlatchkov <di...@apache.org
> >
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > Recently a few runtime issues related to the resolution of
> > RealmContex
> > > > got
> > > > > fixed (mostly by Yun's PRs).
> > > > >
> > > > > These fixes center around the idea of passing RealmContext as a
> > > parameter
> > > > > across various method calls.
> > > > >
> > > > > That approach is effective, however, I tend to think it goes
> against
> > > the
> > > > > grain of the dependency injection paradigm. The main point is that
> > > realm
> > > > is
> > > > > a contextual concept in Polaris REST APIs (it is not a parameter in
> > any
> > > > API
> > > > > operations). Subsequently, all the service and core code logic is
> > > > normally
> > > > > realm-neutral. That is to say that services operate without having
> to
> > > > know
> > > > > what realm they are servicing.
> > > > >
> > > > > Realm IDs become prominent only at the persistence level (to ensure
> > > > > isolation of realm data).
> > > > >
> > > > > I'd like to propose refactoring Polaris code to remove RealmContext
> > > from
> > > > > method parameters and keep it in instance fields only where it is
> > used.
> > > > >
> > > > > * MetaStore objects for REST API endpoints will be produced for a
> > > > > particular real with the request scope lifetime
> > > > > * Tasks will operate in a similar way, where PR [1817] provides a
> POC
> > > > > * The Quarkus runtime will use CDI for injection
> > > > > * The polaris-core module will remain free from CDI annotations,
> but
> > > some
> > > > > changes will probably be required to allow injection via
> > constructors.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > If there are no strong objections up front, I can work on a bigger
> > POC
> > > PR
> > > > > to allow reviewing / exploring this idea in more depth.
> > > > >
> > > > > [1817] https://github.com/apache/polaris/pull/1817
> > > > >
> > > > > Cheers,
> > > > > Dmitri.
> > > > >
> > > >
> > >
> >
>

Reply via email to