FrankChen021 commented on code in PR #19548:
URL: https://github.com/apache/druid/pull/19548#discussion_r3413392514


##########
server/src/main/java/org/apache/druid/segment/join/LookupJoinableFactory.java:
##########
@@ -57,9 +61,20 @@ public Optional<Joinable> build(final DataSource dataSource, 
final JoinCondition
     if (condition.canHashJoin()) {
       final String lookupName = lookupDataSource.getLookupName();
       return lookupProvider.get(lookupName)
-                           .map(c -> 
LookupJoinable.wrap(c.getLookupExtractorFactory().get()));
+                           .map(this::buildLookupJoinable);
     } else {
       return Optional.empty();
     }
   }
+
+  private LookupJoinable buildLookupJoinable(final 
LookupExtractorFactoryContainer container)
+  {
+    final LookupExtractorFactory lookupExtractorFactory = 
container.getLookupExtractorFactory();
+    final Optional<RetainedLookupExtractor> retainedLookupExtractor =
+        lookupExtractorFactory.acquireRetainedLookupExtractor();

Review Comment:
   Thanks, this addresses the joinable lifecycle issue I was worried about. The 
retained lookup is now owned by the query-scoped SegmentMapFunction, and the 
server, realtime, and MSQ callers I checked close that function after the 
mapped segments/segment references, so the join path no longer appears to rely 
on the Cleaner for normal completion.
   
   Reviewed 32 of 32 changed files.
   
   ---
   
   This is an automated review by Codex GPT-5.5



##########
processing/src/main/java/org/apache/druid/query/dimension/LookupDimensionSpec.java:
##########
@@ -136,18 +139,51 @@ public String getName()
   @Override
   public ExtractionFn getExtractionFn()
   {
-    final LookupExtractor lookupExtractor;
+    final LookupExtractor lookupExtractor = getLookupExtractor();
+    return makeLookupExtractionFn(lookupExtractor);
+  }
+
+  @Override
+  public ExtractionFn getExtractionFnForMetadata()
+  {
+    return makeLookupExtractionFn(getLookupExtractorForMetadata());
+  }
+
+  private LookupExtractor getLookupExtractor()
+  {
+    if (Strings.isNullOrEmpty(name)) {
+      return this.lookup;
+    }
+
+    final LookupExtractorFactory lookupExtractorFactory = 
getLookupExtractorFactory();
 
+    final Optional<RetainedLookupExtractor> retainedLookupExtractor =
+        lookupExtractorFactory.acquireRetainedLookupExtractor();
+
+    // ExtractionFn has no close hook. The RetainedLookupExtractor cleaner 
releases this reference when the
+    // LookupExtractionFn that owns it becomes unreachable.
+    return retainedLookupExtractor.<LookupExtractor>map(retained -> 
retained).orElseGet(lookupExtractorFactory);

Review Comment:
   [P1] Do not retain lookup dimension extraction functions without a close path
   
   getExtractionFn() now acquires a RetainedLookupExtractor and returns it 
inside a plain LookupExtractionFn, but that API has no close hook. Several 
callers do not cache or close the returned function, for example 
TopNQueryQueryToolChest calls 
topNQuery.getDimensionSpec().getExtractionFn().apply(...) while post-processing 
each result value, and GroupByQueryQueryToolChest.extractionsToRewrite calls it 
just to inspect getExtractionType(). For named lookup dimensions, each call 
increments the namespace cache reference and then depends on Cleaner/GC to 
release it, so with the new default maxRetiredCacheEntries=1 a query can make 
later namespace refreshes skip until GC or the 15 minute timeout. Please avoid 
retaining from LookupDimensionSpec.getExtractionFn() unless the returned handle 
has a deterministic query/result lifecycle, or update these call sites to use a 
scoped retained wrapper that is explicitly closed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to