capistrant commented on code in PR #19374:
URL: https://github.com/apache/druid/pull/19374#discussion_r3163355943


##########
server/src/test/java/org/apache/druid/server/coordinator/rules/WildcardProjectionPartialLoadMatcherTest.java:
##########


Review Comment:
   claude observation - told claude to look at the surface area covered by the 
different tests since it is hard to keep straight just scrolling and it offered 
a few ideas for further coverage. just food for thought if you want to expand, 
I don't see it as blocking
   
   
   - Mid-string and suffix wildcards. Every match test uses * or ? at the end 
(user_*, p?, session_d*). A pattern like *_daily or user_*_v2 would catch a 
class of bugs that the current tests can't    
     (e.g. if someone someday "optimizes" the prefix case). Cheap to add, real 
signal.
   - No-match-returns-null when the pattern simply doesn't hit anything. 
testExcludeAllMatchedReturnsNull covers the all-excluded path, and 
testReturnsNullForProjectionAgnosticSegment covers the      
     no-projections path. There's no test for "segment has projections, none 
match patterns" — same observable behavior, but a different code path through 
resolveProjectionNames.                        
   - Multiple exclude patterns. Every exclude test uses a single-element 
excludePatterns list. List.of("user_*", "session_temp_*") against a segment 
with names hitting both would pin down that the
     exclude list is OR'd, not AND'd.                                           
                                                                                
                                          
   - Case sensitivity. Implicit in the implementation but never asserted. A 
one-liner — User_* vs user_daily — locks the contract so a future 
Pattern.CASE_INSENSITIVE change would break a test instead
      of silently shifting behavior. 



##########
server/src/main/java/org/apache/druid/server/coordinator/rules/ProjectionPartialLoadMatcher.java:
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.rules;
+
+import com.google.common.hash.Hasher;
+import com.google.common.hash.Hashing;
+import com.google.common.io.BaseEncoding;
+import org.apache.druid.timeline.DataSegment;
+
+import javax.annotation.Nullable;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Locale;
+import java.util.Map;
+
+/**
+ * Base for {@link PartialLoadMatcher} implementations that decide which of a 
segment's V10 projections to load.
+ * Subclasses supply the resolution policy via {@link 
#resolveProjectionNames(DataSegment)}; this base handles
+ * fingerprint computation and wraps the result into the {@code 
partialProjection} load-spec wire form consumed
+ * by the historical-side {@code PartialProjectionLoadSpec} (which does not 
exist yet, supplied in future work).
+ * <p>
+ * The fingerprint is a hash of what projections are partially loaded on a 
segment by this rule; the data node will
+ * include this value in the segment announcement so that it can be used as a 
lightweight value to compare against
+ * to handle things like rule change so that we can ensure that the 'right' 
partial load is in place from run to run.
+ */
+public abstract class ProjectionPartialLoadMatcher implements 
PartialLoadMatcher
+{
+  static final String WIRE_TYPE = "partialProjection";
+  static final String FINGERPRINT_VERSION = "v1";
+
+  /**
+   * Returns the sorted, deduped list of projection names from {@link 
DataSegment#getProjections()} that this matcher
+   * selects. Returns an empty list when nothing matches (the segment exposes 
no projections, or no configured pattern
+   * intersects what the segment has).
+   */
+  protected abstract List<String> resolveProjectionNames(DataSegment segment);
+
+  @Override
+  @Nullable
+  public MatchResult match(DataSegment segment, Map<String, Object> 
baseLoadSpec)
+  {
+    final List<String> resolved = resolveProjectionNames(segment);
+    if (resolved.isEmpty()) {
+      return null;
+    }
+    final String fingerprint = computeFingerprint(resolved);
+    final Map<String, Object> wrapped = Map.of(
+        "type", WIRE_TYPE,
+        "delegate", baseLoadSpec,
+        "projections", resolved,
+        "ruleFingerprint", fingerprint
+    );
+    return new MatchResult(wrapped, fingerprint);
+  }
+
+  static String computeFingerprint(List<String> sortedDedupedNames)
+  {
+    final Hasher hasher = Hashing.sha256().newHasher();
+    for (String name : sortedDedupedNames) {
+      hasher.putString(name, StandardCharsets.UTF_8);

Review Comment:
   bump



##########
server/src/main/java/org/apache/druid/server/coordinator/rules/WildcardProjectionPartialLoadMatcher.java:
##########
@@ -0,0 +1,218 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.rules;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonInclude;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.error.InvalidInput;
+import org.apache.druid.timeline.DataSegment;
+
+import javax.annotation.Nullable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Objects;
+import java.util.TreeSet;
+import java.util.regex.Pattern;
+
+/**
+ * Selects projections whose names match any of the configured glob patterns, 
minus any names matching an entry in
+ * {@code excludePatterns}. Supported glob metacharacters:

Review Comment:
   any consideration to reserving `[` and `]`  in case of desire for supporting 
character classes in the future?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to