capistrant commented on code in PR #19374:
URL: https://github.com/apache/druid/pull/19374#discussion_r3163355943
##########
server/src/test/java/org/apache/druid/server/coordinator/rules/WildcardProjectionPartialLoadMatcherTest.java:
##########
Review Comment:
claude observation - told claude to look at the surface area covered by the
different tests since it is hard to keep straight just scrolling and it offered
a few ideas for further coverage. just food for thought if you want to expand,
I don't see it as blocking
- Mid-string and suffix wildcards. Every match test uses * or ? at the end
(user_*, p?, session_d*). A pattern like *_daily or user_*_v2 would catch a
class of bugs that the current tests can't
(e.g. if someone someday "optimizes" the prefix case). Cheap to add, real
signal.
- No-match-returns-null when the pattern simply doesn't hit anything.
testExcludeAllMatchedReturnsNull covers the all-excluded path, and
testReturnsNullForProjectionAgnosticSegment covers the
no-projections path. There's no test for "segment has projections, none
match patterns" — same observable behavior, but a different code path through
resolveProjectionNames.
- Multiple exclude patterns. Every exclude test uses a single-element
excludePatterns list. List.of("user_*", "session_temp_*") against a segment
with names hitting both would pin down that the
exclude list is OR'd, not AND'd.
- Case sensitivity. Implicit in the implementation but never asserted. A
one-liner — User_* vs user_daily — locks the contract so a future
Pattern.CASE_INSENSITIVE change would break a test instead
of silently shifting behavior.
##########
server/src/main/java/org/apache/druid/server/coordinator/rules/ProjectionPartialLoadMatcher.java:
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.rules;
+
+import com.google.common.hash.Hasher;
+import com.google.common.hash.Hashing;
+import com.google.common.io.BaseEncoding;
+import org.apache.druid.timeline.DataSegment;
+
+import javax.annotation.Nullable;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Locale;
+import java.util.Map;
+
+/**
+ * Base for {@link PartialLoadMatcher} implementations that decide which of a
segment's V10 projections to load.
+ * Subclasses supply the resolution policy via {@link
#resolveProjectionNames(DataSegment)}; this base handles
+ * fingerprint computation and wraps the result into the {@code
partialProjection} load-spec wire form consumed
+ * by the historical-side {@code PartialProjectionLoadSpec} (which does not
exist yet, supplied in future work).
+ * <p>
+ * The fingerprint is a hash of what projections are partially loaded on a
segment by this rule; the data node will
+ * include this value in the segment announcement so that it can be used as a
lightweight value to compare against
+ * to handle things like rule change so that we can ensure that the 'right'
partial load is in place from run to run.
+ */
+public abstract class ProjectionPartialLoadMatcher implements
PartialLoadMatcher
+{
+ static final String WIRE_TYPE = "partialProjection";
+ static final String FINGERPRINT_VERSION = "v1";
+
+ /**
+ * Returns the sorted, deduped list of projection names from {@link
DataSegment#getProjections()} that this matcher
+ * selects. Returns an empty list when nothing matches (the segment exposes
no projections, or no configured pattern
+ * intersects what the segment has).
+ */
+ protected abstract List<String> resolveProjectionNames(DataSegment segment);
+
+ @Override
+ @Nullable
+ public MatchResult match(DataSegment segment, Map<String, Object>
baseLoadSpec)
+ {
+ final List<String> resolved = resolveProjectionNames(segment);
+ if (resolved.isEmpty()) {
+ return null;
+ }
+ final String fingerprint = computeFingerprint(resolved);
+ final Map<String, Object> wrapped = Map.of(
+ "type", WIRE_TYPE,
+ "delegate", baseLoadSpec,
+ "projections", resolved,
+ "ruleFingerprint", fingerprint
+ );
+ return new MatchResult(wrapped, fingerprint);
+ }
+
+ static String computeFingerprint(List<String> sortedDedupedNames)
+ {
+ final Hasher hasher = Hashing.sha256().newHasher();
+ for (String name : sortedDedupedNames) {
+ hasher.putString(name, StandardCharsets.UTF_8);
Review Comment:
bump
##########
server/src/main/java/org/apache/druid/server/coordinator/rules/WildcardProjectionPartialLoadMatcher.java:
##########
@@ -0,0 +1,218 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.rules;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonInclude;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.error.InvalidInput;
+import org.apache.druid.timeline.DataSegment;
+
+import javax.annotation.Nullable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Objects;
+import java.util.TreeSet;
+import java.util.regex.Pattern;
+
+/**
+ * Selects projections whose names match any of the configured glob patterns,
minus any names matching an entry in
+ * {@code excludePatterns}. Supported glob metacharacters:
Review Comment:
any consideration to reserving `[` and `]` in case of desire for supporting
character classes in the future?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]