Github user dsmiley commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/433#discussion_r212802993
--- Diff:
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
---
@@ -230,6 +188,95 @@ public void processAdd(AddUpdateCommand cmd) throws
IOException {
}
}
+
+ private String createCollectionsIfRequired(Instant docTimestamp, String
targetCollection, String printableId) {
+ // Even though it is possible that multiple requests hit this code in
the 1-2 sec that
+ // it takes to create a collection, it's an established anti-pattern
to feed data with a very large number
+ // of client connections. This in mind, we only guard against spamming
the overseer within a batch of
+ // updates. We are intentionally tolerating a low level of redundant
requests in favor of simpler code. Most
+ // super-sized installations with many update clients will likely be
multi-tenant and multiple tenants
+ // probably don't write to the same alias. As such, we have deferred
any solution to the "many clients causing
+ // collection creation simultaneously" problem until such time as
someone actually has that problem in a
+ // real world use case that isn't just an anti-pattern.
+ try {
+ CreationType creationType = requiresCreateCollection(docTimestamp,
timeRoutedAlias.getPreemptiveCreateWindow());
+ switch (creationType) {
+ case SYNCHRONOUS:
+ // This next line blocks until all collections required by the
current document have been created
+ return maintain(targetCollection, docTimestamp, printableId,
false);
+ case ASYNC_PREEMPTIVE:
+ // Note: creating an executor and throwing it away is slightly
expensive, but this is only likely to happen
+ // once per hour/day/week (depending on time slice size for the
TRA). If the executor were retained, it
+ // would need to be shut down in a close hook to avoid test
failures due to thread leaks which is slightly
+ // more complicated from a code maintenance and readability
stand point. An executor must used instead of a
+ // thread to ensure we pick up the proper MDC logging stuff from
ExecutorUtil. T
+ if (preemptiveCreationExecutor == null) {
+ DefaultSolrThreadFactory threadFactory = new
DefaultSolrThreadFactory("TRA-preemptive-creation");
+ preemptiveCreationExecutor =
newMDCAwareSingleThreadExecutor(threadFactory);
+ preemptiveCreationExecutor.execute(() -> {
+ maintain(targetCollection, docTimestamp, printableId, true);
+ preemptiveCreationExecutor.shutdown();
+ preemptiveCreationExecutor = null;
+ });
+ }
+ return targetCollection;
+ case NONE:
+ return targetCollection; // just for clarity...
+ default:
+ return targetCollection; // could use fall through, but fall
through is fiddly for later editors.
+ }
+ // do nothing if creationType == NONE
+ } catch (SolrException e) {
+ throw e;
+ } catch (Exception e) {
+ throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, e);
+ }
+ }
+
+ /**
+ * Determine if the a new collection will be required based on the
document timestamp. Passing null for
+ * preemptiveCreateInterval tells you if the document is beyond all
existing collections with a response of
+ * {@link CreationType#NONE} or {@link CreationType#SYNCHRONOUS}, and
passing a valid date math for
+ * preemptiveCreateMath additionally distinguishes the case where the
document is close enough to the end of
+ * the TRA to trigger preemptive creation but not beyond all existing
collections with a value of
+ * {@link CreationType#ASYNC_PREEMPTIVE}.
+ *
+ * @param routeTimestamp The timestamp from the document
+ * @param preemptiveCreateMath The date math indicating the {@link
TimeRoutedAlias#preemptiveCreateMath}
+ * @return a {@code CreationType} indicating if and how to create a
collection
+ */
+ private CreationType requiresCreateCollection(Instant routeTimestamp,
String preemptiveCreateMath) {
+ final Instant mostRecentCollTimestamp =
parsedCollectionsDesc.get(0).getKey();
+ final Instant nextCollTimestamp =
timeRoutedAlias.computeNextCollTimestamp(mostRecentCollTimestamp);
+ if (!routeTimestamp.isBefore(nextCollTimestamp)) {
+ // current document is destined for a collection that doesn't exist,
must create the destination
+ // to proceed with this add command
+ return SYNCHRONOUS;
+ }
+
+ if (isBlank(preemptiveCreateMath)) {
--- End diff --
nitpick: this method might flow better if you check for _not_ blank, then
see if should return ASYNC_PREEMPTIVE, and at the very end return NONE. This
would mean exactly 3 "return" statements of each type, which reads better IMO.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]