Hi Calvin, Thanks for this very comprehensive issue report!
Yes, a JIRA ticket for this sounds like the next logical step here then. And I'd be curious if perhaps we can expand the existing test coverage to model this scenario, maybe in https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/update/processor/AtomicUpdatesTest.java or https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/update/processor/AbstractAtomicUpdatesMultivalueTestBase.java or so. Best wishes, Christine From: users@solr.apache.org At: 05/01/24 20:47:24 UTC+1:00To: solr-u...@lucene.apache.org Subject: json atomic update error when using multiple modifiers (e.g., {"id": "1", "myfield": {"add": "x", "remove": "y"}}) Dear Solr Users, I ran into a problem doing a json atomic update that tries to both `add` and `remove` a value for a multivalued field in a single update. I saw it initially in an instance that runs 9.5.0, and reproduced a minimal example using Solr 9.6.0. The only fields defined in the schema are: <field name="id" type="string" multiValued="false" indexed="true" required="true" stored="true"/> <field name="_version_" type="plong" indexed="false" stored="false"/> <field name="name" type="strings" indexed="true" stored="true"/> `<uniqueKey>id</uniqueKey>` is also present, so I'm supplying docs with just an `id` and a multivalued `name` field. The real setup is more complex, but this is a minimal test case to illustrate the problem. Starting with an empty index, I add the following doc to the index successfully: {"id": "1", "name": ["John Doe", "Jane Doe"]} And I can query it, seeing the expected result: { "responseHeader":{ "status":0, "QTime":23, "params":{ "q":"name:*" } }, "response":{ "numFound":1, "start":0, "numFoundExact":true, "docs":[{ "id":"1", "name":["John Doe","Jane Doe"], "_version_":1797873599884820480 }] } } Next, I send an atomic update to modify the `name` field of that document by removing `Jane Doe` and adding `Janet Doe`: {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}} This atomic update that does both an `add` and a `remove` is something that used to work for us under Solr 6.6, but we just noticed that it fails in our production 9.5 instance and in 9.6, which I just downloaded to test against. The error in the solr.log indicates that Solr mistakenly interprets the `{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then throws an exception because our schema doesn't have the `_root_` field that would be expected if we were using nested docs (which we don't use). Here's the full stack trace from `solr.log` (version 9.6.0): 2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: x:atris t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => org.apache.solr.common.SolrException: Unable to index docs with children: the schema must include definitions for both a uniqueKey field and the '_root_' field, using the exact same fieldType at org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuil der.java:369) org.apache.solr.common.SolrException: Unable to index docs with children: the schema must include definitions for both a uniqueKey field and the '_root_' field, using the exact same fieldType at org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuil der.java:369) ~[?:?] at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) ~[?:?] at org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213 ) ~[?:?] at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHan dler2.java:1056) ~[?:?] at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2. java:421) ~[?:?] at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:37 5) ~[?:?] at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312 ) ~[?:?] at org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.pr ocessAdd(RunUpdateProcessorFactory.java:76) ~[?:?] at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequest Processor.java:54) ~[?:?] at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(Distribut edUpdateProcessor.java:270) ~[?:?] at org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(Distrib utedUpdateProcessor.java:533) ~[?:?] at org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0( DistributedUpdateProcessor.java:358) ~[?:?] at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:43) ~[?:?] at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Distribut edUpdateProcessor.java:355) ~[?:?] at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(Distribut edUpdateProcessor.java:236) ~[?:?] at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.pr ocessAdd(LogUpdateProcessorFactory.java:111) ~[?:?] at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(Js onLoader.java:553) ~[?:?] at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate (JsonLoader.java:183) ~[?:?] at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoad er.java:151) ~[?:?] at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:86) ~[?:?] at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:10 2) ~[?:?] at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStream HandlerBase.java:100) ~[?:?] at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java :226) ~[?:?] at org.apache.solr.core.SolrCore.execute(SolrCore.java:2886) ~[?:?] at org.apache.solr.servlet.HttpSolrCall.executeCoreRequest(HttpSolrCall.java:910) ~[?:?] at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:596) ~[?:?] at org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:262) ~[?:?] at org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter. java:219) ~[?:?] at org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.jav a:249) ~[?:?] at org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:215) ~[?:?] at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:213) ~[?:?] at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) ~[?:?] at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210) ~[jetty-servlet-10.0.20.jar:10.0.20] at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635 ) ~[jetty-servlet-10.0.20.jar:10.0.20] at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527) ~[jetty-servlet-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598) ~[jetty-security-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223 ) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:158 0) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221 ) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:138 4) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484) ~[jetty-servlet-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553 ) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1306 ) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerC ollection.java:149) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java :228) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java :141) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:301) ~[jetty-rewrite-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.Server.handle(Server.java:563) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.ja va:1598) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287) ~[jetty-server-10.0.20.jar:10.0.20] at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnectio n.java:314) ~[jetty-io-10.0.20.jar:10.0.20] at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) ~[jetty-io-10.0.20.jar:10.0.20] at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.j ava:53) ~[jetty-io-10.0.20.jar:10.0.20] at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(Adaptiv eExecutionStrategy.java:421) ~[jetty-util-10.0.20.jar:10.0.20] at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(Ada ptiveExecutionStrategy.java:390) ~[jetty-util-10.0.20.jar:10.0.20] at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(Adap tiveExecutionStrategy.java:277) ~[jetty-util-10.0.20.jar:10.0.20] at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(Adaptiv eExecutionStrategy.java:193) ~[jetty-util-10.0.20.jar:10.0.20] at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969) ~[jetty-util-10.0.20.jar:10.0.20] at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool. java:1194) ~[jetty-util-10.0.20.jar:10.0.20] at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java: 1149) ~[jetty-util-10.0.20.jar:10.0.20] at java.base/java.lang.Thread.run(Thread.java:829) [?:?] In case it matters, the update handler was called with `commitWithin=1000`. I believe this is happening because as part of commit 5a9a34daedb3 on 2021-06-12 ("SOLR-15044: JSON Loading: nested docs don't need ID"), the `isChildDoc` method of solr/core/src/java/org/apache/solr/handler/loader/JsonLoader.java was changed: + /** Is this a child doc (true) or a partial update (false)? */ private boolean isChildDoc(SolrInputDocument extendedFieldValue) { - return extendedFieldValue.containsKey(req.getSchema().getUniqueKeyField().getName()); + if (extendedFieldValue.size() != 1) { + return true; + } + // if the only key is a field in the schema, assume it's a child doc + final String fieldName = extendedFieldValue.iterator().next().getName(); + return req.getSchema().getFieldOrNull(fieldName) != null; + // otherwise, assume it's "set" or some other verb for a partial update. + // NOTE: it's fundamentally ambiguous with JSON; this is a best effort try. } In case that diff gets reformatted and garbled, it would previously have checked if the `id` field was a key in the `extendedFieldValue`, which would have been `false`, but it now returns `true` if the `extendedFieldValue.size()` is not equal to `1`, which is the case here because I have both an `add` and a `remove` in the extended field value. It does work fine if I submit two separate updates, one for each of the two operations. For now, we're going to do the `add` and `remove` in separate updates and make sure we don't have other cases of multiple update operations, but it would be nice if users weren't forced to do that. Let me know if I should create a Jira issue for this, or if any other information would be helpful. This was actually the second time we were affected by the `isChildDoc` change I mentioned above. The first issue, when we upgraded from 6.6 to 9.5 last year, we worked around by renaming an existing field from `set` to `sset`. The problem was that an atomic update like `{"id": "1", "name": {"set": ["J Doe"]}}` would fail with the same error message because since `set` was a field name, `return req.getSchema().getFieldOrNull(fieldName) != null` would evaluate to `true` for `isChildDoc`. After we renamed the field to `sset`, then `req.getSchema().getFieldOrNull(fieldName)` would be null for `set` and so `isChildDoc` would evaluate to `false`, as desired. Thanks, Calvin