Thanks for the response. I created a ticket: https://issues.apache.org/jira/browse/SOLR-17274
On Fri, May 3, 2024 at 1:41 AM Christine Poerschke (BLOOMBERG/ LONDON) < cpoersc...@bloomberg.net> wrote: > Hi Calvin, > > Thanks for this very comprehensive issue report! > > Yes, a JIRA ticket for this sounds like the next logical step here then. > > And I'd be curious if perhaps we can expand the existing test coverage to > model this scenario, maybe in > https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/update/processor/AtomicUpdatesTest.java > or > https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/update/processor/AbstractAtomicUpdatesMultivalueTestBase.java > or so. > > Best wishes, > Christine > > From: users@solr.apache.org At: 05/01/24 20:47:24 UTC+1:00To: > solr-u...@lucene.apache.org > Subject: json atomic update error when using multiple modifiers (e.g., > {"id": "1", "myfield": {"add": "x", "remove": "y"}}) > > Dear Solr Users, > > I ran into a problem doing a json atomic update that tries to both `add` > and `remove` > a value for a multivalued field in a single update. I saw it initially in > an instance that runs > 9.5.0, and reproduced a minimal example using Solr 9.6.0. > > The only fields defined in the schema are: > > <field name="id" type="string" multiValued="false" indexed="true" > required="true" stored="true"/> > <field name="_version_" type="plong" indexed="false" stored="false"/> > <field name="name" type="strings" indexed="true" stored="true"/> > > `<uniqueKey>id</uniqueKey>` is also present, so I'm supplying docs with > just an `id` and a multivalued `name` field. The real setup is more > complex, > but this is a minimal test case to illustrate the problem. > > Starting with an empty index, I add the following doc to the index > successfully: > > {"id": "1", "name": ["John Doe", "Jane Doe"]} > > And I can query it, seeing the expected result: > > { > "responseHeader":{ > "status":0, > "QTime":23, > "params":{ > "q":"name:*" > } > }, > "response":{ > "numFound":1, > "start":0, > "numFoundExact":true, > "docs":[{ > "id":"1", > "name":["John Doe","Jane Doe"], > "_version_":1797873599884820480 > }] > } > } > > Next, I send an atomic update to modify the `name` field of that document > by removing > `Jane Doe` and adding `Janet Doe`: > > {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}} > > This atomic update that does both an `add` and a `remove` is something that > used to > work for us under Solr 6.6, but we just noticed that it fails in our > production 9.5 > instance and in 9.6, which I just downloaded to test against. > > The error in the solr.log indicates that Solr mistakenly interprets the > `{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then > throws an exception because our schema doesn't have the `_root_` field > that > would be expected if we were using nested docs (which we don't use). > > Here's the full stack trace from `solr.log` (version 9.6.0): > > 2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: > x:atris t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => > org.apache.solr.common.SolrException: Unable to index docs with children: > the schema must include definitions for both a uniqueKey field and the > '_root_' field, using the exact same fieldType > at > > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuil > der.java:369) > org.apache.solr.common.SolrException: Unable to index docs with children: > the schema must include definitions for both a uniqueKey field and the > '_root_' field, using the exact same fieldType > at > > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuil > der.java:369) > ~[?:?] > at > org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) > ~[?:?] > at > > org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213 > ) > ~[?:?] > at > > org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHan > dler2.java:1056) > ~[?:?] > at > > org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2. > java:421) > ~[?:?] > at > > org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:37 > 5) > ~[?:?] > at > > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312 > ) > ~[?:?] > at > > org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.pr > ocessAdd(RunUpdateProcessorFactory.java:76) > ~[?:?] > at > > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequest > Processor.java:54) > ~[?:?] > at > > org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(Distribut > edUpdateProcessor.java:270) > ~[?:?] > at > > org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(Distrib > utedUpdateProcessor.java:533) > ~[?:?] > at > > org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0( > DistributedUpdateProcessor.java:358) > ~[?:?] > at > org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:43) > ~[?:?] > at > > org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Distribut > edUpdateProcessor.java:355) > ~[?:?] > at > > org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(Distribut > edUpdateProcessor.java:236) > ~[?:?] > at > > org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.pr > ocessAdd(LogUpdateProcessorFactory.java:111) > ~[?:?] > at > > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(Js > onLoader.java:553) > ~[?:?] > at > > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate > (JsonLoader.java:183) > ~[?:?] > at > > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoad > er.java:151) > ~[?:?] > at > org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:86) ~[?:?] > at > > org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:10 > 2) > ~[?:?] > at > > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStream > HandlerBase.java:100) > ~[?:?] > at > > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java > :226) > ~[?:?] > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2886) ~[?:?] > at > > org.apache.solr.servlet.HttpSolrCall.executeCoreRequest(HttpSolrCall.java:910) > ~[?:?] > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:596) > ~[?:?] > at > > org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:262) > ~[?:?] > at > > org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter. > java:219) > ~[?:?] > at > > org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.jav > a:249) > ~[?:?] > at > > org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:215) > ~[?:?] > at > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:213) > ~[?:?] > at > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) > ~[?:?] > at > org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635 > ) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598) > ~[jetty-security-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223 > ) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:158 > 0) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221 > ) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:138 > 4) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553 > ) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1306 > ) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerC > ollection.java:149) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java > :228) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java > :141) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:301) > ~[jetty-rewrite-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at org.eclipse.jetty.server.Server.handle(Server.java:563) > ~[jetty-server-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.ja > va:1598) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.io > .AbstractConnection$ReadCallback.succeeded(AbstractConnectio > n.java:314) > ~[jetty-io-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) > ~[jetty-io-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.io > .SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.j > ava:53) > ~[jetty-io-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(Adaptiv > eExecutionStrategy.java:421) > ~[jetty-util-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(Ada > ptiveExecutionStrategy.java:390) > ~[jetty-util-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(Adap > tiveExecutionStrategy.java:277) > ~[jetty-util-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(Adaptiv > eExecutionStrategy.java:193) > ~[jetty-util-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969) > ~[jetty-util-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool. > java:1194) > ~[jetty-util-10.0.20.jar:10.0.20] > at > > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java: > 1149) > ~[jetty-util-10.0.20.jar:10.0.20] > at java.base/java.lang.Thread.run(Thread.java:829) [?:?] > > In case it matters, the update handler was called with `commitWithin=1000`. > > I believe this is happening because as part of commit 5a9a34daedb3 on > 2021-06-12 > ("SOLR-15044: JSON Loading: nested docs don't need ID"), the `isChildDoc` > method > of solr/core/src/java/org/apache/solr/handler/loader/JsonLoader.java was > changed: > > + /** Is this a child doc (true) or a partial update (false)? */ > private boolean isChildDoc(SolrInputDocument extendedFieldValue) { > - return > > extendedFieldValue.containsKey(req.getSchema().getUniqueKeyField().getName()); > + if (extendedFieldValue.size() != 1) { > + return true; > + } > + // if the only key is a field in the schema, assume it's a child doc > + final String fieldName = > extendedFieldValue.iterator().next().getName(); > + return req.getSchema().getFieldOrNull(fieldName) != null; > + // otherwise, assume it's "set" or some other verb for a partial > update. > + // NOTE: it's fundamentally ambiguous with JSON; this is a best > effort try. > } > > In case that diff gets reformatted and garbled, it would previously have > checked > if the `id` field was a key in the `extendedFieldValue`, which would have > been `false`, but it now returns `true` if the `extendedFieldValue.size()` > is not > equal to `1`, which is the case here because I have both an `add` and a > `remove` > in the extended field value. It does work fine if I submit two separate > updates, one for each of the two operations. > > For now, we're going to do the `add` and `remove` in separate > updates and make sure we don't have other cases of multiple > update operations, but it would be nice if users weren't forced to do that. > > Let me know if I should create a Jira issue for this, or if any > other information would be helpful. > > This was actually the second time we were affected by the `isChildDoc` > change I mentioned above. The first issue, when we upgraded from 6.6 to 9.5 > last year, > we worked around by renaming an existing field from `set` to `sset`. The > problem > was that an atomic update like `{"id": "1", "name": {"set": ["J Doe"]}}` > would fail with the same error message because since `set` was a field > name, > `return req.getSchema().getFieldOrNull(fieldName) != null` would evaluate > to `true` > for `isChildDoc`. After we renamed the field to `sset`, then > `req.getSchema().getFieldOrNull(fieldName)` would be null for `set` and so > `isChildDoc` would evaluate to `false`, as desired. > > Thanks, > Calvin > > >