[ https://issues.apache.org/jira/browse/SOLR-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844440#comment-17844440 ]
Calvin Smith commented on SOLR-17274: ------------------------------------- It would be nice if we didn't have to switch to a non-Json format. We're a Python backend that uses JSON everywhere and XML nowhere currently, and last time I checked, CBOR didn't support something that we use via JSON, so the path of least resistance for us would be to work around the JSON limitations somehow. In terms of possible changes though, could Solr take into account whether a `_root_` field is defined for the index/schema, and if not, then don't consider it a child doc? Or alternately, maybe a configuration option to enable or disable nested documents that don't provide a `uniqueKey` field, and it could behave differently in that case (e.g., old logic if nested docs without a `uniqueKey` are not enabled and new logic or something like it if so). For folks like us that don't ever use nested docs (with or without a `uniqueKey`), the old logic worked perfectly. > atomic update error when using json w/ multiple modifiers > --------------------------------------------------------- > > Key: SOLR-17274 > URL: https://issues.apache.org/jira/browse/SOLR-17274 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update > Affects Versions: 9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6 > Environment: Observed first in 9.5.0 w/ OpenJDK 11, then reproduced > this simple test case in 9.6.0. > Reporter: Calvin Smith > Priority: Major > > I ran into a problem doing a json atomic update that tries to both `add` and > `remove` a value for a multivalued field in a single update. I saw it > initially in an instance that runs 9.5.0, and reproduced a minimal example > using Solr 9.6.0. > {{The only fields defined in the schema are:}} > {code:java} > <field name="id" type="string" multiValued="false" indexed="true" > required="true" stored="true"/> > <field name="_version_" type="plong" indexed="false" stored="false"/> > <field name="name" type="strings" indexed="true" stored="true"/>{code} > {{`<uniqueKey>id</uniqueKey>` is also present, so I'm supplying docs with > just an `id` and a multivalued `name` field. The real setup is more complex, > but this is a minimal test case to illustrate the problem.}} > {{Starting with an empty index, I add the following doc to the index > successfully:}} > {code:java} > {"id": "1", "name": ["John Doe", "Jane Doe"]}{code} > {{And I can query it, seeing the expected result:}} > {code:java} > { > "responseHeader":{ > "status":0, > "QTime":23, > "params":{ > "q":"name:*" > } > }, > "response":{ > "numFound":1, > "start":0, > "numFoundExact":true, > "docs":[{ > "id":"1", > "name":["John Doe","Jane Doe"], > "_version_":1797873599884820480 > }] > } > }{code} > {{Next, I send an atomic update to modify the `name` field of that document > by removing `Jane Doe` and adding `Janet Doe`:}} > {code:java} > {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}}{code} > {{This atomic update that does both an `add` and a `remove` is something that > used to work for us under Solr 6.6, but we just noticed that it fails in our > production 9.5 instance and in 9.6, which I just downloaded to test against.}} > {{The error in the solr.log indicates that Solr mistakenly interprets the > `\{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then > throws an exception because our schema doesn't have the `{_}root{_}` field > that would be expected if we were using nested docs (which we don't use).}} > {{Here's the full stack trace from `solr.log` (version 9.6.0):}} > {noformat} > 2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: x:atris > t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > ~[?:?] > at > org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) > ~[?:?] > at > org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1056) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:421) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:375) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312) > ~[?:?] > at > org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76) > ~[?:?] > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:54) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:270) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:533) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0(DistributedUpdateProcessor.java:358) > ~[?:?] > at > org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:43) ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:355) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:236) > ~[?:?] > at > org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:111) > ~[?:?] > at > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:553) > ~[?:?] > at > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:183) > ~[?:?] > at > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:151) > ~[?:?] > at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:86) > ~[?:?] > at > org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:102) > ~[?:?] > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:100) > ~[?:?] > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) > ~[?:?] > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2886) ~[?:?] > at > org.apache.solr.servlet.HttpSolrCall.executeCoreRequest(HttpSolrCall.java:910) > ~[?:?] > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:596) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:262) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:219) > ~[?:?] > at > org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:249) > ~[?:?] > at > org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:215) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:213) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) > ~[?:?] > at > org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598) > ~[jetty-security-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1384) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484) > ~[jetty-servlet-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1306) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:228) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:141) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:301) > ~[jetty-rewrite-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.20.jar:10.0.20] > at org.eclipse.jetty.server.Server.handle(Server.java:563) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753) > ~[jetty-server-10.0.20.jar:10.0.20] > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287) > ~[jetty-server-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314) > ~[jetty-io-10.0.20.jar:10.0.20] > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) > ~[jetty-io-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53) > ~[jetty-io-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421) > ~[jetty-util-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390) > ~[jetty-util-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277) > ~[jetty-util-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:193) > ~[jetty-util-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969) > ~[jetty-util-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194) > ~[jetty-util-10.0.20.jar:10.0.20] > at > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149) > ~[jetty-util-10.0.20.jar:10.0.20] > at java.base/java.lang.Thread.run(Thread.java:829) [?:?]{noformat} > > In case it matters, the update handler was called with `commitWithin=1000`. > > I believe this is happening because as part of commit 5a9a34daedb3 on > 2021-06-12 ("SOLR-15044: JSON Loading: nested docs don't need ID"), the > `isChildDoc` method of > solr/core/src/java/org/apache/solr/handler/loader/JsonLoader.java was changed: > {code:java} > + /** Is this a child doc (true) or a partial update (false)? */ > private boolean isChildDoc(SolrInputDocument extendedFieldValue) { > - return > extendedFieldValue.containsKey(req.getSchema().getUniqueKeyField().getName()); > + if (extendedFieldValue.size() != 1) { > + return true; > + } > + // if the only key is a field in the schema, assume it's a child doc > + final String fieldName = > extendedFieldValue.iterator().next().getName(); > + return req.getSchema().getFieldOrNull(fieldName) != null; > + // otherwise, assume it's "set" or some other verb for a partial > update. > + // NOTE: it's fundamentally ambiguous with JSON; this is a best effort > try. > }{code} > In case that diff gets reformatted and garbled, it would previously have > checked if the `id` field was a key in the `extendedFieldValue`, which > would have been `false`, but it now returns `true` if the > `extendedFieldValue.size()` is not equal to `1`, which is the case here > because I have both an `add` and a `remove` in the extended field value. It > does work fine if I submit two separate updates, one for each of the two > operations. > For now, we're going to do the `add` and `remove` in separate updates and > make sure we don't have other cases of multiple update operations, but it > would be nice if users weren't forced to do that. > This was actually the second time we were affected by the `isChildDoc`change > I mentioned above. The first issue, when we upgraded from 6.6 to 9.5 last > year, we worked around by renaming an existing field from `set` to `sset`. > The problem was that an atomic update like `{"id": "1", "name": {"set": ["J > Doe"]}}` would fail with the same error message because since `set` was a > field name, `return req.getSchema().getFieldOrNull(fieldName) != null` would > evaluate to `true` for `isChildDoc`. After we renamed the field to `sset`, > then `req.getSchema().getFieldOrNull(fieldName)` would be null for `set` and > so > `isChildDoc` would evaluate to `false`, as desired. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org