Nice catch. This issue looks exactly like what I’m seeing: it returns success but does not delete the document.

SOLR-5890 Delete silently fails if not sent to shard where document was added
https://issues.apache.org/jira/browse/SOLR-5890
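
For reference, a rough sketch (Python + requests) of the two workarounds discussed in the quoted thread below: passing _route_ on the delete request, and posting the delete straight to the leader core with distrib=false. Host names, collection/core names, the shard name, and the document id are hypothetical placeholders, and whether _route_ helps at all with the compositeId router is exactly the open question below.

# Rough sketch; collection, core, host, shard, and id values are hypothetical.
import requests

DOC_ID = "doc123"  # id of the stale duplicate

# Option 1: delete-by-id with an explicit _route_ request parameter.
requests.post(
    "http://solr-0.example.com:8983/solr/mycollection/update",
    params={"commit": "true", "_route_": "shard5"},
    json={"delete": {"id": DOC_ID}},
    timeout=30,
).raise_for_status()

# Option 2: bypass distributed routing by posting straight to the leader
# core that holds the stale copy, with distrib=false.
requests.post(
    "http://solr-pod-12.example.com:8983/solr/mycollection_shard3_replica_n5/update",
    params={"commit": "true", "distrib": "false"},
    json={"delete": {"id": DOC_ID}},
    timeout=30,
).raise_for_status()
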
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On May 24, 2023, at 12:21 PM, Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>
> Would specifying a _route_ parameter in the request work?
> https://issues.apache.org/jira/browse/SOLR-6910
> I know your case is not implicit router based, but just wondering if it still works somehow?
>
> On Wed, 24 May 2023 at 23:28, Walter Underwood <wun...@wunderwood.org> wrote:
>
>> Ooh, going directly to the leader node and using distrib=false, I like that idea. Now I need to figure out how to directly hit the danged Kubernetes pods.
>>
>> The config/deploy design here is pretty solid and aware of persistent storage volumes. It works fine for increasing replicas. We just need to avoid changing the number of shards without a reindex. One of the other clusters has 320 shards.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>> On May 24, 2023, at 10:12 AM, Gus Heck <gus.h...@gmail.com> wrote:
>>>
>>> Understood, of course I've seen your name on the list for a long time. Partly my response is for the benefit of readers too; sorry if that bothered you. You of course may have good reasons, and a carefully refined design for your situation, that might not be best emulated everywhere. Living in Kube is tricky, partly because (as I understand it) it was designed with stateless web stuff and microservices in mind, and it's really easy for folks administering it to trip on googled advice that has that mindset. Sounds like someone in ops was possibly thinking of pods as interchangeable, lightweight objects and not thinking about the persistent volumes needing to line up and match the design the same way every time.
>>>
>>> On topic: not sure, but one might need to set distrib=false or something like that to avoid the routing.
>>>
>>> On Wed, May 24, 2023 at 12:49 PM Walter Underwood <wun...@wunderwood.org> wrote:
>>>
>>>> Responses about how to avoid this are not on topic. I’ve had Solr in production since version 1.3 and I know the right way.
>>>>
>>>> I think I know how we got into this mess. The cluster is configured and deployed into Kubernetes. I think it was rebuilt with more shards, then the existing storage volumes were mounted for the matching shards. New shards got empty volumes. Then the content was reloaded without a delete-all.
>>>>
>>>> Would it work to send the deletes directly to the leader for the shard? That might bypass the hash-based routing.
>>>>
>>>> wunder
>>>> Walter Underwood
>>>> wun...@wunderwood.org
>>>> http://observer.wunderwood.org/ (my blog)
>>>>
>>>>> On May 24, 2023, at 8:35 AM, Walter Underwood <wun...@wunderwood.org> wrote:
>>>>>
>>>>> Clearly, they are not broadcast, or if they are, they are filtered by the hash range before executing. If they were broadcast, this problem would not have happened.
>>>>>
>>>>> Yes, we’ll delete-all and reindex at some point. This collection has 1.7 billion documents across 96 shards, so a full reindex is not an everyday occurrence. I’m trying to clean up the minor problem of 675k documents with dupes.
>>>>>
>>>>> wunder
>>>>> Walter Underwood
>>>>> wun...@wunderwood.org
>>>>> http://observer.wunderwood.org/ (my blog)
>>>>>
>>>>>> On May 24, 2023, at 8:06 AM, Jan Høydahl <jan....@cominvent.com> wrote:
>>>>>>
>>>>>> I thought deletes were "broadcast", but for the composite-id router they probably are not, since we know for sure where the document resides.
>>>>>> You say "shards were added" - how did you do that?
>>>>>> Sounds like you should simply re-create your collection and re-index?
>>>>>>
>>>>>> Jan
>>>>>>
>>>>>>> On May 24, 2023, at 16:39, Walter Underwood <wun...@wunderwood.org> wrote:
>>>>>>>
>>>>>>> We have a messed-up index with documents on shards where they shouldn’t be. Content was indexed, shards were added, then everything was reindexed. So the new document with the same ID was put on a new shard, leaving the previous version on the old shard (where it doesn’t match the hash range).
>>>>>>>
>>>>>>> I’m trying to delete the old document by sending an update with delete-by-id and a shards parameter. It returns success, but the document isn’t deleted.
>>>>>>>
>>>>>>> Is the hash range being checked and overriding the shards param somehow? Any ideas on how to make this work?
>>>>>>>
>>>>>>> And yes, we won’t do that again.
>>>>>>>
>>>>>>> wunder
>>>>>>> Walter Underwood
>>>>>>> wun...@wunderwood.org
>>>>>>> http://observer.wunderwood.org/ (my blog)
>>>>>>
>>>
>>> --
>>> http://www.needhamsoftware.com (work)
>>> http://www.the111shift.com (play)
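
To clean up duplicates like the 675k described in the quoted thread, a first step is finding which core still holds the stale copy of a given id. A minimal sketch, again with hypothetical pod host names and core names (the real ones can be read from the Collections API CLUSTERSTATUS response):

# Minimal sketch: query each shard's core directly with distrib=false and
# report how many copies of the id each one holds. All names are hypothetical.
import requests

DOC_ID = "doc123"
core_urls = [
    "http://solr-pod-0.example.com:8983/solr/mycollection_shard1_replica_n1",
    "http://solr-pod-1.example.com:8983/solr/mycollection_shard2_replica_n3",
    # one entry per shard leader
]

for url in core_urls:
    resp = requests.get(
        url + "/select",
        params={"q": "id:" + DOC_ID, "distrib": "false", "rows": "0"},
        timeout=30,
    )
    resp.raise_for_status()
    print(url, "numFound =", resp.json()["response"]["numFound"])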