Re: Help needed testing new systemd script (SOLR-14410)

2023-05-24 Thread Shawn Heisey
On 5/23/2023 3:12 AM, Jan Høydahl wrote: We have an excellent contribution in https://issues.apache.org/jira/browse/SOLR-14410 and https://github.com/apache/solr/pull/428 to switch to systemd init script for Solr. Before we can merge we need help testing it on more Unix flavours. Do you have

Deleting document on wrong shard?

2023-05-24 Thread Walter Underwood
We have a messed-up index with documents on shards where they shouldn’t be. Content was indexed, shards were added, then everything was reindexed. So the new document with the same ID was put on a new shard, leaving the previous version on the old shard (where it doesn’t match the hash range).

Re: Deleting document on wrong shard?

2023-05-24 Thread Jan Høydahl
I thought deletes were "broadcast", but for the composite-id router they probably are not, since we know for sure where the document resides. You say "shards were added" - how did you do that? Sounds like you should simply re-create your collection and re-index? Jan > On 24 May 2023 at 16:39, Walter Underwo

Re: Deleting document on wrong shard?

2023-05-24 Thread Walter Underwood
Clearly, they are not broadcast, or if they are, they are filtered by the hash range before executing. If they were broadcast, this problem would not have happened. Yes, we’ll delete-all and reindex at some point. This collection has 1.7 billion documents across 96 shards, so a full reindex is

Re: Deleting document on wrong shard?

2023-05-24 Thread Gus Heck
Often it's a better idea to index into a fresh collection when making changes that imply a full re-index. If you use an alias, the swap out of the old collection is atomic when you update the alias, requiring no front end changes at all (and swap back is easy if things aren't what you expected). Of
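For readers who want to see the alias swap described above, here is a minimal sketch that only builds the Collections API request URL. CREATEALIAS with `name` and `collections` is the standard alias call; the host, alias name, and collection name are hypothetical placeholders, not values from this thread.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a Collections API CREATEALIAS URL pointing `alias` at `collection`."""
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Hypothetical names: re-point the alias "products" at a freshly indexed collection.
alias_url = create_alias_url("http://localhost:8983/solr", "products", "products_v2")
print(alias_url)
```

Issuing the same call with the old collection name swaps back, which is the easy rollback mentioned in the message.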

Re: Deleting document on wrong shard?

2023-05-24 Thread Walter Underwood
Responses about how to avoid this are not on topic. I’ve had Solr in production since version 1.3 and I know the right way. I think I know how we got into this mess. The cluster is configured and deployed into Kubernetes. I think it was rebuilt with more shards, then the existing storage volumes

Re: Deleting document on wrong shard?

2023-05-24 Thread Gus Heck
Understood, of course I've seen your name on the list for a long time. Partly my response is for the benefit of readers too, sorry if that bothered you. You of course may have good reasons, and carefully refined a design for your situation, that might not be best emulated everywhere. Living in Kube

Re: Deleting document on wrong shard?

2023-05-24 Thread Walter Underwood
Ooh, going directly to the leader node and using distrib=false, I like that idea. Now I need to figure out how to directly hit the danged Kubernetes pods. The config/deploy design here is pretty solid and aware of persistent storage volumes. It works fine for increasing replicas. We just need to
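One way to use the `distrib=false` idea is to check which core still holds a stale copy before deleting anything. The sketch below only builds a non-distributed select URL against a single core. The host and core name are hypothetical, and the `{!term f=id}` query parser is used here as an assumption so that IDs with special characters need no escaping. How to reach an individual pod in Kubernetes is not covered.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def local_lookup_url(node_url, core, doc_id):
    """Build a select URL that queries only the named core (no fan-out to other shards)."""
    params = {
        "q": "{!term f=id}" + doc_id,  # exact match on the id field
        "distrib": "false",            # do not distribute the query
        "fl": "id,_version_",
    }
    return f"{node_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Hypothetical pod address and core name.
lookup_url = local_lookup_url(
    "http://solr-0.example:8983/solr", "mycoll_shard1_replica_n1", "doc-1"
)
print(lookup_url)
```

Running the same lookup against each shard leader shows which shards hold a copy of the ID.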

Re: Deleting document on wrong shard?

2023-05-24 Thread Ishan Chattopadhyaya
Would specifying a _route_ parameter in the request work? https://issues.apache.org/jira/browse/SOLR-6910 I know your case is not implicit router based, but just wondering if it still works somehow? On Wed, 24 May 2023 at 23:28, Walter Underwood wrote: > Ooh, going directly to the leader node a

Re: Deleting document on wrong shard?

2023-05-24 Thread Walter Underwood
Nice catch. This issue looks exactly like what I’m seeing, it returns success but does not delete the document. SOLR-5890 Delete silently fails if not sent to shard where document was added wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 24, 202

Re: Deleting document on wrong shard?

2023-05-24 Thread Ishan Chattopadhyaya
Ah, now I remember this comment: https://issues.apache.org/jira/browse/SOLR-5890?focusedCommentId=14294129&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14294129 "Updated the patch, now with the Hash based router also honouring the _route_ param." On Thu, 25 M

Re: Deleting document on wrong shard?

2023-05-24 Thread Walter Underwood
It works! Thanks so much. I’m using XML update format because the JSON format for sending multiple IDs for deletion is not documented anywhere I could find. It was easier to just generate XML instead of continuing to search for documentation. This does the trick: datalake_FPD_163298_3RGR-V090-
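The exact request is cut off in the snippet above, so the following is only a sketch of the general shape: an XML delete body listing several IDs, sent to the update handler with a `_route_` parameter. The host, collection name, document IDs, and route value are all hypothetical placeholders; which route value targets the shard holding the stale copy is not shown in this thread and is left as an assumption.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_delete_xml(doc_ids):
    """Build a Solr XML delete-by-id body with one <id> element per document."""
    root = ET.Element("delete")
    for doc_id in doc_ids:
        ET.SubElement(root, "id").text = doc_id
    return ET.tostring(root, encoding="unicode")


def build_update_url(base_url, collection, route):
    """Build the update URL carrying the _route_ parameter."""
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode({'_route_': route})}"


# Placeholder values only; none of these come from the thread.
delete_body = build_delete_xml(["doc-1", "doc-2"])
update_url = build_update_url("http://localhost:8983/solr", "mycollection", "ROUTE_VALUE")
print(update_url)
print(delete_body)
```

The body would be POSTed with an XML content type such as `application/xml`, followed by a commit so the deletes become visible.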

Re: Deleting document on wrong shard?

2023-05-24 Thread Shawn Heisey
On 5/24/23 10:48, Walter Underwood wrote: I think I know how we got into this mess. The cluster is configured and deployed into Kubernetes. I think it was rebuilt with more shards, then the existing storage volumes were mounted for the matching shards. New shards got empty volumes. Then the con

Re: Deleting document on wrong shard?

2023-05-24 Thread Walter Underwood
Yes, I know it doesn’t work. It creates an index that violates some basic invariants, like having one ID map to one document. It does weird things, like return one document but list two documents in the facet counts with different values for the same single-valued field. I’m trying to patch it