Hello Dario.
The mailing list stripped the attachment, but the debugQuery output is what
we need to look at here.
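
In case it helps, here is a minimal sketch of how the same query could be
re-run with debugQuery=true so the debug section can be pasted inline
instead of attached. The host, port and collection name ("localhost:8983",
"tasks") are placeholders I made up, not values from your setup; only the
q and fl values come from your mail.

```python
from urllib.parse import urlencode


def build_debug_query_url(base_url, collection, query, fields):
    """Build a Solr /select URL that asks for debug output."""
    params = {
        "q": query,            # the query from the original mail
        "fl": fields,          # restrict returned fields to keep the response short
        "debugQuery": "true",  # ask Solr to add a "debug" section to the response
        "indent": "true",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection name; replace with the real ones.
url = build_debug_query_url(
    "http://localhost:8983/solr",
    "tasks",
    "task_coopProcessId:20021454",
    "task_coopProcessId",
)
print(url)
```

The "parsedquery" entry in the debug section of the response should show
how the query was actually interpreted.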

On Fri, Apr 19, 2024 at 1:41 PM <dario.v...@coop.ch> wrote:

> Hello All,
>
>
>
> We have a relatively new Solr Instance:
>
> solr-spec: 9.5.0
>
> solr-impl: 9.5.0 cdd27dd15c3a6574032e9b1b92b148ab4e383599 - gerlowskija -
> 2024-02-07 15:10:39
>
>
>
> lucene-spec: 9.9.2
>
> lucene-impl: 9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25
> 09:51:09
>
>
>
> JVM Runtime: Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.10 17.0.10+7
>
>
>
> We run the solr instance in a Kubernetes cluster in gcp.
>
>
>
> We have two collections but only documents in one of them right now. We
> have indexed ~70,000 tasks (one of the types of documents we index) in one
> of the collections. In total there are ~100,000 documents in this
> collection.
>
> Note that on production we still use an older solr version (8.11.2) with
> ~5,000,000 tasks and the following problem does not appear there.
>
>
>
> The collections are all set up with the _default config and use only 1
> shard each. autoAddReplicas is configured to be false, the
> replicationFactor is 1, and maxShardsPerNode is also 1.
>
> Or at least that’s how we configured the collections. In the debug
> response you will see that somehow multiple shards are at play.
>
>
>
> Now the problem:
>
> Every Task has a parent id – we call it processId. We use this processId
> to find all the tasks that belong to one process.
>
> By searching for this processId we expect to find all the tasks that
> belong to the corresponding process.
>
>
>
> For example, we have a process with the processId 20021454 (this is the
> real processId, I have chosen to show you the real number, because maybe
> this number is forbidden in solr?!).
>
> One would expect to find all the tasks that belong to this process when
> using this query: “task_coopProcessId:20021454”.
>
> We know for a fact that this process contains exactly four tasks. That’s
> also what solr returns – four tasks.
>
> But two of the tasks don’t belong to the correct process.
>
> Below is the response we get from solr (to keep the response short, I have
> included the fl parameter, to only show the important info for this problem
> description).
>
> I have also included the result when showing debug info as an attachment
> (example.json). You will need to mentally replace <insert-project-name>
> with a real project name, that I am not going to name here.
>
>
>
> {
>     "responseHeader": {
>         "zkConnected": true,
>         "status": 0,
>         "QTime": 9,
>         "params": {
>             "q": "task_coopProcessId:20021454",
>             "indent": "true",
>             "fl": "task_coopProcessId",
>             "q.op": "OR",
>             "useParams": ""
>         }
>     },
>     "response": {
>         "numFound": 4,
>         "start": 0,
>         "maxScore": 1,
>         "numFoundExact": true,
>         "docs": [
>             {
>                 "task_coopProcessId": 2008387
>             },
>             {
>                 "task_coopProcessId": 20021454
>             },
>             {
>                 "task_coopProcessId": 2008403
>             },
>             {
>                 "task_coopProcessId": 20021454
>             }
>         ]
>     }
> }
>
>
>
> With kind regards,
>
>
>
> Dario Viva
>
>
>
>
>


-- 
Sincerely yours
Mikhail Khludnev
