Hello Mikhail, (resent with a hopefully better representation of the table)

The Fieldtype is plong. Hopefully this helps us find the problem.

As I cannot send you screenshots, I will attempt to send you an ASCII
representation of what I see under the Schema endpoint in the web admin view.

Field
task_coopProcessId
Type
Plong

Field: task_coopProcessId
Field-Type:org.apache.solr.schema.LongPointField

+------------+---------+--------------+------------+------------------------------------+-------------------+
|   Flags:   | Indexed | UnInvertible | Omit Norms | Omit Terms Frequencies & Positions | Sort Missing Last |
+------------+---------+--------------+------------+------------------------------------+-------------------+
| Properties | X       | X            | X          | X                                  | X                 |
| Schema     | X       | X            | X          | X                                  | X                 |
+------------+---------+--------------+------------+------------------------------------+-------------------+
 
(View the text in a monospaced font so that the table columns line up.)

Index Analyzer:
org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer:
org.apache.solr.schema.FieldType$DefaultAnalyzer
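
A quick client-side check can confirm which returned documents carry a different value than the one queried. The following is a minimal Python sketch, assuming the response has already been parsed from JSON. The helper name mismatched_docs is hypothetical, and the sample data is copied from the response quoted further down in this thread.

```python
import json


def mismatched_docs(response, field, expected):
    """Return the docs whose `field` value differs from `expected`.

    `response` is a parsed Solr JSON response (a dict). This helper is
    hypothetical and only inspects the returned documents; it does not
    talk to Solr.
    """
    docs = response["response"]["docs"]
    return [doc for doc in docs if doc.get(field) != expected]


# Sample copied from the response quoted in this thread.
sample = json.loads("""
{"response": {"numFound": 4, "start": 0, "numFoundExact": true, "docs": [
  {"task_coopProcessId": 2008387},
  {"task_coopProcessId": 20021454},
  {"task_coopProcessId": 2008403},
  {"task_coopProcessId": 20021454}
]}}
""")

bad = mismatched_docs(sample, "task_coopProcessId", 20021454)
print(bad)
```

For the sample above, the two documents with the values 2008387 and 2008403 are reported as mismatches.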
 
with kind regards,

Dario


-----Original Message-----
From: Mikhail Khludnev <m...@apache.org>
Sent: Monday, 22 April 2024 11:34
To: users@solr.apache.org
Subject: Re: Wrong documents in Response

>
>                 "querystring": "task_coopProcessId:20021454",
>                 "parsedquery": "(task_coopProcessId:[20021454 TO
> 20021454])",


What's the field type here? How was the string parsed into a range? I suppose
it may be just a StrField.

>                         "task_46916": {
>                                 "match": false,
>                                 "value": 0,
>                                 "description":
> "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330"


Also, I don't know how non-matching docs may appear in the result.
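
One way to list the offending documents is to scan the structured explain section for entries whose match flag is false. This is a minimal Python sketch, assuming the debug output is available as a parsed dictionary. The helper name non_matching_ids is hypothetical, and the sample entries are copied from the debug output quoted below.

```python
def non_matching_ids(explain):
    """Return the sorted ids of docs that the explain section marks as non-matching.

    `explain` maps a document id to its structured explanation, as produced
    with debug.explain.structured=true. This helper is hypothetical.
    """
    return sorted(doc_id for doc_id, entry in explain.items() if not entry["match"])


# Entries copied from the "explain" section of the debug output in this thread.
explain = {
    "task_46914": {
        "match": False,
        "value": 0,
        "description": "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30378",
    },
    "task_46915": {
        "match": True,
        "value": 1,
        "description": "task_coopProcessId:[20021454 TO 20021454]",
    },
    "task_46916": {
        "match": False,
        "value": 0,
        "description": "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330",
    },
    "task_46917": {
        "match": True,
        "value": 1,
        "description": "task_coopProcessId:[20021454 TO 20021454]",
    },
}

suspects = non_matching_ids(explain)
print(suspects)
```

The ids it reports are the documents that were returned even though the explanation says they do not match the query.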

On Mon, Apr 22, 2024 at 11:16 AM <dario.v...@coop.ch> wrote:

> Sure, here it is directly in mail. Hopefully it does not get chopped.
>
> {
>         "responseHeader": {
>                 "zkConnected": true,
>                 "status": 0,
>                 "QTime": 29,
>                 "params": {
>                         "q": "task_coopProcessId:20021454",
>                         "indent": "true",
>                         "fl": "task_coopProcessId",
>                         "q.op": "OR",
>                         "debug.explain.structured": "true",
>                         "debugQuery": "true",
>                         "useParams": ""
>                 }
>         },
>         "response": {
>                 <same as in response without debug>
>         },
>         "debug": {
>                 "track": {
>                         "rid":
> "<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886",
>                         "EXECUTE_QUERY": {
>                                 
> "https://<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard1_replica_n4/":
> {
>                                         "QTime": "0",
>                                         "ElapsedTime": "11",
>                                         "RequestPurpose":
> "GET_TOP_IDS,SET_TERM_STATS",
>                                         "NumFound": "0",
>                                         "Response":
> "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
> distrib=false, debug=[false, timing, track], fl=[id, score],
> shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS,
> NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false,
> useParams=}},
> response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]},
> sort_values={}, debug={timing={time=0.0, prepare={time=0.0,
> query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}}}}"
>                                 },
>                                 
> "https://<insert-project-name>-solrcloud-2.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard2_replica_n2/":
> {
>                                         "QTime": "0",
>                                         "ElapsedTime": "13",
>                                         "RequestPurpose":
> "GET_TOP_IDS,SET_TERM_STATS",
>                                         "NumFound": "0",
>                                         "Response":
> "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
> distrib=false, debug=[false, timing, track], fl=[id, score],
> shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS,
> NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false,
> useParams=}},
> response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]},
> sort_values={}, debug={timing={time=0.0, prepare={time=0.0,
> query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}}}}"
>                                 },
>                                 
> "https://<insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/":
> {
>                                         "QTime": "0",
>                                         "ElapsedTime": "17",
>                                         "RequestPurpose":
> "GET_TOP_IDS,SET_TERM_STATS",
>                                         "NumFound": "4",
>                                         "Response":
> "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
> distrib=false, debug=[false, timing, track], fl=[id, score],
> shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS,
> NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false,
> useParams=}},
> response={numFound=4,numFoundExact=true,start=0,maxScore=1.0,docs=[SolrDocument{id=task_46914,
> score=1.0}, SolrDocument{id=task_46915, score=1.0},
> SolrDocument{id=task_46916, score=1.0}, SolrDocument{id=task_46917,
> score=1.0}]}, sort_values={}, debug={timing={time=0.0, prepare={time=0.0,
> query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}}}}"
>                                 }
>                         },
>                         "GET_FIELDS": {
>                                 
> "https://<insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/":
> {
>                                         "QTime": "1",
>                                         "ElapsedTime": "4",
>                                         "RequestPurpose":
> "GET_FIELDS,GET_DEBUG,SET_TERM_STATS",
>                                         "NumFound": "4",
>                                         "Response":
> "{responseHeader={zkConnected=true, status=0, QTime=1, params={df=_text_,
> distrib=false, debug=[timing, track], fl=[task_coopProcessId, id],
> shards.purpose=16704, q.op=OR, rows=10,
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> omitHeader=false, requestPurpose=GET_FIELDS,GET_DEBUG,SET_TERM_STATS,
> NOW=1713520858995, ids=task_46915,task_46914,task_46917,task_46916,
> isShard=true, wt=javabin, debugQuery=true, useParams=}},
> response={numFound=4,numFoundExact=true,start=0,docs=[SolrDocument{task_coopProcessId=20021454},
> SolrDocument{task_coopProcessId=2008387},
> SolrDocument{task_coopProcessId=20021454},
> SolrDocument{task_coopProcessId=2008403}]},
> debug={rawquerystring=task_coopProcessId:20021454,
> querystring=task_coopProcessId:20021454,
> parsedquery=(task_coopProcessId:[20021454 TO 20021454]),
> parsedquery_toString=task_coopProcessId:[20021454 TO 20021454],
> explain={task_46915={match=true, value=1.0,
> description=task_coopProcessId:[20021454 TO 20021454]},
> task_46914={match=false, value=0.0,
> description=task_coopProcessId:[20021454 TO 20021454] doesn't match id
> 30378}, task_46917={match=true, value=1.0,
> description=task_coopProcessId:[20021454 TO 20021454]},
> task_46916={match=false, value=0.0,
> description=task_coopProcessId:[20021454 TO 20021454] doesn't match id
> 30330}}, QParser=LuceneQParser, timing={time=1.0, prepare={time=0.0,
> query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> terms={time=0.0}, debug={time=0.0}}}}}"
>                                 }
>                         }
>                 },
>                 "timing": {
>                         "time": 1,
>                         "prepare": {
>                                 "time": 0,
>                                 "query": {
>                                         "time": 0
>                                 },
>                                 "facet": {
>                                         "time": 0
>                                 },
>                                 "facet_module": {
>                                         "time": 0
>                                 },
>                                 "mlt": {
>                                         "time": 0
>                                 },
>                                 "highlight": {
>                                         "time": 0
>                                 },
>                                 "stats": {
>                                         "time": 0
>                                 },
>                                 "expand": {
>                                         "time": 0
>                                 },
>                                 "terms": {
>                                         "time": 0
>                                 },
>                                 "debug": {
>                                         "time": 0
>                                 }
>                         },
>                         "process": {
>                                 "time": 0,
>                                 "query": {
>                                         "time": 0
>                                 },
>                                 "facet": {
>                                         "time": 0
>                                 },
>                                 "facet_module": {
>                                         "time": 0
>                                 },
>                                 "mlt": {
>                                         "time": 0
>                                 },
>                                 "highlight": {
>                                         "time": 0
>                                 },
>                                 "stats": {
>                                         "time": 0
>                                 },
>                                 "expand": {
>                                         "time": 0
>                                 },
>                                 "terms": {
>                                         "time": 0
>                                 },
>                                 "debug": {
>                                         "time": 0
>                                 }
>                         }
>                 },
>                 "rawquerystring": "task_coopProcessId:20021454",
>                 "querystring": "task_coopProcessId:20021454",
>                 "parsedquery": "(task_coopProcessId:[20021454 TO
> 20021454])",
>                 "parsedquery_toString": "task_coopProcessId:[20021454 TO
> 20021454]",
>                 "QParser": "LuceneQParser",
>                 "explain": {
>                         "task_46914": {
>                                 "match": false,
>                                 "value": 0,
>                                 "description":
> "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30378"
>                         },
>                         "task_46915": {
>                                 "match": true,
>                                 "value": 1,
>                                 "description":
> "task_coopProcessId:[20021454 TO 20021454]"
>                         },
>                         "task_46916": {
>                                 "match": false,
>                                 "value": 0,
>                                 "description":
> "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330"
>                         },
>                         "task_46917": {
>                                 "match": true,
>                                 "value": 1,
>                                 "description":
> "task_coopProcessId:[20021454 TO 20021454]"
>                         }
>                 }
>         }
> }
>
>
> -----Original Message-----
> From: Mikhail Khludnev <m...@apache.org>
> Sent: Saturday, 20 April 2024 11:36
> To: users@solr.apache.org
> Cc: solr-u...@lucene.apache.org
> Subject: Re: Wrong documents in Response
>
> Hello Dario.
> Mailing list chopped attachment, but looking into debugQuery is what we
> need here.
>
> On Fri, Apr 19, 2024 at 1:41 PM <dario.v...@coop.ch> wrote:
>
> > Hello All,
> >
> >
> >
> > We have a relatively new Solr Instance:
> >
> > solr-spec: 9.5.0
> >
> > solr-impl: 9.5.0 cdd27dd15c3a6574032e9b1b92b148ab4e383599 - gerlowskija -
> > 2024-02-07 15:10:39
> >
> >
> >
> > lucene-spec: 9.9.2
> >
> > lucene-impl: 9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25
> > 09:51:09
> >
> >
> >
> > JVM Runtime: Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.10 17.0.10+7
> >
> >
> >
> > We run the solr instance in a Kubernetes cluster in gcp.
> >
> >
> >
> > We have two collections but only documents in one of them right now. We
> > have indexed ~70,000 tasks (one of the types of documents we index) in one
> > of the collections. In total there are ~100,000 documents in this
> > collection.
> >
> > Note that on production we still use an older solr version (8.11.2) with
> > ~5,000,000 tasks and the following problem does not appear there.
> >
> >
> >
> > The collections are all set up with the _default config and only use 1
> > shard each. autoAddReplicas is also configured to be false. The
> > replicationFactor is also 1. Even the maxShardsPerNode is 1.
> >
> > Or at least that's how we configured the collections. In the debugged
> > response you will see that somehow multiple shards are at play.
> >
> >
> >
> > Now the problem:
> >
> > Every Task has a parent id - we call it processId. We use this processId
> > to find all the tasks that belong to one process.
> >
> > By searching for this processId we expect to find all the tasks that
> > belong to the corresponding process.
> >
> >
> >
> > For example, we have a process with the processId 20021454 (this is the
> > real processId, I have chosen to show you the real number, because maybe
> > this number is forbidden in solr?!).
> >
> > One would expect to find all the tasks that belong to this process when
> > using this query: "task_coopProcessId:20021454".
> >
> > We know for a fact that this process contains exactly four tasks. That's
> > also what solr returns - four tasks.
> >
> > But two of the tasks don't belong to the correct process.
> >
> > Below is the response we get from solr (to keep the response short, I
> > have included the fl parameter, to only show the important info for this
> > problem description).
> >
> > I have also included the result when showing debug info as an attachment
> > (example.json). You will need to mentally replace <insert-project-name>
> > with a real project name, that I am not going to name here.
> >
> >
> >
> > {
> >
> >     "responseHeader": {
> >
> >         "zkConnected": true,
> >
> >         "status": 0,
> >
> >         "QTime": 9,
> >
> >         "params": {
> >
> >             "q": "task_coopProcessId:20021454",
> >
> >             "indent": "true",
> >
> >             "fl": "task_coopProcessId",
> >
> >             "q.op": "OR",
> >
> >             "useParams": ""
> >
> >         }
> >
> >     },
> >
> >     "response": {
> >
> >         "numFound": 4,
> >
> >         "start": 0,
> >
> >         "maxScore": 1,
> >
> >         "numFoundExact": true,
> >
> >         "docs": [
> >
> >             {
> >
> >                 "task_coopProcessId": 2008387
> >
> >             },
> >
> >             {
> >
> >                 "task_coopProcessId": 20021454
> >
> >             },
> >
> >             {
> >
> >                 "task_coopProcessId": 2008403
> >
> >             },
> >
> >             {
> >
> >                 "task_coopProcessId": 20021454
> >
> >             }
> >
> >         ]
> >
> >     }
> >
> > }
> >
> >
> >
> > With kind regards,
> >
> >
> >
> > Dario Viva
> >
> >
> >
> >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


-- 
Sincerely yours
Mikhail Khludnev
