Points are a somewhat specific thing. Why not start from a StrField?
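
A minimal sketch of what trying a StrField could look like. It assumes the collection is named workflow (inferred from the core names in the debug output below) and uses a hypothetical string copy of the field called task_coopProcessId_s. Neither the new field name nor the copyField approach comes from this thread; they are only illustrations. The code builds the Schema API request body and the query string, and does not send anything to Solr.

```python
import json

# Assumptions (not from the thread): the new field name and the use of a
# copyField so that the existing plong field stays untouched.
SOURCE_FIELD = "task_coopProcessId"      # field name taken from the thread
STRING_FIELD = "task_coopProcessId_s"    # hypothetical name for the StrField copy


def build_schema_commands(source: str, dest: str) -> dict:
    """Build a Schema API body that adds a string field and copies source into it."""
    return {
        "add-field": {
            "name": dest,
            "type": "string",   # "string" is the StrField type in the _default configset
            "indexed": True,
            "stored": True,
        },
        "add-copy-field": {"source": source, "dest": dest},
    }


def build_term_query(field: str, value: int) -> str:
    """Build the q parameter for an exact match on the given field."""
    return f"{field}:{value}"


if __name__ == "__main__":
    # Body that could be POSTed to <solr-base-url>/solr/workflow/schema
    # (the collection name in this URL is an assumption).
    print(json.dumps(build_schema_commands(SOURCE_FIELD, STRING_FIELD), indent=2))
    print(build_term_query(STRING_FIELD, 20021454))
```

Existing documents would have to be reindexed before the copied field is populated. After that, the same debugQuery comparison could be repeated against the string field to see whether the non-matching documents still show up.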

On Mon, Apr 22, 2024 at 6:02 PM <dario.v...@coop.ch> wrote:

> Hello Mikhail, (resent with hopefully better representation of table)
>
> The field type is plong. Hopefully this helps us find the problem.
>
> As I cannot send you screenshots, I will attempt to send you an ASCII
> representation of what I see under the Schema endpoint in the web admin
> view.
>
> Field
> task_coopProcessId
> Type
> Plong
>
> Field: task_coopProcessId
> Field-Type:org.apache.solr.schema.LongPointField
>
>
> +------------+---------+--------------+------------+------------------------------------+-------------------+
> | Flags:     | Indexed | UnInvertible | Omit Norms | Omit Terms Frequencies & Positions | Sort Missing Last |
> +------------+---------+--------------+------------+------------------------------------+-------------------+
> | Properties | X       | X            | X          | X                                  | X                 |
> | Schema     | X       | X            | X          | X                                  | X                 |
> +------------+---------+--------------+------------+------------------------------------+-------------------+
>
> (View the text in a monospaced font so the table is not confusing.)
>
> Index Analyzer:
> org.apache.solr.schema.FieldType$DefaultAnalyzer
> Query Analyzer:
> org.apache.solr.schema.FieldType$DefaultAnalyzer
>
> with kind regards,
>
> Dario
>
>
> -----Original Message-----
> From: Mikhail Khludnev <m...@apache.org>
> Sent: Monday, 22 April 2024 11:34
> To: users@solr.apache.org
> Subject: Re: Wrong documents in Response
>
> >
> >                 "querystring": "task_coopProcessId:20021454",
> >                 "parsedquery": "(task_coopProcessId:[20021454 TO
> > 20021454])",
>
>
> What's the field type here? How was the string parsed into a range? I
> suppose it may be just a StrField.
>
>  "task_46916": {
> >                                 "match": false,
> >                                 "value": 0,
> >                                 "description":
> > "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330"
>
>
> Also, I don't know how non-matching docs may appear in the result.
>
> On Mon, Apr 22, 2024 at 11:16 AM <dario.v...@coop.ch> wrote:
>
> > Sure, here it is directly in mail. Hopefully it does not get chopped.
> >
> > {
> >         "responseHeader": {
> >                 "zkConnected": true,
> >                 "status": 0,
> >                 "QTime": 29,
> >                 "params": {
> >                         "q": "task_coopProcessId:20021454",
> >                         "indent": "true",
> >                         "fl": "task_coopProcessId",
> >                         "q.op": "OR",
> >                         "debug.explain.structured": "true",
> >                         "debugQuery": "true",
> >                         "useParams": ""
> >                 }
> >         },
> >         "response": {
> >                 <same as in response without debug>
> >         },
> >         "debug": {
> >                 "track": {
> >                         "rid":
> >
> "<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886",
> >                         "EXECUTE_QUERY": {
> >                                 "https://
> <insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard1_replica_n4/":
> > {
> >                                         "QTime": "0",
> >                                         "ElapsedTime": "11",
> >                                         "RequestPurpose":
> > "GET_TOP_IDS,SET_TERM_STATS",
> >                                         "NumFound": "0",
> >                                         "Response":
> > "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
> > distrib=false, debug=[false, timing, track], fl=[id, score],
> > shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
> >
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> > omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS,
> > NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false,
> > useParams=}},
> > response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]},
> > sort_values={}, debug={timing={time=0.0, prepare={time=0.0,
> > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0},
> expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}}}}"
> >                                 },
> >                                 "https://
> <insert-project-name>-solrcloud-2.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard2_replica_n2/":
> > {
> >                                         "QTime": "0",
> >                                         "ElapsedTime": "13",
> >                                         "RequestPurpose":
> > "GET_TOP_IDS,SET_TERM_STATS",
> >                                         "NumFound": "0",
> >                                         "Response":
> > "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
> > distrib=false, debug=[false, timing, track], fl=[id, score],
> > shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
> >
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> > omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS,
> > NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false,
> > useParams=}},
> > response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]},
> > sort_values={}, debug={timing={time=0.0, prepare={time=0.0,
> > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0},
> expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}}}}"
> >                                 },
> >                                 "https://
> <insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/":
> > {
> >                                         "QTime": "0",
> >                                         "ElapsedTime": "17",
> >                                         "RequestPurpose":
> > "GET_TOP_IDS,SET_TERM_STATS",
> >                                         "NumFound": "4",
> >                                         "Response":
> > "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
> > distrib=false, debug=[false, timing, track], fl=[id, score],
> > shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
> >
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> > omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS,
> > NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false,
> > useParams=}},
> >
> response={numFound=4,numFoundExact=true,start=0,maxScore=1.0,docs=[SolrDocument{id=task_46914,
> > score=1.0}, SolrDocument{id=task_46915, score=1.0},
> > SolrDocument{id=task_46916, score=1.0}, SolrDocument{id=task_46917,
> > score=1.0}]}, sort_values={}, debug={timing={time=0.0, prepare={time=0.0,
> > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0},
> expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}}}}"
> >                                 }
> >                         },
> >                         "GET_FIELDS": {
> >                                 "https://
> <insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/":
> > {
> >                                         "QTime": "1",
> >                                         "ElapsedTime": "4",
> >                                         "RequestPurpose":
> > "GET_FIELDS,GET_DEBUG,SET_TERM_STATS",
> >                                         "NumFound": "4",
> >                                         "Response":
> > "{responseHeader={zkConnected=true, status=0, QTime=1, params={df=_text_,
> > distrib=false, debug=[timing, track], fl=[task_coopProcessId, id],
> > shards.purpose=16704, q.op=OR, rows=10,
> >
> rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
> > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
> > omitHeader=false, requestPurpose=GET_FIELDS,GET_DEBUG,SET_TERM_STATS,
> > NOW=1713520858995, ids=task_46915,task_46914,task_46917,task_46916,
> > isShard=true, wt=javabin, debugQuery=true, useParams=}},
> >
> response={numFound=4,numFoundExact=true,start=0,docs=[SolrDocument{task_coopProcessId=20021454},
> > SolrDocument{task_coopProcessId=2008387},
> > SolrDocument{task_coopProcessId=20021454},
> > SolrDocument{task_coopProcessId=2008403}]},
> > debug={rawquerystring=task_coopProcessId:20021454,
> > querystring=task_coopProcessId:20021454,
> > parsedquery=(task_coopProcessId:[20021454 TO 20021454]),
> > parsedquery_toString=task_coopProcessId:[20021454 TO 20021454],
> > explain={task_46915={match=true, value=1.0,
> > description=task_coopProcessId:[20021454 TO 20021454]},
> > task_46914={match=false, value=0.0,
> > description=task_coopProcessId:[20021454 TO 20021454] doesn't match id
> > 30378}, task_46917={match=true, value=1.0,
> > description=task_coopProcessId:[20021454 TO 20021454]},
> > task_46916={match=false, value=0.0,
> > description=task_coopProcessId:[20021454 TO 20021454] doesn't match id
> > 30330}}, QParser=LuceneQParser, timing={time=1.0, prepare={time=0.0,
> > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
> > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0},
> expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
> > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
> > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
> > terms={time=0.0}, debug={time=0.0}}}}}"
> >                                 }
> >                         }
> >                 },
> >                 "timing": {
> >                         "time": 1,
> >                         "prepare": {
> >                                 "time": 0,
> >                                 "query": {
> >                                         "time": 0
> >                                 },
> >                                 "facet": {
> >                                         "time": 0
> >                                 },
> >                                 "facet_module": {
> >                                         "time": 0
> >                                 },
> >                                 "mlt": {
> >                                         "time": 0
> >                                 },
> >                                 "highlight": {
> >                                         "time": 0
> >                                 },
> >                                 "stats": {
> >                                         "time": 0
> >                                 },
> >                                 "expand": {
> >                                         "time": 0
> >                                 },
> >                                 "terms": {
> >                                         "time": 0
> >                                 },
> >                                 "debug": {
> >                                         "time": 0
> >                                 }
> >                         },
> >                         "process": {
> >                                 "time": 0,
> >                                 "query": {
> >                                         "time": 0
> >                                 },
> >                                 "facet": {
> >                                         "time": 0
> >                                 },
> >                                 "facet_module": {
> >                                         "time": 0
> >                                 },
> >                                 "mlt": {
> >                                         "time": 0
> >                                 },
> >                                 "highlight": {
> >                                         "time": 0
> >                                 },
> >                                 "stats": {
> >                                         "time": 0
> >                                 },
> >                                 "expand": {
> >                                         "time": 0
> >                                 },
> >                                 "terms": {
> >                                         "time": 0
> >                                 },
> >                                 "debug": {
> >                                         "time": 0
> >                                 }
> >                         }
> >                 },
> >                 "rawquerystring": "task_coopProcessId:20021454",
> >                 "querystring": "task_coopProcessId:20021454",
> >                 "parsedquery": "(task_coopProcessId:[20021454 TO
> > 20021454])",
> >                 "parsedquery_toString": "task_coopProcessId:[20021454 TO
> > 20021454]",
> >                 "QParser": "LuceneQParser",
> >                 "explain": {
> >                         "task_46914": {
> >                                 "match": false,
> >                                 "value": 0,
> >                                 "description":
> > "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30378"
> >                         },
> >                         "task_46915": {
> >                                 "match": true,
> >                                 "value": 1,
> >                                 "description":
> > "task_coopProcessId:[20021454 TO 20021454]"
> >                         },
> >                         "task_46916": {
> >                                 "match": false,
> >                                 "value": 0,
> >                                 "description":
> > "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330"
> >                         },
> >                         "task_46917": {
> >                                 "match": true,
> >                                 "value": 1,
> >                                 "description":
> > "task_coopProcessId:[20021454 TO 20021454]"
> >                         }
> >                 }
> >         }
> > }
> >
> >
> > -----Original Message-----
> > From: Mikhail Khludnev <m...@apache.org>
> > Sent: Saturday, 20 April 2024 11:36
> > To: users@solr.apache.org
> > Cc: solr-u...@lucene.apache.org
> > Subject: Re: Wrong documents in Response
> >
> >
> > Hello Dario.
> > The mailing list stripped the attachment, but looking into the
> > debugQuery output is what we need here.
> >
> > On Fri, Apr 19, 2024 at 1:41 PM <dario.v...@coop.ch> wrote:
> >
> > > Hello All,
> > >
> > >
> > >
> > > We have a relatively new Solr Instance:
> > >
> > > solr-spec: 9.5.0
> > >
> > > solr-impl: 9.5.0 cdd27dd15c3a6574032e9b1b92b148ab4e383599 -
> gerlowskija -
> > > 2024-02-07 15:10:39
> > >
> > >
> > >
> > > lucene-spec: 9.9.2
> > >
> > > lucene-impl: 9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c -
> 2024-01-25
> > > 09:51:09
> > >
> > >
> > >
> > > JVM Runtime: Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.10
> 17.0.10+7
> > >
> > >
> > >
> > > We run the Solr instance in a Kubernetes cluster in GCP.
> > >
> > >
> > >
> > > We have two collections but only documents in one of them right now. We
> > > have indexed ~70,000 tasks (one of the types of documents we index) in
> > > one of the collections. In total there are ~100,000 documents in this
> > > collection.
> > >
> > > Note that on production we still use an older Solr version (8.11.2)
> > > with ~5,000,000 tasks and the following problem does not appear there.
> > >
> > >
> > >
> > > The collections are all set up with the _default config and only use 1
> > > shard each. autoAddReplicas is also configured to be false. The
> > > replicationFactor is also 1. Even the maxShardsPerNode is 1.
> > >
> > > Or at least that's how we configured the collections. In the debug
> > > response you will see that somehow multiple shards are at play.
> > >
> > >
> > >
> > > Now the problem:
> > >
> > > Every Task has a parent id - we call it processId. We use this
> processId
> > > to find all the tasks that belong to one process.
> > >
> > > By searching for this processId we expect to find all the tasks that
> > > belong to the corresponding process.
> > >
> > >
> > >
> > > For example, we have a process with the processId 20021454 (this is the
> > > real processId, I have chosen to show you the real number, because
> maybe
> > > this number is forbidden in solr?!).
> > >
> > > One would expect to find all the tasks that belong to this process when
> > > using this query: "task_coopProcessId:20021454".
> > >
> > > We know for a fact that this process contains exactly four tasks.
> That's
> > > also what solr returns - four tasks.
> > >
> > > But two of the tasks don't belong to the correct process.
> > >
> > > Below is the response we get from solr (to keep the response short, I
> > have
> > > included the fl parameter, to only show the important info for this
> > problem
> > > description).
> > >
> > > I have also included the result when showing debug info as an
> attachment
> > > (example.json). You will need to mentally replace <insert-project-name>
> > > with a real project name, that I am not going to name here.
> > >
> > >
> > >
> > > {
> > >     "responseHeader": {
> > >         "zkConnected": true,
> > >         "status": 0,
> > >         "QTime": 9,
> > >         "params": {
> > >             "q": "task_coopProcessId:20021454",
> > >             "indent": "true",
> > >             "fl": "task_coopProcessId",
> > >             "q.op": "OR",
> > >             "useParams": ""
> > >         }
> > >     },
> > >     "response": {
> > >         "numFound": 4,
> > >         "start": 0,
> > >         "maxScore": 1,
> > >         "numFoundExact": true,
> > >         "docs": [
> > >             {
> > >                 "task_coopProcessId": 2008387
> > >             },
> > >             {
> > >                 "task_coopProcessId": 20021454
> > >             },
> > >             {
> > >                 "task_coopProcessId": 2008403
> > >             },
> > >             {
> > >                 "task_coopProcessId": 20021454
> > >             }
> > >         ]
> > >     }
> > > }
> > >
> > >
> > >
> > > With kind regards,
> > >
> > >
> > >
> > > Dario Viva
> > >
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


-- 
Sincerely yours
Mikhail Khludnev
