Hi,

this is the first time I'm writing to this list, so hi to all :-)

I'm having problems querying text that contains special characters (see
https://solr.apache.org/guide/solr/latest/query-guide/standard-query-parser.html#escaping-special-charaters).
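
For reference, the escaping rule from that page can be sketched in a few lines of Python. This is only an illustration: the helper name escape_term and its keep_wildcards flag are made up for this post, and including the backslash itself in the set is an assumption (the guide uses it as the escape character). It shows what an escaped term looks like; it does not explain the results further down.

```python
# Characters the Solr guide lists as special for the standard query parser.
# "&&" and "||" are handled by escaping each "&" and "|" individually.
# The backslash is added here as an assumption, since it is the escape character.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:/\\')
WILDCARDS = set("*?")


def escape_term(term, keep_wildcards=True):
    """Backslash-escape special characters in a single query term.

    Hypothetical helper: with keep_wildcards=True, '*' and '?' are left
    untouched so they can still act as wildcards.
    """
    out = []
    for ch in term:
        if ch in SPECIAL_CHARS and not (keep_wildcards and ch in WILDCARDS):
            out.append("\\")
        out.append(ch)
    return "".join(out)


print(escape_term("ab-d*"))
print(escape_term("*b-d*"))
```

With keep_wildcards=False the wildcard characters are escaped as well, which would be the variant to use when searching for a literal "*" or "?".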

My setup:
Solr 9.6.1 running as a standalone server system under Java 21 on an
internal Linux VM (Ubuntu 24.04).

For testing purposes I created a new core "test" and uploaded a few
sample documents to it:


{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"*:*",
      "indent":"true",
      "q.op":"OR",
      "useParams":"",
      "_":"1723547755451"
    }
  },
  "response":{
    "numFound":7,
    "start":0,
    "numFoundExact":true,
    "docs":[{
      "id":"70",
      "resourcename":"beispiel.txt",
      "content_type":["text/plain; charset=windows-1252"],
      "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n  Suchtext:\r\nab_dc\r\n \n  "],
      "_version_":1801822834550374400
    },{
      "id":"71",
      "resourcename":"beispiel2.txt",
      "content_type":["text/plain; charset=windows-1252"],
      "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n  Suchtext:\r\nab-dc\r\n \n  "],
      "_version_":1801823062283255808
    },{
      "id":"72",
      "resourcename":"beispiel3.txt",
      "content_type":["text/plain; charset=windows-1252"],
      "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n  Dies ist ein langer Suchtext:\r\nab-de\r\ndef+hi\r\nkl-nop\r\n \n  "],
      "_version_":1806915982686420992
    },{
      "id":"73",
      "resourcename":"beispiel4.txt",
      "content_type":["text/plain; charset=windows-1252"],
      "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n  ab-cd\r\nde-fg\r\n \n  "],
      "_version_":1806917322172006400
    },{
      "id":"74",
      "resourcename":"beispiel2-1.txt",
      "content_type":["text/plain; charset=windows-1252"],
      "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n  Dies ist ein langer Suchtext:\r\nabedc\r\ndef+ghi\r\n \n  "],
      "_version_":1807270704395059200
    },{
      "id":"75",
      "resourcename":"beispiel2-2.txt",
      "content_type":["text/plain; charset=windows-1252"],
      "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n  Dies ist ein langer Suchtext:\r\nab-dc\r\ndefxghi\r\n \n  "],
      "_version_":1807270722296348672
    },{
      "id":"76",
      "resourcename":"beispiel2-3.txt",
      "content_type":["text/plain; charset=windows-1252"],
      "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n  Dies ist ein langer Suchtext:\r\nabedc\r\ndefxghi\r\n \n  "],
      "_version_":1807270740219658240
    }]
  }
}


The problem is that I haven't found out how to correctly search for
documents that contain a "-" when using wildcards (* and ?). Some queries
work while others don't...

The query itself is basically the same:

q=...&q.op=AND&fl=id,resourcename&sort=id+asc&start=0&rows=2147483647

and differs only in the value of "q".
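
As a minimal sketch, this is how such a request could be built in Python. The host and port in BASE_URL are assumptions (a default local installation); only the core name "test" and the parameters come from my setup above. The function name build_query_url is just for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL: default Solr port on localhost, core "test".
BASE_URL = "http://localhost:8983/solr/test/select"


def build_query_url(q):
    """Return the select URL for one value of q; all other parameters are fixed."""
    params = {
        "q": q,
        "q.op": "AND",
        "fl": "id,resourcename",
        "sort": "id asc",
        "start": 0,
        "rows": 2147483647,
    }
    # urlencode percent-encodes characters such as '*', '?' and '\'
    # and turns the space in "id asc" into '+'.
    return BASE_URL + "?" + urlencode(params)


print(build_query_url("*b?d*"))
```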

My queries:

q: *uchtex*
=> ok, 6 documents found (#70, #71, #72, #74, #75, #76)

q: uchtex*
=> ok, 0 documents found

q: Suchtex*
=> ok, 6 documents found (#70, #71, #72, #74, #75, #76)

q: b?d
=> ok, 0 documents found

q: b?d*
=> ok, 0 documents found

q: *b-d*
=> ok, 0 documents found (because "-" isn't escaped, right?)

q: *b?d*
=> not ok, only 3 documents found: #70, #74, #76
=> missing:  #71, #72, #75

q: *b*d*
=> not ok, only 3 documents found: #70, #74, #76
=> missing: #71, #72, #73, #75 (all 7 expected)

q: ?b?d?
=> not ok, only 3 documents found: #70, #74, #76
=> missing:  #71, #72, #75

q: ab*
=> ok, all 7 documents found

q: ab*d
=> not ok, 0 documents found
=> missing: #73

q: ab??d
=> not ok, 0 documents found
=> missing: #73

q: ab\-dc
=> ok, 2 documents found: #71, #75

q: ab\-d*
=> not ok, 0 documents found
=> missing: #71, #72, #75

q: ab?d*
=> not ok, 3 documents found: #70, #74, #76
=> missing: #71, #72, #75

q: *b\-d*
=> not ok, 0 documents found
=> missing: #71, #72, #75

q: *b\\-d*
=> 0 documents found
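
To make the expected results above reproducible, here is a small Python sketch of the mental model behind them: each wildcard pattern is matched, case-sensitively, against every whitespace-separated word of the raw document text. This is only an assumption about how the matching ought to behave, not a description of what Solr does internally. The DOCS mapping is the sample content with whitespace runs collapsed, and expected_ids is a made-up helper name. The escaped query ab\-d* is written as the plain pattern ab-d* here, because "-" has no special meaning in fnmatch.

```python
from fnmatch import fnmatchcase

# Sample documents from the core "test", whitespace runs collapsed.
DOCS = {
    70: "Suchtext: ab_dc",
    71: "Suchtext: ab-dc",
    72: "Dies ist ein langer Suchtext: ab-de def+hi kl-nop",
    73: "ab-cd de-fg",
    74: "Dies ist ein langer Suchtext: abedc def+ghi",
    75: "Dies ist ein langer Suchtext: ab-dc defxghi",
    76: "Dies ist ein langer Suchtext: abedc defxghi",
}


def expected_ids(pattern):
    """IDs of documents with at least one whole word matching the pattern."""
    return [
        doc_id
        for doc_id, text in sorted(DOCS.items())
        if any(fnmatchcase(word, pattern) for word in text.split())
    ]


for pattern in ["*b?d*", "*b*d*", "ab*d", "ab??d", "ab-d*", "ab?d*"]:
    print(pattern, expected_ids(pattern))
```

Under this model the lists printed for each pattern are the "found" plus "missing" documents noted above, so the gap between this model and the actual results is exactly what I don't understand.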


Can someone tell me what I'm doing wrong? Am I missing or
misunderstanding something?


Regards

Thorsten

