Hi,

we're running Dovecot 2.1.7 together with Solr for efficient full-text search.

A couple of days ago we reinstalled our Solr server on a new machine. After adjusting our Dovecot setup to use the new server, it took us a few days to notice that something is wrong with our full-text search: expected hits are missing from the search results.

For instance, one of the folders (a shared, read-only folder which is basically a mailing list archive) with about 210k messages has a plaintext mail containing the text 'Amman'. However, logging into the IMAP server and issuing a

  . SEARCH TEXT Amman

in the folder doesn't yield any hits. This seems to happen for older mails only: trying other keywords, we got hits in recent mails but not in older ones. Could some caching related to the old Solr server be causing this?

Debugging this further, I noticed that the above IMAP command shows this in the Solr log files:

INFO: [] webapp=/solr path=/select params={fl=uid,score&sort=uid+asc&q=(hdr:"Amman"+OR+body:"Amman")&fq=%2Bbox:b68ece09e22fb9502d34010017227a26+%2Buser:""&rows=209392} hits=0 status=0 QTime=229

And indeed, something like

$ curl 'http://indexer:8080/solr/select?fl=uid,score&sort=uid+asc&q=(hdr:"Amman"+OR+body:"Amman")&fq=%2Bbox:b68ece09e22fb9502d34010017227a26+%2Buser:""&rows=209392'

yields no results. However, if I remove the 'fq=' part from the query, I get a bunch of hits. Alas, I don't know whether those hits are the expected ones.
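To narrow this down, one thing I could try is applying each filter clause on its own and asking Solr which 'box' values the unfiltered hits actually carry. The following is only a rough sketch: the host, field names and box ID are copied from the log line above, the helper names build_url and facet_url are made up for illustration, and the script only prints the URLs rather than contacting the server. It assumes 'box' is an indexed field, so that Solr's standard facet parameters can be used on it.

```shell
#!/bin/sh
# Values copied from the Solr log line above; adjust for your own setup.
SOLR_URL='http://indexer:8080/solr/select'
QUERY='(hdr:"Amman"+OR+body:"Amman")'
BOX='b68ece09e22fb9502d34010017227a26'

# Hypothetical helper: print a count-only query (rows=0) with a single
# filter clause. $1 is the already URL-encoded fq value.
build_url() {
    printf '%s?q=%s&fq=%s&rows=0\n' "$SOLR_URL" "$QUERY" "$1"
}

# Hypothetical helper: print an unfiltered query that facets on the
# 'box' field, listing each box value that occurs among the hits.
facet_url() {
    printf '%s?q=%s&rows=0&facet=true&facet.field=box&facet.mincount=1\n' \
        "$SOLR_URL" "$QUERY"
}

build_url "%2Bbox:$BOX"    # box filter only
build_url '%2Buser:""'     # user filter only
facet_url                  # which box values do the hits have?
```

Each printed URL can be passed to curl in single quotes, like the command above. Comparing the numFound values of the first two responses should show which clause removes the hits, and the facet response should show whether the hits are stored under a different 'box' value than the one Dovecot now asks for.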

Does anybody have an idea what might cause this, or what the meaning of that 'box' checksum is?

--
Frerich Raabe - ra...@froglogic.com
www.froglogic.com - Multi-Platform GUI Testing
