Hi Vincent,
thanks for your investigations!
On 01/09/21 11:27, Vincent Brillault wrote:
Dear all,
Just a status update, in case this can help others.
We went ahead and disabled position information indexing, then
re-indexed our mail data (over a couple of days to avoid
overloading the systems). Before the re-indexing we had 1.33 TiB in
our Solr indexes. After re-indexing we had only 542 GiB, a reduction
of about 60% in the storage required for our FTS indexes :)
Does this optimization also reduce the RAM requirements on the Solr server?
So far, our users haven't reported any issues or measurable differences
concerning the quality of the FTS. From further
debugging, as discussed on the solr-user mailing list
(https://lists.apache.org/thread.html/rcdf8bb97be0839e57928ad5fa34501ec8a73392c11248db91206bc33%40%3Cusers.solr.apache.org%3E),
I've come to the conclusion that, with the current integration between
Dovecot and Solr (especially the fact that `"` is escaped), it's impossible
to trigger phrase queries from user queries as long as
autoGeneratePhraseQueries is false.
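
As a rough illustration of the two settings discussed above (this is
not the attached schema, only a minimal sketch), a schema.xml fragment
might look like the following. The field name "body", the type name
"text" and the analyzer chain are assumptions made for the example;
omitPositions and autoGeneratePhraseQueries are the standard Solr
attributes.

```xml
<!-- Hypothetical fragment of a Solr schema.xml. Names and the analyzer
     chain are illustrative, not copied from the attached files. -->
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100"
           autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- omitPositions="true" stops Solr from storing term positions for
     this field. The index gets smaller, but phrase queries can no
     longer be run against the field. -->
<field name="body" type="text" indexed="true" stored="false"
       omitPositions="true"/>
```

Changing omitPositions only affects documents indexed afterwards, which
is why a full re-index is needed before the space saving shows up.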
I've attached the schema.xml and solrconfig.xml we are now using with
Solr 8.6.0, in case there is any interest from others. Let me know if
you would prefer an MR updating the XML files in
https://github.com/dovecot/core/tree/master/doc.
Do the attached schema and config files also work with Solr 7.7.0? Since
Dovecot provides a schema and config for 7.7.0, a patch based on them
would be useful for many of us.
Thanks
--
Alessio Cecchi
Postmaster @ http://www.qboxmail.it
https://www.linkedin.com/in/alessice