On 31.7.2012, at 22.46, to...@starbridge.org wrote:

>>> 21500/59363doveadm(c...@spamguard.fr): Error: fts_solr: Invalid XML
>>> input at line 1: mismatched tag
>> No idea. You can reproduce this? What does it log with this patch?
>> http://hg.dovecot.org/dovecot-2.1/rev/817b69b2b21f
>
> It happens every time on the same mailboxes (very few) around the same
> uid number (I think I can find the exact uid with strace and send the
> email message to you if it helps)
>
> catalina.out show this at this time:
>
> INFO: {} 0 1
> 31 juil. 2012 21:19:56 org.apache.solr.common.SolrException log
> GRAVE: org.apache.solr.common.SolrException: Illegal character
> ((CTRL-CHAR, code 4))
> ..
> After a quick google search , it seems related to invalid Control
> Character sent to SOLR.

So it seems, but Dovecot already has code to filter out all control
characters when sending data to Solr. I just looked through the source
and did a few tests and I couldn't get it to send a control char to
Solr.

> I've applied your last patch and the message is now:
> Error: fts_solr: Invalid XML input at 4:113: mismatched tag (near:
> <html><head><title>Apache Tomcat/6.0.35 - Rapport
> d'erreur</title><style><!--H1
> {font-family:Tahoma,Arial,sans-serif;color:white)

I don't get this either. Instead I get a clean error (if I explicitly
change the code to allow control chars):

Jul 31 23:41:14 indexer-worker(tss 16345 ): Error: fts_solr: Indexing failed: 400 Illegal character ((CTRL-CHAR, code 4)) at [row,col {unknown-source}]: [858,254]
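
For illustration only, here is a minimal sketch of the kind of filtering
discussed above: replacing characters that XML 1.0 does not allow (such
as code 4) before the text is placed in an update document. This is not
Dovecot's actual code, which is written in C. The names
strip_xml_illegal and is_well_formed are hypothetical, and the choice to
substitute a space rather than drop the character is an assumption.

```python
import re
import xml.etree.ElementTree as ET

# Anything outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_xml_illegal(text: str, replacement: str = " ") -> str:
    """Replace every character that XML 1.0 forbids (hypothetical helper)."""
    return _XML_ILLEGAL.sub(replacement, text)


def is_well_formed(doc: str) -> bool:
    """Return True if the standard-library XML parser accepts the document."""
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    body = "hello\x04world"  # contains CTRL-CHAR code 4, as in the Solr error
    print(is_well_formed("<doc>" + body + "</doc>"))
    print(is_well_formed("<doc>" + strip_xml_illegal(body) + "</doc>"))
```

Running a suspect message body through a check like this may help
confirm whether a raw control character is what the XML parser is
rejecting.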