Re: fts_encoder

2021-02-11 Thread Joan Moreau
Created a PR: https://github.com/dovecot/core/pull/155 On 2021-02-11 13:25, Joan Moreau wrote: Hello, Checking further, and putting logs a bit everywhere in the dovecot code, the core is sending FIRST the initial document (not decoded), then SECOND the decoded version. This is really weird,

Re: fts_encoder

2021-02-11 Thread John Fawcett
On 11/02/2021 14:25, Joan Moreau wrote: > > Hello > > Checking further, and putting logs a bit everywhere in the dovecot > code, the core is sending FIRST the initial document (not decoded) > then SECOND the decoded version > > This is really weird, and the indexer then indexes a lot of binary cr

Re: fts_encoder

2021-02-11 Thread Joan Moreau
Hello, Checking further, and putting logs a bit everywhere in the dovecot code, the core is sending FIRST the initial document (not decoded), then SECOND the decoded version. This is really weird, and the indexer then indexes a lot of binary crap. I am struggling to find where in the code this

Re: fts_encoder

2021-02-09 Thread John Fawcett
On 09/02/2021 15:33, Joan Moreau wrote: > > If I place the following code in the plugin > fts_backend_xxx_update_build_more function (lucene, squat and xapian, > as solr refuses to work properly on my setup) > >         { >                 char * s = i_strdup("EMPTY"); >                 if(data !=

Re: fts_encoder

2021-02-09 Thread Joan Moreau
If I place the following code in the plugin fts_backend_xxx_update_build_more function (lucene, squat and xapian, as solr refuses to work properly on my setup) { char * s = i_strdup("EMPTY"); if(data != NULL) { i_free(s); s = i_strndup(data,20); }
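
The truncated snippet above appears to log a short preview of each chunk the backend receives. A minimal standalone sketch of that idea follows, assuming the callback is handed a byte pointer and a length. `preview_chunk` is a hypothetical helper, not part of Dovecot, and it uses plain `malloc` rather than Dovecot's `i_strdup`/`i_strndup` so that it compiles on its own. Unlike the original snippet, it replaces non-printable bytes with `.` so that raw PDF data stays readable in the log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Dovecot API): returns a heap-allocated,
 * NUL-terminated preview of at most max_len bytes of data, or the
 * string "EMPTY" when data is NULL. Non-printable bytes become '.'.
 * The caller frees the result. Returns NULL if allocation fails. */
char *preview_chunk(const unsigned char *data, size_t size, size_t max_len)
{
    static const char empty[] = "EMPTY";
    size_t n, i;
    char *s;

    assert(max_len < SIZE_MAX); /* n + 1 below must not overflow */

    if (data == NULL) {
        s = malloc(sizeof(empty));
        if (s != NULL)
            memcpy(s, empty, sizeof(empty));
        return s;
    }

    n = size < max_len ? size : max_len;
    s = malloc(n + 1);
    if (s == NULL)
        return NULL;

    for (i = 0; i < n; i++)
        s[i] = (data[i] >= 0x20 && data[i] < 0x7f) ? (char)data[i] : '.';
    s[n] = '\0';
    return s;
}
```

Logging `preview_chunk(data, size, 20)` at the top of the build function would show whether a chunk starts with `%PDF` (the undecoded attachment) or with extracted text.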

Re: fts_encoder

2021-02-09 Thread John Fawcett
On 08/02/2021 23:05, Stuart Henderson wrote: > On 2021/02/08 21:33, Joan Moreau wrote: >> Yes, once again: output of the decoder is fine, I also put log inside the >> dovecot core to >> check whether data is properly transmitted, and result is that it is (i.e. >> dovecot core >> receives the pro

Re: fts_encoder

2021-02-08 Thread Joan Moreau
Yes, once again: output of the decoder is fine, I also put log inside the dovecot core to check whether data is properly transmitted, and result is that it is (i.e. dovecot core receives the proper output of pdftotext via the decoder). Now, that data is /not/ the one sent from dovecot core

Re: fts_encoder

2021-02-08 Thread Joan Moreau
Yes, once again: output of the decoder is fine, I also put log inside the dovecot core to check whether data is properly transmitted, and it is (i.e. dovecot core receives the proper output of pdftotext via the decoder). Now, that data is /not/ the one sent from dovecot core to the fts plug

Re: fts_encoder

2021-02-08 Thread Joan Moreau
Well, in the function xxx_build_more of FTS plugin, the data received is the original PDF, not the output of pdftotext. Can you clarify where you put your log in the solr plugin, so I can check the situation in the xapian plugin? On 2021-02-08 17:34, John Fawcett wrote: On 08/02/2021 15

Re: fts_encoder

2021-02-08 Thread Stuart Henderson
On 2021/02/08 21:33, Joan Moreau wrote: > Yes, once again: output of the decoder is fine, I also put log inside the > dovecot core to > check whether data is properly transmitted, and result is that it is (i.e. > dovecot core > receives the proper output of pdftotext via the decoder > > Now, th

Re: fts_encoder

2021-02-08 Thread John Fawcett
On 08/02/2021 21:35, Joan Moreau wrote: > > Well, in the function xxx_build_more of FTS plugin, the data received > is the original PDF, not the output of pdftotext > > Can you clarify where you put your log in the solr plugin, so I > can check the situation in the xapian plugin? > I used the

Re: fts_encoder

2021-02-08 Thread Stuart Henderson
On 2021-02-08, Joan Moreau wrote: > Well, in the function xxx_build_more of FTS plugin, the data received is > the original PDF, not the output of pdftotext > > Can you clarify where you put your log in the solr plugin, so I can > check the situation in the xapian plugin? The log is partic

Re: fts_encoder

2021-02-08 Thread Joan Moreau
Well, thank you for the answer, but the actual issue is that data sent by the decoder (stipulated in the conf file) is properly collected by dovecot core, but /not/ sent to the plugin : the plugin receives the original data. This is not linked to a particular plugin (xapian, solr, squat, etc..

Re: fts_encoder

2021-02-08 Thread John Fawcett
On 08/02/2021 15:22, Joan Moreau wrote: > > Well, thank you for the answer, but the actual issue is that data sent > by the decoder (stipulated in the conf file) is properly collected by > dovecot core, but /not/ sent to the plugin : the plugin receives the > original data. > > This is not linke

Re: fts_encoder

2021-02-07 Thread John Fawcett
On 07/02/2021 18:51, Joan Moreau wrote: > > more info: the function fts_parser_script_more in > plugins/fts/fts-parser.c properly reads the output of the script > > still, the data is not sent to the FTS plugins (xapian or any other) > > > > On 2021-02-07 17:37, Joan Moreau wrote: > >> more info :

Re: fts_encoder

2021-02-07 Thread Joan Moreau
more info: the function fts_parser_script_more in plugins/fts/fts-parser.c properly reads the output of the script. Still, the data is not sent to the FTS plugins (xapian or any other) On 2021-02-07 17:37, Joan Moreau wrote: more info: I am running dovecot git version On 2021-02-07 17:15, Jo

Re: fts_encoder

2021-02-07 Thread Joan Moreau
more info: I am running dovecot git version On 2021-02-07 17:15, Joan Moreau wrote: a bit more on this, adding log in the decode2text.sh, I can see that pdftotext outputs the right data, but that data is /not/ transmitted to the fts plugin for indexing (only the original pdf code is) On 2021

Re: fts_encoder

2021-02-07 Thread Joan Moreau
a bit more on this, adding log in the decode2text.sh, I can see that pdftotext outputs the right data, but that data is /not/ transmitted to the fts plugin for indexing (only the original pdf code is) On 2021-02-07 17:00, Joan Moreau wrote: Hello, I am trying to deal properly with email attac