Way to remove FTS indexes

2019-03-19 Thread Peter Mogensen via dovecot
Hi,

I was wondering if there is any way to remove FTS indexes in order to
have them rebuilt on the next BODY search?

All the doveadm commands I can find seem to result in fully built
indexes (which is nice, if that's what you want).

/Peter


Solr connection timeout hardwired to 60s

2019-04-04 Thread Peter Mogensen via dovecot
Hi,

What's the recommended way to handle timeouts on large mailboxes, given
the hardwired request timeout of 60s in solr-connection.c:

   http_set.request_timeout_msecs = 60*1000;
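
For reference, the only workaround I can see short of a config option
would be a local one-line patch before recompiling. A sketch, assuming
the solr-connection.c context above:

   /* sketch: raise the hardwired request timeout from 60s to 10 min */
   http_set.request_timeout_msecs = 10*60*1000;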


/Peter




Re: Solr connection timeout hardwired to 60s

2019-04-04 Thread Peter Mogensen via dovecot



On 4/4/19 6:47 PM, dovecot-requ...@dovecot.org wrote:
> For a typical Solr index, 60 seconds is an eternity.  Most people aim
> for query times of 100 milliseconds or less, and they often achieve
> that goal.

I'm pretty sure I get these while indexing, not querying.

Apr 04 16:44:50 host dovecot[114690]: indexer-worker(m...@example.com):
Error: fts_solr: Indexing failed: Request timed out (Request queued
66.015 secs ago, 1 attempts in 66.005 secs, 63.146 in http ioloop, 0.000
in other ioloops, connected 94.903 secs ago)

/Peter


Re: Solr connection timeout hardwired to 60s

2019-04-10 Thread Peter Mogensen via dovecot



On 4/4/19 6:57 PM, Peter Mogensen wrote:
> 
> 
> On 4/4/19 6:47 PM, dovecot-requ...@dovecot.org wrote:
>> For a typical Solr index, 60 seconds is an eternity.  Most people aim
>> for query times of 100 milliseconds or less, and they often achieve
>> that goal.
> 
> I'm pretty sure I get these while indexing, not querying.
> 
> Apr 04 16:44:50 host dovecot[114690]: indexer-worker(m...@example.com):
> Error: fts_solr: Indexing failed: Request timed out (Request queued
> 66.015 secs ago, 1 attempts in 66.005 secs, 63.146 in http ioloop, 0.000
> in other ioloops, connected 94.903 secs ago)

Doing a TCP dump on indexing operations that consistently fail, I see a
lot of softCommits that never get an HTTP response:

==
POST /solr/dovebody/update HTTP/1.1
Host: localhost:8983
Date: Wed, 10 Apr 2019 14:22:29 GMT
Expect: 100-continue
Content-Length: 47
Connection: Keep-Alive
Content-Type: text/xml

HTTP/1.1 100 Continue

<commit softCommit="true" waitSearcher="true"/>

... in contrast to the first softCommit on the connection:


POST /solr/dovebody/update HTTP/1.1
Host: localhost:8983
Date: Wed, 10 Apr 2019 14:20:53 GMT
Expect: 100-continue
Content-Length: 47
Connection: Keep-Alive
Content-Type: text/xml

HTTP/1.1 100 Continue

<commit softCommit="true" waitSearcher="true"/>

HTTP/1.1 200 OK
Content-Type: application/xml; charset=UTF-8
Content-Length: 156

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">37</int>
</lst>
</response>

==

The missing softCommit responses seem to start right after the last
added document:
==

0

HTTP/1.1 200 OK
Content-Type: application/xml; charset=UTF-8
Content-Length: 156

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">12</int>
</lst>
</response>

POST /solr/dovebody/update HTTP/1.1
Host: localhost:8983
Date: Wed, 10 Apr 2019 14:22:29 GMT
Expect: 100-continue
Content-Length: 47
Connection: Keep-Alive
Content-Type: text/xml

HTTP/1.1 100 Continue

<commit softCommit="true" waitSearcher="true"/>
==

... and after that the rest of the TCP dump shows no responses to the
softCommit POSTs.

/Peter


Re: Solr connection timeout hardwired to 60s

2019-04-12 Thread Peter Mogensen via dovecot


Looking further at tcpdumps of the Dovecot->Solr traffic and at Solr
metrics, there doesn't seem to be anything suspicious apart from the TCP
windows running full and Dovecot backing off ... until it times out and
closes the connection.

From my understanding of how Dovecot operates towards Solr, it will
flush ~1000 documents at a time to Solr in /update requests until it has
traversed the mailbox (let's say 20,000 mails, i.e. ~20 batches), doing
a softCommit after each.

But is it really reasonable for Dovecot to expect that no request will
take Solr more than 60s to process?
It doesn't seem like my Solr can keep up with that, although it does
process documents, and it clears pending documents reasonably fast after
Dovecot closes the connection.

On the surface it looks like Dovecot is too impatient.

/Peter

On 4/10/19 6:25 PM, Peter Mogensen wrote:
> 
> 
> On 4/4/19 6:57 PM, Peter Mogensen wrote:
>>
>>
>> On 4/4/19 6:47 PM, dovecot-requ...@dovecot.org wrote:
>>> For a typical Solr index, 60 seconds is an eternity.  Most people aim
>>> for query times of 100 milliseconds or less, and they often achieve
>>> that goal.
>>
>> I'm pretty sure I get these while indexing, not querying.
>>
>> Apr 04 16:44:50 host dovecot[114690]: indexer-worker(m...@example.com):
>> Error: fts_solr: Indexing failed: Request timed out (Request queued
>> 66.015 secs ago, 1 attempts in 66.005 secs, 63.146 in http ioloop, 0.000
>> in other ioloops, connected 94.903 secs ago)
> 
> Doing a TCP dump on indexing operations which consistently fail, I see
> that there's a lot of softCommits which never get an HTTP answer:
> 
> ==
> POST /solr/dovebody/update HTTP/1.1
> Host: localhost:8983
> Date: Wed, 10 Apr 2019 14:22:29 GMT
> Expect: 100-continue
> Content-Length: 47
> Connection: Keep-Alive
> Content-Type: text/xml
> 
> HTTP/1.1 100 Continue
> 
> 
> 






Re: Solr connection timeout hardwired to 60s

2019-04-14 Thread Peter Mogensen via dovecot


Sorry... I got distracted halfway through and forgot to put a meaningful
subject on this, so the archive couldn't thread it. Resending.

On 4/14/19 4:04 PM, dovecot-requ...@dovecot.org wrote:

>> Solr ships with autoCommit set to 15 seconds and openSearcher set to
>> false on the autoCommit. The autoSoftCommit setting is not enabled by
>> default, but depending on how the index was created, Solr might try to
>> set autoSoftCommit to 3 seconds ... which is WAY too short.

I just run with the defaults: 15s autoCommit and no autoSoftCommit.
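
For reference, the stock solrconfig.xml bits being discussed look like
this (a sketch from a default Solr 7.x install; exact contents may
differ between versions):

   <autoCommit>
     <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>

   <autoSoftCommit>
     <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
   </autoSoftCommit>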

>> This thread says that dovecot is sending explicit commits.

I see explicit /update requests with softCommit and waitSearcher=true in
a tcpdump.

>> One thing
>> that might be happening to exceed 60 seconds is an extremely long
>> commit, which is usually caused by excessive cache autowarming, but
>> might be related to insufficient memory. The max heap setting on an
>> out-of-the-box Solr install (5.0 and later) is 512MB. That's VERY
>> small, and it doesn't take much index data before a much larger heap
>> is required.

I run with

SOLR_JAVA_MEM="-Xmx8g -Xms2g"

> I looked into the code (version 2.3.5.1):

This is 2.2.35. I haven't checked the source differences against 2.3.x,
I must admit.

> I imagine that one of the reasons dovecot sends softCommits is that,
> without autoindex active, and even if mailboxes are periodically indexed
> from cron, the last emails received will be indexed at the moment of the
> search.

I expect that dovecot has to, because its default behavior is to only
bring the index up to date just before a search. So it has to wait for
the index result to be available if any new mails have been indexed.

> 1) a configurable batch size would enable to tune the number of emails
> per request and help stay under the 60 seconds hard coded http request
> timeout. A configurable http timeout would be less useful, since this
> will potentially run into other timeouts on solr side.

Being able to configure it would be great, but I don't think it solves
much. I recompiled with 100 as the batch size and it still ended in
timeouts. Then I recompiled with a 10 minute timeout, and now I see all
the batches completing, with processing times mostly between 1 and 2
minutes (so all of them would have failed with the default 60s timeout).
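
For reference, the batch-size change was a one-line edit analogous to
the timeout patch mentioned earlier in the thread. Roughly (the name
here is illustrative, not the actual fts-solr identifier):

   /* illustrative name only - the real identifier in fts-solr differs */
   #define SOLR_INDEX_BATCH_SIZE 100   /* default batch is ~1000 documents */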

To me it looks like Solr just takes too long to index: 100-document
batches taking 1-2 minutes is only about 1-2 documents per second. This
is no small machine - a 20 core Intel(R) Xeon(R) Silver 4114 CPU @
2.20GHz - and for this test it's not doing anything else, so I'm a bit
surprised that even a few users take this long to index.

/Peter




Auto rebuilding of Solr indexes on settings change?

2019-04-25 Thread Peter Mogensen via dovecot
Hi,

Looking at the source, it doesn't seem like fts-solr checks for settings
changes using fts_index_have_compatible_settings() like fts-lucene does.

Is there any special reason why fts-solr shouldn't also rebuild its
indexes if the settings have changed?
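
A sketch of the kind of check I mean, modeled on what fts-lucene does
(the call site and signatures here are my assumptions, not actual
fts-solr code):

   /* sketch: compare a checksum of the current fts settings with the
    * one recorded alongside the index; rebuild on mismatch, as
    * fts-lucene effectively does */
   if (!fts_index_have_compatible_settings(list, settings_checksum)) {
           /* assumption: a rescan drops the index and triggers reindexing */
           (void)fts_backend_rescan(backend);
   }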

/Peter


dsync and altpath on shared storage.

2019-09-02 Thread Peter Mogensen via dovecot
Hi,

I was wondering...

If one had the mdbox ALT path set to a shared storage mount (say, on
NFS) and one wanted to move a mailbox to a different host... I guess it
wouldn't, in principle, be necessary to copy all the ALT storage through
dsync, when the volume could just be mounted on the new host.

Is there any way for dsync to avoid moving gigabytes of data that could
just be "moved" by moving the mount?

/Peter


Re: dsync and altpath on shared storage.

2019-09-03 Thread Peter Mogensen via dovecot



On 9/2/19 3:03 PM, Sami Ketola wrote:
>> On 2 Sep 2019, at 15.25, Peter Mogensen via dovecot  
>> wrote:
...
>> Is there any way for dsync to avoid moving gigabytes of data that could
>> just be "moved" by moving the mount?
> 
> 
> Not tested but you can probably do something like this in the target server:
> 
> doveadm backup -u victim -R ssh sudouser@old-server "sudo doveadm 
> dsync-server -o mail_location=sdbox:/location-to-your-sdbox/ -u victim"
> 
> just leave the ALT storage path out of the settings.


I'll have to test this... but my initial guess would be that doveadm
would then think the mails have disappeared. Would it then copy the
index metadata for those mails to the target host anyway?

/Peter


Re: dsync and altpath on shared storage.

2019-09-03 Thread Peter Mogensen via dovecot



On 9/3/19 2:38 PM, Sami Ketola wrote:
> 
> 
>> On 3 Sep 2019, at 15.34, Peter Mogensen via dovecot  
>> wrote:
>>
>>
>>
>> On 9/2/19 3:03 PM, Sami Ketola wrote:
>>>> On 2 Sep 2019, at 15.25, Peter Mogensen via dovecot  
>>>> wrote:
>> ...
>>>> Is there any way for dsync to avoid moving gigabytes of data that could
>>>> just be "moved" by moving the mount?
>>>
>>>
>>> Not tested but you can probably do something like this in the target server:
>>>
>>> doveadm backup -u victim -R ssh sudouser@old-server "sudo doveadm 
>>> dsync-server -o mail_location=sdbox:/location-to-your-sdbox/ -u victim"
>>>
>>> just leave the ALT storage path out of the settings.
>>
>>
>> I'll have to test this... but my initial guess would be that doveadm
>> would then think the mails have disappeared. Would it then copy the
>> index metadata for those mails to the target host anyway?
> 
> 
> Hmm. That is true. It will probably not work after all then. 
> 
> Now I'm out of ideas how to do this efficiently.

I assume it won't even work to just premount the shared storage
read-only on the target side, so the mails are already there.
... since I suppose the receiving dsync reserves the right to re-pack
the m.* storage files?

/Peter



Re: dsync and altpath on shared storage.

2019-09-04 Thread Peter Mogensen via dovecot


So... I've done some testing.

One method that seemed to work - at least for primitive cases - was to:

* Mount the ALT storage on the destination.
* Run "doveadm force-resync \*" on the destination.
  (putting all the mails in ALT storage into the dovecot.map.index)
* Run dsync from source to destination.
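
Concretely, with a user "victim" as in Sami's earlier example (a
sketch; untested beyond the primitive cases mentioned):

   # on the destination, with the ALT storage already mounted:
   doveadm force-resync -u victim '*'
   # then pull the primary-storage mails and index updates from the source:
   doveadm sync -u victim ssh sudouser@old-server \
       "sudo doveadm dsync-server -u victim"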

Of course... if there were some way to avoid step 2...

/Peter


Re: dsync and altpath on shared storage.

2019-09-05 Thread Peter Mogensen via dovecot



On 9/4/19 2:12 PM, Peter Mogensen wrote:
> 
> So... I've done some testing.
> 
> One method that seemed to work - at least for primitive cases - was to:
> 
> * Mount the ALT storage on the destination.
> * Run "doveadm force-resync \*" on the destination.
>   (putting all the mails in ALT storage into the dovecot.map.index)
> * Run dsync from source to destination.
> 
> Of course... if there was some way to avoid step 2...

So ... I have an idea.

Assuming the user's mail_location is:

mdbox:~/mdbox:ALT=/alt:INDEX=~/idx

And /alt is a shared storage mount.

Then, I suspect the following steps would make dsync avoid transferring
mails that are on shared storage:

1) Create a rudimentary mdbox on the target side (just containing the
dbox-alt-root link)

2) Mount /alt on the target host

3) Copy all dovecot.index and dovecot.map.index files in ~/idx from
source to target - that is, not the transaction (*.log) files or cache
files (see the sketch after step 4). I suppose this needs to be done
under appropriate read locking.

4) doveadm sync -u source doveadm dsync-server -u target
   ... to get the rest of the mails in primary storage and all updates
since the index files were snapshotted.
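
For step 3, something like this might do (an untested sketch; it
assumes writes are locked out while copying):

   # copy only the main index files; skip dovecot.index.log,
   # dovecot.map.index.log and dovecot.index.cache
   rsync -a --include='*/' --include='dovecot.index' \
         --include='dovecot.map.index' --exclude='*' \
         sudouser@old-server:idx/ ~/idx/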



It would be nice if there were a way to force the dovecot*index.log
files to be snapshotted into the index files.

If the aim is not to sync two different accounts but to simply move one
account from one host to a new host where it doesn't exist in advance,
are there any caveats with this?

... apart from a few missing tools.

/Peter