It seems that this system is overloaded with active connections.

>2016-10-05 11:47:41 [Main_Thread] Info: Main_Thread freed by interrupted
Worker_3 in 2.972 seconds - got (ok)

Even this time value is ten times higher than expected.

>2016-10-05 11:47:41 [Worker_3] Info: Worker_3 freed Main_Thread - 170

The '170' is the filehandle number in Perl (counted from 1.....), which
implies 70 or more active connections.

>2016-10-05 11:47:38 [Main_Thread] Info: Main_Thread will wait (max 30 s)
for the answer of Worker_3 which handles 12 sockets

This implies the same, provided the connections are balanced equally
across all seven workers (12 sockets x 7 workers = 84 connections).

>2016-10-05 11:47:43 [Worker_3] 109.168.50.75 disconnected:
session:7FF3AA6E9A28 109.168.50.75 - processing time 2 seconds
>2016-10-05 11:47:43 [Worker_3] SC-Time Worker_3: 0.103252172470093
>2016-10-05 11:47:43 [Worker_3] SC-Time Worker_3: 0.0278699398040771
>2016-10-05 11:47:43 [Worker_3] SC-Time Worker_3: 0.0423040390014648

The processing time itself seems to be normal. SC-Time is the time
required to process one loop (read + queue, or read + check + queue) for
one active connection.

Assuming an average lifetime of 4 seconds per connection, the workload
would be about 1.5 million connections per day (70 / 4 * 3600 * 24), based
on the provided output.
Configured in ISP mode, the average maximum for a single assp instance is
800,000 connections a day. To be able to handle workload peaks, the
configuration should not allow more than 400,000 connections a day.
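
The estimate above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the function name and constants are made up for illustration, and the 4 second lifetime is the same assumption as above, not a measured value.

```python
SECONDS_PER_DAY = 3600 * 24

# Limits quoted above for a single assp instance in ISP mode.
ISP_MODE_MAX_PER_DAY = 800_000
RECOMMENDED_MAX_PER_DAY = 400_000


def connections_per_day(concurrent, avg_lifetime_s):
    """Daily connections implied by a steady number of concurrent
    connections that each live avg_lifetime_s seconds."""
    return concurrent / avg_lifetime_s * SECONDS_PER_DAY


# 70 connections: derived from filehandle number 170 in the log.
# 84 connections: 12 sockets on Worker_3 times seven workers.
estimates = {
    "from filehandle (70)": connections_per_day(70, 4),
    "from worker sockets (12 x 7)": connections_per_day(12 * 7, 4),
}

for label, per_day in estimates.items():
    ratio = per_day / RECOMMENDED_MAX_PER_DAY
    print(f"{label}: {per_day:,.0f} per day, "
          f"{ratio:.1f}x the recommended maximum")
```

Both estimates come out well above the 800,000 figure, so the conclusion does not depend on which of the two connection counts is closer to reality.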

ISP mode:
- 16 CPU cores
- 16 Workers
- 24 GB RAM
- dedicated systems for MTA, enterprise database, ClamAV
- HMM and spamDB held unshared in RAM
- high performance DNS servers
- limited DNS usage (disable the most DNS-intensive checks, e.g. URIBL)
- no OCR
- no DCC
- no Razor2

>2016-10-05 11:48:13 [Main_Thread] Info: Main_Thread freed by interrupted
Worker_3 in 31.940 seconds - got (ok)

This time value is near the end of the line. At this point, a small number
of additional connections can lead to an assp shutdown, because the
workers no longer accept new connections.
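
To find such events without reading the whole maillog.txt, a small script along these lines might help. It is a sketch, not part of assp: it assumes each log entry is on a single line in the real file (they are only wrapped in this mail), the 0.3 s "expected" value is my rough figure derived from "ten times higher than expected", and the function name is hypothetical.

```python
import re

# Matches the Main_Thread message shown in the diagnostic log excerpts.
FREED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[Main_Thread\] Info: "
    r"Main_Thread freed by interrupted (?P<worker>Worker_\d+) "
    r"in (?P<secs>\d+(?:\.\d+)?) seconds"
)

MAX_WAIT_S = 30.0  # "will wait (max 30 s)" from the log
EXPECTED_S = 0.3   # assumption: roughly a tenth of the 2.972 s seen above


def slow_handoffs(lines, expected_s=EXPECTED_S):
    """Return (timestamp, worker, seconds, status) for every hand-off
    that took longer than expected_s."""
    found = []
    for line in lines:
        match = FREED_RE.match(line.strip())
        if match is None:
            continue
        secs = float(match["secs"])
        if secs > expected_s:
            status = "over max wait" if secs >= MAX_WAIT_S else "slow"
            found.append((match["ts"], match["worker"], secs, status))
    return found


sample = [
    "2016-10-05 11:47:41 [Main_Thread] Info: Main_Thread freed by "
    "interrupted Worker_3 in 2.972 seconds - got (ok)",
    "2016-10-05 11:47:41 [Worker_3] Info: Worker_3 freed Main_Thread - 170",
    "2016-10-05 11:48:13 [Main_Thread] Info: Main_Thread freed by "
    "interrupted Worker_3 in 31.940 seconds - got (ok)",
]

for row in slow_handoffs(sample):
    print(row)
```

For a real log, pass the open file object instead of `sample`; only the "freed by interrupted" lines are examined.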

Thomas



From:    cw <colin.war...@gmail.com>
To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
Date:    05.10.2016 13:15
Subject: Re: [Assp-test] unable to detect any running worker



Thanks.

I've had both servers come up against unable to detect any running worker
since clearing out all the files suggested. So I'm getting 16279 running
now.

I noticed the startup with those files removed was really quick, starting
back up the second time took several minutes so presumably reading those
files during startup takes a little while versus creating them fresh.

I've caught one already and traced it through the new logs:

2016-10-05 11:47:38 [Main_Thread] Info: Main_Thread got connection request
2016-10-05 11:47:38 [Main_Thread] Info: Main_Thread looks up the best
Worker for new connection - 73
2016-10-05 11:47:38 [Main_Thread] Info: try to interrupt worker Worker_3
(12) for new connection
2016-10-05 11:47:38 [Main_Thread] Info: Main_Thread interrupted Worker_3
(12) to submit the connection
2016-10-05 11:47:38 [Main_Thread] Info: Main_Thread will wait (max 30 s)
for the answer of Worker_3 which handles 12 sockets
2016-10-05 11:47:41 [Worker_3] SC-Time Worker_3: 0.0462169647216797
2016-10-05 11:47:41 [Worker_3] Info: Worker_3 got connection from
MainThread - 73/73
2016-10-05 11:47:41 [Worker_3] Info: Worker_3 freed Main_Thread - 170
2016-10-05 11:47:41 [Main_Thread] Info: Main_Thread freed by interrupted
Worker_3 in 2.972 seconds - got (ok)
2016-10-05 11:47:41 [Worker_3] Connected: session:7FF3AA6E9A28
109.168.50.75:41612 > 92.63.138.65:25 > 127.0.0.1:125
2016-10-05 11:47:43 [Worker_3] SC-Time Worker_3: 0.103252172470093
2016-10-05 11:47:43 [Worker_3] SC-Time Worker_3: 0.0278699398040771
2016-10-05 11:47:43 [Worker_3] 109.168.50.75 [SMTP Reply] 220
mail2.smtphost.co.uk ESMTP Exim 4.86_2 Ubuntu Wed, 05 Oct 2016 11:47:41
+0100
2016-10-05 11:47:43 [Worker_3] SC-Time Worker_3: 0.0423040390014648
2016-10-05 11:47:43 [Worker_3] 109.168.50.75 SC-Time Worker_3:
0.0296478271484375
2016-10-05 11:47:43 [Worker_3] 109.168.50.75 disconnected:
session:7FF3AA6E9A28 109.168.50.75 - processing time 2 seconds
2016-10-05 11:48:13 [Main_Thread] Info: Main_Thread freed by interrupted
Worker_3 in 31.940 seconds - got (ok)

In this case, it looks like the connection ended without any actual data.
There is lots of activity reported by Worker_3 in between 11:47:43 and
11:48:13 but this all pertains to other connections that were already in
progress.

That is a bit different to the earlier one where a message was received
and the message completed within the 30s window.

I'm not seeing anything to help me figure out why though and I don't want
to simply post a big excerpt of the maillog.txt.

On Wed, Oct 5, 2016 at 11:02 AM, Thomas Eckardt
<thomas.ecka...@thockar.com> wrote:

> >I looked at SF but only see 16275 in test (updated 3 days ago) so maybe
> it hasn't made its way live yet.
>
> Sorry, my background CVS sync was not running - update is done.
>
> Thomas
>
>
>
>
> From:    cw <colin.war...@gmail.com>
> To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
> Date:    05.10.2016 11:48
> Subject: Re: [Assp-test] unable to detect any running worker
>
>
>
> Thank you Thomas.
>
> useDB4IntCache - already set to off
> I've set WorkerLog to diagnostic and done the other steps.
> I don't have anything in CorrectASSPcfg.pm so as part of the
> troubleshooting I have previously deleted it and downloaded a fresh copy
> from SourceForge.
>
> I have to say that ASSP started up within seconds after clearing those
> files out - it normally takes several minutes and has always done.
>
> I looked at SF but only see 16275 in test (updated 3 days ago) so maybe
> it hasn't made its way live yet.
>
> On Wed, Oct 5, 2016 at 10:33 AM, Thomas Eckardt
> <thomas.ecka...@thockar.com> wrote:
>
> > I've provided an updated assp.pl (2.5.4 16279) in CVS /test. This
> > version shows some more information if 'WorkerLog' is set to
> > diagnostic.
> >
> > Thomas
> >
> >
> >
> >
> >
> > From:    cw <colin.war...@gmail.com>
> > To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
> > Date:    05.10.2016 10:10
> > Subject: Re: [Assp-test] unable to detect any running worker
> >
> >
> >
> > Hi Thomas,
> >
> > Thanks for chipping in. All modules are installed by running the latest
> > mod_inst.pl. Crypt::GOST on Ubuntu actually has a bug in it so it
> > requires a minor edit to the code to get it to build. So that module was
> > installed by switching to /root/.cpan/build/Crypt-GOST-x-x-x and running:
> > make clean
> > perl Makefile.PL
> > make
> > make test
> > make install
> >
> > I run cpan-outdated -p|cpanm from time to time in order to keep modules
> > up to date as well so things don't stay stuck on old versions.
> >
> > Everything that is installed now is a completely fresh build as the
> > upgrade to 16.04 replaced perl 5.18 with 5.22 and all the modules
> > therefore had to be installed from scratch.
> >
> > I bypassed the issue by truncating the tables and setting up the users
> > again. With there only being one user it was easier to do that than
> > muck about with it - especially with the other bigger issue at hand.
> >
> > On Wed, Oct 5, 2016 at 8:59 AM, Thomas Eckardt
> > <thomas.ecka...@thockar.com> wrote:
> >
> > > >to replace the corrupted encrypted
> > > strings with the correct values
> > >
> > > Was the 'Crypt::GOST' module from the SF download page at the old
> > > assp instance?
> > > If it was, did you install the 'Crypt::GOST' module from the SF
> > > download page, before you started the new assp instance?
> > >
> > > https://sourceforge.net/projects/assp/files/ASSP%20V2%20multithreading/ASSP%20V2%20module%20installation/Crypt-GOST/
> > >
> > > Thomas
> > >
> > >
> > >
> > >
> > >
> > > From:    cw <colin.war...@gmail.com>
> > > To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
> > > Date:    05.10.2016 09:52
> > > Subject: Re: [Assp-test] unable to detect any running worker
> > >
> > >
> > >
> > > Cheers for the reply.
> > >
> > > I don't think backups are an option seeing as I've moved completely
> > > from Ubuntu 14.04 to Ubuntu 16.04.
> > >
> > > Also, this issue has been around for months. It was causing a handful
> > > of shutdowns a week with the occasional spate of more frequent
> > > shutdowns. It is entirely possible that the errors are behaviour
> > > related and nothing to do with the upgrade and that current email
> > > behaviour is triggering it big style.
> > >
> > > I'm not convinced though, unfortunately I'm not convinced of anything
> > > else hence not having much to go on. I can't see any consistencies in
> > > the behaviour leading up to the events.
> > >
> > > I don't think it is database related. It happened on one of the mail
> > > servers during the upgrade to 16.04 when ASSP had just started but I
> > > had not yet got into the web interface to replace the corrupted
> > > encrypted strings with the correct values so all database connections
> > > were in error.
> > >
> > > The problem has already started again this morning so I can see this
> > > being another fun day that either leads to a fix or having to put
> > > something else in place.
> > >
> > > On Wed, Oct 5, 2016 at 4:34 AM, K Post <nntp.p...@gmail.com> wrote:
> > >
> > > > I've been reading here, but I haven't had anything to suggest. All
> > > > seems quite odd if it was working prior to upgrading and downgrading
> > > > didn't work.
> > > >
> > > >
> > > > Could you spin up a backup of the installation after copying the
> > > > current data?  Sure you'd have an older corpus, but I'd think you
> > > > could add the new files if necessary, manually replace whitelist etc.
> > > >
> > > >
> > > > On Tue, Oct 4, 2016 at 5:48 PM, cw <colin.war...@gmail.com> wrote:
> > > >
> > > > > Further development on this today, very little.
> > > > > I have moved both servers onto Ubuntu 16.04 LTS which means going
> > > > > from perl 5.18 to 5.22 and rebuilding all perl modules from scratch.
> > > > >
> > > > > The admin user db did not work after the upgrade so I had to empty
> > > > > the tables before it would come back online.
> > > > >
> > > > > I'm still getting delayed emails and assp shutting down telling me
> > > > > it is unable to detect any running worker.
> > > > >
> > > > > If this goes on much longer the MD will pull the plug and we'll end
> > > > > up moving to a third party solution which is not something I want
> > > > > but if I can't fix it I can't defend it :/
> > > > >
> > > > > ------------------------------------------------------------
> > > > > ------------------
> > > > > Check out the vibrant tech community on one of the world's most
> > > > > engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> > > > > _______________________________________________
> > > > > Assp-test mailing list
> > > > > Assp-test@lists.sourceforge.net
> > > > > https://lists.sourceforge.net/lists/listinfo/assp-test
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > >
> > > DISCLAIMER:
> > > *******************************************************
> > > This email and any files transmitted with it may be confidential,
> > > legally privileged and protected in law and are intended solely for
> > > the use of the individual to whom it is addressed.
> > > This email was multiple times scanned for viruses. There should be no
> > > known virus in this email!
> > > *******************************************************
> > >
> > >
> > > ------------------------------------------------------------
> > > ------------------
> > > Check out the vibrant tech community on one of the world's most
> > > engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> > > _______________________________________________
> > > Assp-test mailing list
> > > Assp-test@lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/assp-test
> > >
> > >
> > ------------------------------------------------------------
> > ------------------
> > Check out the vibrant tech community on one of the world's most
> > engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> > _______________________________________________
> > Assp-test mailing list
> > Assp-test@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/assp-test
> >
> >
> >
> >
>
>
>
>