Am 04.07.2016 um 11:59 schrieb Diana Scannicchio:
Hello Andreas,
unfortunately we cannot upgrade to versions that are not in the official CERN 
release of the OS.
No I did not try the tuning advisors yopu suggested
I noticed that there was a bug report for apparently the same issue
https://dev.icinga.org/issues/10731
and that it was theoretically solved in version 2.4.4.
Could it be that it reappeared or was not completely solved?

There are some known performance issues with 2.4.x and DB IDO. Some will
be resolved by adding additional database indexes. You may manually
apply those from 2.5.0.sql from git master if you're brave and cannot wait.

We're currently investigating on a possible lockup in large environments
hitting the query queue limit of 1mio tasks. That's currently holding
off the 2.5 release (albeit there is not much info yet, we're directly
working with a partner customer here). Your log looks quite the same as
we saw it yesterday on their environment. Should be quite the same as
this issue - https://dev.icinga.org/issues/12147

Another thing you can certainly look into - enable the icinga2 debug log
and extract all "IdoMysqlConnection" strings. That'll provide an idea
about the fired queries.

If debug log is not an option, change the ido-mysql config "host" from
"localhost" to "127.0.0.1" to enforce a tcp socket connection. That way
you'll be able to run tcpdump on port 3306 and analyse the queries with
wireshark later on.

One thing we always ask for when looking into possible locks - a gdb
backtrace. Details over here:
http://docs.icinga.org/icinga2/latest/doc/module/icinga2/toc#!/icinga2/latest/doc/module/icinga2/chapter/development#development-debug-gdb-backtrace

Kind regards,
Michael

I run the script you suggested on a temporary VM that can be destroyed and I 
got the following

-------- Recommendations 
---------------------------------------------------------------------------
General recommendations:
     Add skip-innodb to MySQL configuration to disable InnoDB
     Enable the slow query log to troubleshoot bad queries
     Configure your accounts with ip or subnets only, then update your 
configuration with skip-name-resolve=1
Variables to adjust:
     query_cache_size (>= 8M)


Thank you very much,

Diana


On 01 Jul 2016, at 16:42, Andreas Lehr <m...@andreas-lehr.com> wrote:

Have you tried to usual MySQL tuning advisors?

http://mysqltuner.pl
https://github.com/RootService/tuning-primer

Is it possible to upgrade to a more recent MySQL version? Maybe 5.5 or better 
5.6 or even 5.7?

What about Disk I/O on the host? do you have I/O waits?


2016-07-01 15:45 GMT+02:00 Diana Scannicchio <diana.scannicc...@cern.ch>:
Hello Antony,
so this is interesting…  I stopped icinga2 service and mysqld, I added in the 
[mysqld] section of  /etc/my.cnf the following two lines

slow-query-log = 1
slow-query-log-file = /var/log/mysql-slow-query.log

Then I touched  /var/log/mysql-slow-query.log and change the ownership to mysqld

Finally I restarted mysqld and then icinga2 at 15.00.

The file  /var/log/mysql-slow-query.log after more than 20 minutes was still 
“empty”, so I executed

[root  log]# tail -f mysql-slow-query.log
mysql> SELECT SLEEP(10);
+-----------+
| SLEEP(10) |
+-----------+
|         0 |
+-----------+
1 row in set (10.01 sec)

  which ended up into the log:

/usr/libexec/mysqld, Version: 5.1.73-log (Source distribution). started with:
Tcp port: 0  Unix socket: /var/lib/mysql/mysql.sock
Time                 Id Command    Argument


# Time: 160701 15:27:27
# User@Host: root[root] @ localhost []
# Query_time: 10.000245  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1467379647;
SELECT SLEEP(10);

Please note that I am not (yet ;) ) a DB expert

The icinga2.log contains the following…

[2016-07-01 15:03:51 +0200] information/IdoMysqlConnection: Query queue items: 
505748, query rate: 0.683333/s (41/min 41/5min 41/15min); empty in infinite 
time, your database isn't able to keep up
[2016-07-01 15:04:06 +0200] information/IdoMysqlConnection: Query queue items: 
593751, query rate: 0.683333/s (41/min 41/5min 41/15min); empty in infinite 
time, your database isn't able to keep up
[2016-07-01 15:04:21 +0200] information/IdoMysqlConnection: Query queue items: 
645002, query rate: 0.683333/s (41/min 41/5min 41/15min); empty in infinite 
time, your database isn't able to keep up
[2016-07-01 15:04:27 +0200] information/Checkable: Notifications are disabled 
for service 'vs-atlas-cr-52!pcoip/pingtime'.
[2016-07-01 15:04:36 +0200] information/IdoMysqlConnection: Query queue items: 
713414, query rate: 0.683333/s (41/min 41/5min 41/15min); empty in infinite 
time, your database isn't able to keep up
[2016-07-01 15:04:51 +0200] information/IdoMysqlConnection: Query queue items: 
802373, query rate: 2027.68/s (121661/min 121686/5min 121686/15min); empty in 
infinite time, your database isn't able to keep up
[2016-07-01 15:05:06 +0200] information/IdoMysqlConnection: Query queue items: 
853117, query rate: 2027.68/s (121661/min 121686/5min 121686/15min); empty in 
infinite time, your database isn't able to keep up
[2016-07-01 15:05:21 +0200] information/IdoMysqlConnection: Query queue items: 
886383, query rate: 2027.68/s (121661/min 121686/5min 121686/15min); empty in 
infinite time, your database isn't able to keep up
[2016-07-01 15:05:27 +0200] information/Checkable: Notifications are disabled 
for service 'vs-atlas-cr-52!pcoip/pingtime'.
[2016-07-01 15:05:36 +0200] information/IdoMysqlConnection: Query queue items: 
955106, query rate: 2027.68/s (121661/min 121686/5min 121686/15min); empty in 
infinite time, your database isn't able to keep up

Do you have any further suggestion ? am I doing something wrong?
Thank you very much,

Diana



On 01 Jul 2016, at 11:29, Diana Scannicchio <diana.scannicc...@cern.ch> wrote:

Hello Antony,
so both the machines (they are identical) have 2 CPU (CPU E5-2620 v3 @ 2.40GHz) 
with 6 core per processor, so a total of 12 CPU cores.
I assumed that the test machine is swapping a lot due to MySQL not able to cope.
The majority of the process are run by root and they are on both nodes.
Please note that the production node does not have the issue

The icinga2 database has been created in MySQL following the instructions found 
in
http://docs.icinga.org/icinga2/latest/doc/module/icinga2/toc#!/icinga2/latest/doc/module/icinga2/chapter/getting-started#installing-icingaweb2
section "2.5.1.3. Setting up the MySQL database"

The content of /etc/my.cnf is pasted below.
I will try to enable the slow query log and see if I spot something as you 
suggested.
Still there should be some difference between v2.3.11 and v2.4.* that is 
preventing the to run with the latter or something to be modified when updating.

Thank you,

DIana


#  cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0


## Other tuning, from standard setupi added by Diana - 26 Apr 2015, 1 June 2016 
on this node
## http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html
# Set buffer pool size to 50-80% of your computer's memory
# This was already set 8 GB
# show variables like 'innodb_%' : innodb_buffer_pool_size 8388608
innodb_buffer_pool_size         = 8G
# Set the log file size to about 25% of the buffer pool size
innodb_log_file_size            = 512M
innodb_log_buffer_size          = 128M

innodb_file_per_table = 1
max_allowed_packet = 32M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid



On 30 Jun 2016, at 16:46, Antony Stone <antony.st...@icinga.open.source.it> 
wrote:

On Thursday 30 June 2016 at 15:23:19, Diana Scannicchio wrote:

below you can see what I get by connecting to MySQL as a privileged user and
asking it to "show processlist” on both the nodes (one running v2.4.10 and
the second one running v2.3.11. And also the output of top.
Well, the first interestng bit is on the test machine, where MySQL is busy
"logging slow query", so I do think that turning on the slow query log in your
MySQL my.cnf should shed some further light on this problem.

Secondly I see from 'top' that mysqld is using 67.4% CPU and 70.4% memory,
which is certainly excessive for this activity.

The test machine is also using ~128Mbytes swap space, which is not excessive,
but interesting given that the production machine is using none.

I see a load average of 7.74 - how many CPU cores are there on the test
machine?

Finally, you have 964 processes running (again, on the test machine - it seems
to show the problem more extremely than the production machine does) - can you
identify from ps what the majority of these are?

The only other question I have is "how was the icinga2 database created in
MySQL on these machines?"  It does look to me rather as though missing indexes
could be the problem, but I think the output of the slow query log should help
with that.


Regards,


Antony.

--
#define SIX 1+5
#define NINE 8+1

int main() {
   printf("%d\n", SIX * NINE);
}
      - thanks to ECB for bringing this to my attention

                                                  Please reply to the list;
                                                        please *don't* CC me.
_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users
-
Diana Scannicchio
University of California, Irvine
ATLAS TDAQ SysAdmin group
Office: +41 22 76 75240
OnCall: 164851






_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users
-
Diana Scannicchio
University of California, Irvine
ATLAS TDAQ SysAdmin group
Office: +41 22 76 75240
OnCall: 164851






_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users



--
Andreas Lehr
twitter.com/shakalandy
m...@andreas-lehr.com


_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users
-
Diana Scannicchio
University of California, Irvine
ATLAS TDAQ SysAdmin group
Office: +41 22 76 75240
OnCall: 164851






_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users



-- 
Michael Friedrich, DI (FH)
Senior Developer

NETWAYS GmbH | Deutschherrnstr. 15-19 | D-90429 Nuernberg
Tel: +49 911 92885-0 | Fax: +49 911 92885-77
CEO: Julian Hein, Bernd Erk | AG Nuernberg HRB18461
http://www.netways.de | michael.friedr...@netways.de

** OSBConf 2016 - September - osbconf.org **
** OSMC 2016 - November - netways.de/osmc **
_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users

Reply via email to