Hi Tim,

Thanks for taking an interest.

This is the table's description:

amq=> \d activemq_msgs
          Table "public.activemq_msgs"
    Column    |          Type          | Modifiers
------------+------------------------+-----------
  id         | bigint                 | not null
  container  | character varying(250) |
  msgid_prod | character varying(250) |
  msgid_seq  | bigint                 |
  expiration | bigint                 |
  msg        | bytea                  |
  priority   | bigint                 |
  xid        | character varying(250) |
Indexes:
     "activemq_msgs_pkey" PRIMARY KEY, btree (id)
     "activemq_msgs_cidx" btree (container)
     "activemq_msgs_eidx" btree (expiration)
     "activemq_msgs_idx" btree (msgid_prod)
     "activemq_msgs_midx" btree (msgid_prod, msgid_seq)
     "activemq_msgs_pidx" btree (priority)
     "activemq_msgs_xidx" btree (xid)

Running an EXPLAIN, I get:

amq=> explain SELECT ID, PRIORITY FROM ACTIVEMQ_MSGS WHERE
MSGID_PROD='ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1' AND
MSGID_SEQ='1' AND  CONTAINER='queue://XXX_export';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using activemq_msgs_cidx on activemq_msgs  (cost=0.42..8.45 rows=1 width=16)
   Index Cond: ((container)::text = 'queue://XXX_export'::text)
   Filter: (((msgid_prod)::text = 'ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1'::text) AND (msgid_seq = 1::bigint))
(3 rows)

I think the Filter here could be problematic, though I'm not sure why it 
isn't using activemq_msgs_idx or activemq_msgs_midx.
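
In case it helps, here is roughly what I plan to try next on the database 
side. EXPLAIN ANALYZE and ANALYZE are just the standard PostgreSQL commands; 
the composite index at the end is only an idea (the name is made up), not 
something we have in place yet:

-- re-run the query with actual timing, same parameters as above
EXPLAIN (ANALYZE, BUFFERS)
  SELECT ID, PRIORITY FROM ACTIVEMQ_MSGS
  WHERE MSGID_PROD = 'ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1'
    AND MSGID_SEQ = '1'
    AND CONTAINER = 'queue://XXX_export';

-- refresh planner statistics in case they are stale
ANALYZE activemq_msgs;

-- idea only: an index covering the whole WHERE clause so no Filter is needed
CREATE INDEX activemq_msgs_cmidx
    ON activemq_msgs (container, msgid_prod, msgid_seq);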

When I issue the same type of query directly against the database during a 
slow-down, I get results that are just as slow as what the ActiveMQ process 
sees. However, after restarting ActiveMQ and then issuing the same type of 
query (of course changing some parameters so no caching occurs), we see 
very fast responses.

On the database server we always see 100% CPU usage on one core, by a single 
process. There's no I/O issue as far as I can tell.
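
For completeness, the busy backend and the statement it is currently running 
can be seen via the standard pg_stat_activity view (generic example, works on 
9.2 and later):

-- show non-idle backends, longest-running first
SELECT pid, state, now() - query_start AS runtime, query
  FROM pg_stat_activity
 WHERE state <> 'idle'
 ORDER BY runtime DESC;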

One more hint: we have two queues that usually get very big during these 
slow-downs, and the response times of the above statements scale roughly 
linearly with their size. Just to give you an idea: queue "R" might have 
3000 messages and take about 3 seconds per statement, while queue "B" might 
have 2000 messages and take about 2 seconds per statement. So it does look 
very much like the Filter is the issue, but the thing still throwing me off 
is simply that an ActiveMQ restart fixes it. After that, the very same 
statements run fast.
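
For reference, the queue sizes I mention above can be checked directly on the 
table with a simple count per container (the container column holds the queue 
name):

-- message count per queue
SELECT container, count(*) AS messages
  FROM activemq_msgs
 GROUP BY container
 ORDER BY messages DESC;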

Best regards,
Mark



On 02/23/2015 02:49 PM, Tim Bain [via ActiveMQ] wrote:
> Mark,
>
> You say the indices are OK; can you describe them for us, and can you find
> out the execution plan for the query?  Also, if you issue the same query
> directly against the database when this is happening, is that also slow?
> I'm looking for whether the query itself is slow or the query is fast but
> the surrounding ActiveMQ code is slow.
>
> Also, have you looked to see if any computing resources (CPU, disk I/O,
> network I/O, etc.) are heavily taxed on any of the machines involved (the
> broker and the database server; any others?)?  Getting an idea of the
> limiting resource might help figure out the problem.
>
> Tim
> On Feb 17, 2015 6:08 AM, "Mark Schmitt | Intratop" <[hidden email]>
> wrote:
>
>  > Hi,
>  >
>  > I work with Piotr on this issue. Let me try to provide some additional
>  > information on our slow-down issue:
>  >
>  > Storage is a PostgreSQL Server 9.3.2 on a Debian Wheezy system with
>  > kernel 3.2.51-1.
>  >
>  > We use JDBC and the PGPoolingDataSource
>  > (org.postgresql.ds.PGPoolingDataSource).
>  >
>  > This is the persistenceAdapter configuration:
>  >         <persistenceAdapter>
>  >             <jdbcPersistenceAdapter dataDirectory="activemq-data"
>  > dataSource="#postgres-ds" lockKeepAlivePeriod="0"
>  > createTablesOnStartup="false" />
>  >         </persistenceAdapter>
>  >
>  > We have 2 destination interceptors set up, and we run the demo code
>  > (jetty-demo) because we have some applications using the HTTP/REST
>  > interface it provides. We don't run Camel.
>  >
>  > Other than that it's a pretty mundane setup. We also run two instances
>  > at the same time as a sort of fail-over. Because of the JDBC backend,
>  > only one of them is active, and we use the failover protocol on the
>  > client side to reach the active one. We use haproxy to serve the web
>  > interface from the active instance. Both ActiveMQ instances run on the
>  > same Linux box, with different service IP addresses (they use the same
>  > binaries; only configuration and data directory are separated). The
>  > reason we run two instances is that we had big stability issues before,
>  > with the ActiveMQ process sort of hanging itself up. We could move away
>  > from that setup, because with 5.10 this hasn't happened.
>  >
>  > Like the database server, the Linux box that runs the ActiveMQ instance
>  > is a Debian Wheezy system, but with kernel 3.2.60-1+deb7u1.
>  >
>  > Problem description: Once in a while we see 100% CPU load on the
>  > database. We can isolate that to SQL statements of the style:
>  >
>  > SELECT ID, PRIORITY FROM ACTIVEMQ_MSGS WHERE
>  > MSGID_PROD='ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1' AND
>  > MSGID_SEQ='1' AND CONTAINER='queue://XXX_export'
>  >
>  > These SQL statements take more than 500 ms. We've had scenarios where
>  > they took more than 3 seconds to complete. The queue size for the 500 ms
>  > case was ~1200 messages across all queues (concentrated in one queue),
>  > with a production rate of about 2-3 messages per second and a
>  > consumption rate of about 2 messages per second. IMHO the queue size and
>  > the query time scale linearly.
>  >
>  > We were able to "resolve" the issue by restarting both ActiveMQ
>  > instances. After that, the load on the database drops dramatically:
>  > instead of 100% CPU usage we see less than 10% on the database and a
>  > very fast recovery. The ActiveMQ processes look fine too.
>  >
>  > My first guess was a missing database index, but the indexes look fine.
>  > Besides, restarting the ActiveMQ instances resolves the issue, which is
>  > very, very weird to me. I don't think it's a database lock either,
>  > because we couldn't see any, and additionally we see 100% CPU usage for
>  > the process executing the statement (Postgres spawns a process per
>  > connection). IMHO (but I'm no database expert) that shouldn't happen in
>  > a lock situation either...
>  >
>  > We're at a loss. Do you guys have an idea?
>  >
>  > And one more thing: Once every two or three hours a lot of (several
>  > thousand) messages are created. But the problem described above happens
>  > irregularly, every one or two weeks or so.
>  >
>  > Best regards,
>  > Mark
>  >
>
>

-- 
Best regards

Mark Schmitt
--
intratop UG (haftungsbeschränkt)

Lise-Meitner-Straße 9
89081 Ulm

Phone: +49-731-146603-70
Direct dial: +49-731-146603-79
Fax: +49-731-146603-72

E-Mail: mark.schm...@intratop.de

Represented by: Mr. Mark Oliver Schmitt


Register entry:
Registered in the commercial register.
Register court: Amtsgericht Ulm
Register number: HRB 727676



