Does the last_update_date constraint filter out a lot of rows? In that
case the server may be reading a large number of rows, only to throw
them away since they get filtered out.
If you apply the filter on the client side, you shouldn't see timeouts
(but overall the process will be slower since you have to transfer more
data).
btw, from the logs it looks like the client is multi-threaded, there are
different token ranges in the same time period.
On 01/26/2018 10:39 PM, Juan Manuel Alonso wrote:
Hi guys,
I'm having some trouble while using paged queries on Cassandra's Java
driver (version 3.3.2). I'm using Cassandra 3.11.0.
I have to fetch a page of data from the DB, then make some trivial
changes, and then update these rows.
A simplified version of the code i'm running would be:
Integer rowCounter = 0;
Statement selectQuery =
QueryBuilder.select()...setFetchSize(pageSize)...;
ResultSet result =
cassandraSession.execute(selectQuery);
List<MyClass> mappedResults = new ArrayList<>();
for (Row row : result) {
rowCounter++;
mappedResults.add(map(row));
if (rowCounter % pageSize == 0) {
List<MyClass> resultsToUpdate =
modifyData(mappedResults);
for (MyClass resultToUpdate :
resultsToUpdate){
Statement updateQuery =
QueryBuilder.update(KEYSPACE, tableName)...;
cassandraSession.execute(query);
}
TimeUnit.SECONDS.sleep(sleepSeconds); //Sleep for a few seconds to let
the DB... breathe
}
}
I'm using consistency level ONE on both select and update queries, the
value of sleepSeconds is 5 and the pageSize is 47.
My problem is that I have to use very small page sizes, otherwise
queries start to timeout on Cassandra.
There is only one thread running this long update process, but when i
check Cassandra's debug.log, it looks like this:
...
DEBUG [ScheduledTasks:1] 2018-01-26 12:43:09,221
MonitoringTask.java:173 - 55 operations were slow in the last 5001 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 1940709131428868672 AND token(id) <= 1976881771356013545
LIMIT 47>, time 1232 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -31240603717813337 AND token(id) <= 93066413544676618
LIMIT 47>, time 672 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -2746601914911102981 AND token(id) <= -2679503374406295369
LIMIT 47>, time 722 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 8697901506577253630 AND token(id) <= 8756251242481074941
LIMIT 47>, time 1737 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -2566441277217350674 AND token(id) <= -2410488306633473620
LIMIT 47>, time 997 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 5186947162422827855 AND token(id) <= 5251256039266177164
LIMIT 47>, time 1619 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 523415566358416448 AND token(id) <= 558165594730430519
LIMIT 47>, time 793 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -6313110054894254305 AND token(id) <= -6149701678898756666
LIMIT 47>, time 510 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 133117363640100699 AND token(id) <= 326755086351479456
LIMIT 47>, time 594 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -5773756298752768296 AND token(id) <= -5672224259310839216
LIMIT 47>, time 631 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 9138868762246577790 AND token(id) <= 9184809921750217730
LIMIT 47>, time 1680 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 1481347618188085389 AND token(id) <= 1529429375374220120
LIMIT 47>, time 1337 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -3179570044050246190 AND token(id) <= -2975237200717735765
LIMIT 47>, time 773 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -1992364373944487162 AND token(id) <= -1754930707218513982
LIMIT 47>, time 793 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 7461256765584395144 AND token(id) <= 7513523865647503158
LIMIT 47>, time 1569 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 2199511646454841639 AND token(id) <= 2235092311035533306
LIMIT 47>, time 1157 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 5981009014177068366 AND token(id) <= 6193847522724693984
LIMIT 47>, time 1549 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 6587824379305518475 AND token(id) <= 6941621185441223079
LIMIT 47>, time 1491 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -2888016351766682341 AND token(id) <= -2832466742668731344
LIMIT 47>, time 642 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 4599678137499867302 AND token(id) <= 4681791682494977137
LIMIT 47>, time 1222 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 4097891947569113599 AND token(id) <= 4205652216148641874
LIMIT 47>, time 1421 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 1313069522772322867 AND token(id) <= 1345209653063462051
LIMIT 47>, time 1042 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 687538582456175035 AND token(id) <= 714387656474794527
LIMIT 47>, time 1262 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 3166906839309766962 AND token(id) <= 3220931935607646481
LIMIT 47>, time 1589 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 5661754624946584572 AND token(id) <= 5764445628939515857
LIMIT 47>, time 1491 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > 6988693948300673349 AND token(id) <= 7029878484744478913
LIMIT 47>, time 1579 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND
token(id) > -601653023540100942 AND token(id) <= -570711602903916229
LIMIT 47>, time 1032 msec - slow timeout 500 msec/cross-node
... (5 were dropped)
DEBUG [ScheduledTasks:1] 2018-01-26 12:43:14,222
MonitoringTask.java:173 - 52 operations were slow in the last 4998 msecs:
...
I can't understand why my process is making lots of SELECT queries in
just 5 seconds, when it should be making one fetch, updating the rows,
and then sleeping for five seconds. If i check my application logs,
everything seems to be correct, there is only one fetch every five
seconds.
I tried running the same code on another environment (with the same
volume of data). It runs much faster and the debug.log looks correct:
...
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:25,574
MonitoringTask.java:173 - 1 operations were slow in the last 5003 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND
token(id) > token(AVrJ6PqGIUe_WZqLXGLv) LIMIT 200>, time 517 msec -
slow timeout 500 msec/cross-node
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:35,575
MonitoringTask.java:173 - 1 operations were slow in the last 5000 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND
token(id) > token(AVrJ6GJqVlin-KDUvTKL) LIMIT 200>, time 680 msec -
slow timeout 500 msec/cross-node
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:40,577
MonitoringTask.java:173 - 1 operations were slow in the last 5001 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND
token(id) > token(AVrJ6xVrVlin-KDUvXep) LIMIT 200>, time 647 msec -
slow timeout 500 msec/cross-node
...
Any thoughts on why is this happening would be highly appreciated.
Thanks in advance.
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org