Re: Slow paging query on Cassandra.

Avi Kivity Sat, 27 Jan 2018 03:59:06 -0800

Does the last_update_date constraint filter out a lot of rows? In that case the server may be reading a large number of rows, only to throw them away since they get filtered out.

If you apply the filter on the client side, you shouldn't see timeouts (but overall the process will be slower since you have to transfer more data).

Btw, from the logs it looks like the client is multi-threaded: there are different token ranges in the same time period.
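
A minimal sketch of what client-side filtering could look like, written without the driver so it stands alone. The names `ClientSideFilter`, `Item` and `processMatching` are hypothetical and not part of any Cassandra API. With the real driver, the iterable would presumably be the mapped rows of a paged SELECT issued without the last_update_date predicate.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: filters rows on the client and hands them out in batches.
final class ClientSideFilter {

    // Hypothetical stand-in for a mapped row; not a driver type.
    static final class Item {
        final String id;
        final LocalDate lastUpdateDate;

        Item(String id, LocalDate lastUpdateDate) {
            this.id = id;
            this.lastUpdateDate = lastUpdateDate;
        }
    }

    // Walks every row, keeps only those whose date equals `wanted`, and passes
    // them to `handler` in batches of at most `batchSize`.
    // Returns the number of rows that matched.
    static int processMatching(Iterable<Item> rows, LocalDate wanted,
                               int batchSize, Consumer<List<Item>> handler) {
        List<Item> batch = new ArrayList<>();
        int matched = 0;
        for (Item row : rows) {
            if (!wanted.equals(row.lastUpdateDate)) {
                continue; // discarded on the client instead of on the server
            }
            batch.add(row);
            matched++;
            if (batch.size() == batchSize) {
                handler.accept(new ArrayList<>(batch));
                batch.clear(); // start a fresh batch for the next rows
            }
        }
        if (!batch.isEmpty()) {
            handler.accept(new ArrayList<>(batch)); // final partial batch
        }
        return matched;
    }
}
```

The trade-off is the one described above: every row crosses the network, but the server no longer scans and discards non-matching rows within a single request.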



On 01/26/2018 10:39 PM, Juan Manuel Alonso wrote:

Hi guys,
I'm having some trouble while using paged queries with Cassandra's Java driver (version 3.3.2). I'm using Cassandra 3.11.0.
I have to fetch a page of data from the DB, then make some trivial changes, and then update these rows.
A simplified version of the code I'm running would be:

                    Integer rowCounter = 0;
                    Statement selectQuery = QueryBuilder.select()...setFetchSize(pageSize)...;
                    ResultSet result = cassandraSession.execute(selectQuery);
                    List<MyClass> mappedResults = new ArrayList<>();
                    for (Row row : result) {
                        rowCounter++;
                        mappedResults.add(map(row));
                        // After each full page, modify and write back the rows
                        if (rowCounter % pageSize == 0) {
                            List<MyClass> resultsToUpdate = modifyData(mappedResults);
                            for (MyClass resultToUpdate : resultsToUpdate) {
                                Statement updateQuery = QueryBuilder.update(KEYSPACE, tableName)...;
                                cassandraSession.execute(updateQuery);
                            }
                            TimeUnit.SECONDS.sleep(sleepSeconds); // Sleep for a few seconds to let the DB... breathe
                        }
                    }
I'm using consistency level ONE on both select and update queries, the value of sleepSeconds is 5 and the pageSize is 47.
My problem is that I have to use very small page sizes, otherwise queries start to time out on Cassandra.
There is only one thread running this long update process, but when I check Cassandra's debug.log, it looks like this:
...
DEBUG [ScheduledTasks:1] 2018-01-26 12:43:09,221 MonitoringTask.java:173 - 55 operations were slow in the last 5001 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 1940709131428868672 AND token(id) <= 1976881771356013545 LIMIT 47>, time 1232 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -31240603717813337 AND token(id) <= 93066413544676618 LIMIT 47>, time 672 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -2746601914911102981 AND token(id) <= -2679503374406295369 LIMIT 47>, time 722 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 8697901506577253630 AND token(id) <= 8756251242481074941 LIMIT 47>, time 1737 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -2566441277217350674 AND token(id) <= -2410488306633473620 LIMIT 47>, time 997 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 5186947162422827855 AND token(id) <= 5251256039266177164 LIMIT 47>, time 1619 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 523415566358416448 AND token(id) <= 558165594730430519 LIMIT 47>, time 793 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -6313110054894254305 AND token(id) <= -6149701678898756666 LIMIT 47>, time 510 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 133117363640100699 AND token(id) <= 326755086351479456 LIMIT 47>, time 594 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -5773756298752768296 AND token(id) <= -5672224259310839216 LIMIT 47>, time 631 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 9138868762246577790 AND token(id) <= 9184809921750217730 LIMIT 47>, time 1680 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 1481347618188085389 AND token(id) <= 1529429375374220120 LIMIT 47>, time 1337 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -3179570044050246190 AND token(id) <= -2975237200717735765 LIMIT 47>, time 773 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -1992364373944487162 AND token(id) <= -1754930707218513982 LIMIT 47>, time 793 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 7461256765584395144 AND token(id) <= 7513523865647503158 LIMIT 47>, time 1569 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 2199511646454841639 AND token(id) <= 2235092311035533306 LIMIT 47>, time 1157 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 5981009014177068366 AND token(id) <= 6193847522724693984 LIMIT 47>, time 1549 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 6587824379305518475 AND token(id) <= 6941621185441223079 LIMIT 47>, time 1491 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -2888016351766682341 AND token(id) <= -2832466742668731344 LIMIT 47>, time 642 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 4599678137499867302 AND token(id) <= 4681791682494977137 LIMIT 47>, time 1222 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 4097891947569113599 AND token(id) <= 4205652216148641874 LIMIT 47>, time 1421 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 1313069522772322867 AND token(id) <= 1345209653063462051 LIMIT 47>, time 1042 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 687538582456175035 AND token(id) <= 714387656474794527 LIMIT 47>, time 1262 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 3166906839309766962 AND token(id) <= 3220931935607646481 LIMIT 47>, time 1589 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 5661754624946584572 AND token(id) <= 5764445628939515857 LIMIT 47>, time 1491 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > 6988693948300673349 AND token(id) <= 7029878484744478913 LIMIT 47>, time 1579 msec - slow timeout 500 msec/cross-node
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-04-07 AND token(id) > -601653023540100942 AND token(id) <= -570711602903916229 LIMIT 47>, time 1032 msec - slow timeout 500 msec/cross-node
... (5 were dropped)
DEBUG [ScheduledTasks:1] 2018-01-26 12:43:14,222 MonitoringTask.java:173 - 52 operations were slow in the last 4998 msecs:
...
I can't understand why my process is making lots of SELECT queries in just 5 seconds, when it should be making one fetch, updating the rows, and then sleeping for five seconds. If I check my application logs, everything seems to be correct: there is only one fetch every five seconds.
I tried running the same code on another environment (with the same volume of data). It runs much faster and the debug.log looks correct:
...
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:25,574 MonitoringTask.java:173 - 1 operations were slow in the last 5003 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND token(id) > token(AVrJ6PqGIUe_WZqLXGLv) LIMIT 200>, time 517 msec - slow timeout 500 msec/cross-node
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:35,575 MonitoringTask.java:173 - 1 operations were slow in the last 5000 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND token(id) > token(AVrJ6GJqVlin-KDUvTKL) LIMIT 200>, time 680 msec - slow timeout 500 msec/cross-node
DEBUG [ScheduledTasks:1] 2018-01-26 12:32:40,577 MonitoringTask.java:173 - 1 operations were slow in the last 5001 msecs:
<SELECT * FROM keyspace.table WHERE last_update_date = 2017-03-13 AND token(id) > token(AVrJ6xVrVlin-KDUvXep) LIMIT 200>, time 647 msec - slow timeout 500 msec/cross-node
...

Any thoughts on why this is happening would be highly appreciated.

Thanks in advance.



