Hi Matija,

I know about the API. I am already using isSchemaInAgreement, but the problem is that when I issue CREATE TABLE requests in quick succession, it times out after 30 seconds. The same CF creation completes successfully (with agreement) *within 2 seconds* when I run it in isolation on a fresh cluster, or after a long enough interval. *The problem starts only when I issue many CREATE TABLE requests quickly, one after another*.
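For clarity, the create loop I described is roughly the following (a minimal sketch against the DataStax Java driver 3.x API; the contact point and the table DDL below are just placeholders, my real tables have ~2400 columns):

import com.datastax.driver.core.*;
import java.util.Arrays;
import java.util.List;

public class SerialTableCreation {
    public static void main(String[] args) throws InterruptedException {
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.20.220")            // placeholder contact point
                .withMaxSchemaAgreementWaitSeconds(30)     // the 30-second wait mentioned above
                .build();
        try {
            Session session = cluster.connect();
            List<String> ddls = Arrays.asList(             // placeholder DDL statements
                    "CREATE TABLE ks1.cf1 (id int PRIMARY KEY, c1 text)",
                    "CREATE TABLE ks1.cf2 (id int PRIMARY KEY, c1 text)");
            for (String ddl : ddls) {
                ResultSet rs = session.execute(ddl);
                // At this point the driver has already waited up to
                // maxSchemaAgreementWaitSeconds for schema agreement.
                if (!rs.getExecutionInfo().isSchemaInAgreement()) {
                    // Fallback: poll manually before firing the next CREATE TABLE.
                    while (!cluster.getMetadata().checkSchemaAgreement()) {
                        Thread.sleep(1000);
                    }
                }
            }
        } finally {
            cluster.close();
        }
    }
}

The manual polling at the end is the fallback I mentioned; as far as I understand, it repeats the same schema-version check the driver already runs during its built-in wait.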
Please note, as I mentioned above, that only ONE CREATE query is executed at a time. I send the next CREATE CF query only after the previous one returns isSchemaInAgreement = true. I am trying to find out why CF creation cannot reach schema agreement even within 30 seconds. I could increase the timeout, or do manual polling with checkSchemaAgreement, but then CF creation time would be too high for my application use case; even 30 seconds is too high.

Best Regards,
Saumitra

On Tue, Dec 20, 2016 at 2:32 AM, Matija Gobec <matija0...@gmail.com> wrote:

> There is an exposed API for schema agreement and I would advise you to use
> that if you can.
> Look at this JIRA ticket <https://datastax-oss.atlassian.net/browse/JAVA-669>.
>
> On Mon, Dec 19, 2016 at 8:46 PM, Vladimir Yudovin <vla...@winguzone.com> wrote:
>
>> Regarding schema agreement - try to increase the time between CF creations.
>> Also, the stress tool waits for schema; look at its code, it probably uses
>> some method to ensure schema distribution.
>>
>> Best regards, Vladimir Yudovin,
>> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>>
>> ---- On Mon, 19 Dec 2016 14:35:00 -0500 *Saumitra S <saumitra.srivast...@gmail.com>* wrote ----
>>
>> Thanks Vladimir!
>>
>> Is there any known issue in 3.0.10 where creating a CF with a large number
>> of columns, or creating a large number of CFs quickly one after another,
>> causes schema agreement issues?
>>
>> What else can I try in order to support ~12000 CFs without hitting schema
>> agreement issues? I can add more RAM and increase the heap size (even if I
>> need to spend time on GC tuning for such a large heap), but the issue I get
>> with 2400-column CFs starts after just a few keyspaces (fewer than 200 CFs).
>> What can I try to fix that?
>>
>> On Tue, Dec 20, 2016 at 12:53 AM, Vladimir Yudovin <vla...@winguzone.com> wrote:
>>
>> >I want to dig deeper into what all things happen in C* at time of CF creation
>>
>> It starts somewhere in the *MigrationManager.announceNewColumnFamily*
>> function, I guess.
>>
>> >limitation of number of keyspaces which can be created.
>>
>> Actually it's a CF limitation, not a keyspace one.
>>
>> >if you can also point me to this 1MB per CF thing, it would be great.
>>
>> Look at http://www.mail-archive.com/user@cassandra.apache.org/msg46359.html,
>> CASSANDRA-5935, CASSANDRA-2252.
>> In the source, look at the *SlabAllocator.REGION_SIZE* definition.
>>
>> Best regards, Vladimir Yudovin,
>> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>>
>> ---- On Mon, 19 Dec 2016 14:10:37 -0500 *Saumitra S <saumitra.srivast...@gmail.com>* wrote ----
>>
>> Hi Vladimir,
>>
>> Thanks for the response.
>>
>> When I see *"com.datastax.driver.core.ControlConnection"* exceptions, I see
>> that the keyspaces and CFs are in fact created. But when I create CFs with a
>> large number of columns (2400 cols) quickly one after the other (with a
>> 2-second gap between CREATE TABLE queries), I get schema agreement timeout
>> errors (*com.datastax.driver.core.Cluster | Error while waiting for schema
>> agreement*). This happens even with a clean slate (empty data directory),
>> just after creating 4 keyspaces. The timeout is set to 30 seconds. Please
>> note that the CREATE TABLE queries are NOT fired in parallel. I wait for one
>> query to complete (with schema agreement) before firing the next one.
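To dig into the quoted timeout above: the per-node schema versions that the agreement check compares can also be queried directly from the system.local and system.peers tables; a node that keeps reporting an older version is the one holding agreement back. A rough sketch, reusing the session from the earlier snippet:

// Sketch: print the schema_version each node reports; disagreement means
// at least one node still carries an older schema version.
// Assumes "session" is the connected Session from the loop above.
Row local = session.execute("SELECT schema_version FROM system.local").one();
System.out.println("local node: " + local.getUUID("schema_version"));
for (Row peer : session.execute("SELECT peer, schema_version FROM system.peers")) {
    System.out.println(peer.getInet("peer") + ": " + peer.getUUID("schema_version"));
}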
>> I want to dig deeper into what happens in C* at the time of CF creation, to
>> understand more about the limit on the number of keyspaces/CFs that can be
>> created. Can you please point me to the corresponding source code?
>> Specifically, if you can also point me to this 1MB-per-CF overhead, that
>> would be great.
>>
>> Best Regards,
>> Saumitra
>>
>> On Mon, Dec 19, 2016 at 11:41 PM, Vladimir Yudovin <vla...@winguzone.com> wrote:
>>
>> Hi,
>>
>> *Question*: Does C* read some schema/metadata when cqlsh is started, which
>> is causing the timeout with a large number of keyspaces?
>>
>> A lot :). cqlsh reads schemas, cluster topology, each node's tokens, etc.
>> You can just capture TCP port 9042 (unless you use SSL) and view all the
>> negotiation between cqlsh and the node.
>>
>> *Question*: Can a single C* cluster of 5 nodes (32 GB / 8 CPU each) support
>> up to 500 keyspaces, each having 25 CFs? What kind of issues can I expect?
>>
>> You have 500 * 25 = 12500 tables, which is a huge number. Each CF takes at
>> least 1 MB of heap memory, so it needs about 12 GB of heap just to start.
>> Test it on a one- or two-node cluster first.
>>
>> *Question*: What is the effect of the exception below?
>>
>> Are the keyspaces created despite the exception or not?
>>
>> Best regards, Vladimir Yudovin,
>> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>>
>> ---- On Mon, 19 Dec 2016 10:24:20 -0500 *Saumitra S <saumitra.srivast...@gmail.com>* wrote ----
>>
>> Hi All,
>>
>> I have a 2-node cluster (32 GB RAM / 8 CPU per node) running 3.0.10, and I
>> created 50 keyspaces in it. Each keyspace has 25 CFs. The column count in
>> each CF ranges between 5 and 30.
>>
>> I am hitting a few issues once the keyspace count reaches ~50.
>>
>> *Issue 1:*
>>
>> When I try to use cqlsh, I get a timeout.
>>
>> *$ cqlsh `hostname -i`*
>> *Connection error: ('Unable to connect to any servers', {'10.0.20.220': OperationTimedOut('errors=None, last_host=None',)})*
>>
>> If I increase the connect timeout, I am able to access the cluster through cqlsh:
>>
>> *$ cqlsh --connect-timeout 20 `hostname -i`  // this works fine*
>>
>> *Question:* Does C* read some schema/metadata when cqlsh is started, which
>> is causing the timeout with a large number of keyspaces?
>>
>> *Issue 2:*
>>
>> If I create keyspaces which have 3 large CFs (each having around 2500 cols),
>> then I start to see schema agreement timeouts in my logs. I have set the
>> schema agreement timeout to 30 seconds in the driver.
>>
>> *2016-12-13 08:37:02.733 | gbd-std-01 | WARN | cluster2-worker-194 | com.datastax.driver.core.Cluster | Error while waiting for schema agreement*
>>
>> *Question:* Can a single C* cluster of 5 nodes (32 GB / 8 CPU each) support
>> up to 500 keyspaces, each having 25 CFs? What kind of issues can I expect?
>>
>> *Issue 3:*
>>
>> I am creating keyspaces and CFs through the DataStax driver. I see the
>> following exception in my log after reaching *~50 keyspaces*.
>>
>> *Question:* What is the effect of the exception below?
>>
>> 2016-12-19 13:55:35.615 | gbd-std-01 | ERROR | cluster1-worker-147 | *com.datastax.driver.core.ControlConnection | [Control connection] Unexpected error while refreshing schema*
>> *java.util.concurrent.ExecutionException: com.datastax.driver.core.exceptions.OperationTimedOutException: [gbd-cass-20.ec2-east1.hidden.com/10.0.20.220] Operation timed out*
>>     at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[com.google.guava.guava-18.0.jar:na]
>>     at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[com.google.guava.guava-18.0.jar:na]
>>     at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[com.google.guava.guava-18.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser.get(SchemaParser.java:467) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser.access$400(SchemaParser.java:30) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser$V3SchemaParser.fetchSystemRows(SchemaParser.java:632) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser.refresh(SchemaParser.java:56) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:341) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:306) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.Cluster$Manager$SchemaRefreshRequestDeliveryCallback$1.runMayThrow(Cluster.java:2570) [com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.ExceptionCatchingRunnable.run(ExceptionCatchingRunnable.java:32) [com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
>> Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException: [gbd-cass-20.ec2-east1.hidden.com/10.0.20.220] Operation timed out
>>     at com.datastax.driver.core.DefaultResultSetFuture.onTimeout(DefaultResultSetFuture.java:209) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:1260) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581) ~[io.netty.netty-common-4.0.33.Final.jar:4.0.33.Final]
>>     at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:655) ~[io.netty.netty-common-4.0.33.Final.jar:4.0.33.Final]
>>     at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367) ~[io.netty.netty-common-4.0.33.Final.jar:4.0.33.Final]
>>     ... 1 common frames omitted
>>
>> 2016-12-19 13:55:39.885 | gbd-std-01 | ERROR | cluster2-worker-124 | *com.datastax.driver.core.ControlConnection | [Control connection] Unexpected error while refreshing schema*
>> *java.util.concurrent.ExecutionException: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)*
>>     at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[com.google.guava.guava-18.0.jar:na]
>>     at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[com.google.guava.guava-18.0.jar:na]
>>     at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[com.google.guava.guava-18.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser.get(SchemaParser.java:467) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser.access$400(SchemaParser.java:30) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser$V3SchemaParser.fetchSystemRows(SchemaParser.java:632) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.SchemaParser.refresh(SchemaParser.java:56) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:341) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:306) ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.Cluster$Manager$SchemaRefreshRequestDeliveryCallback$1.runMayThrow(Cluster.java:2570) [com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at com.datastax.driver.core.ExceptionCatchingRunnable.run(ExceptionCatchingRunnable.java:32) [com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
>>
>> Best Regards,
>> Saumitra