Hi,

You can enable logging in the driver to see what's happening under the hood: https://docs.datastax.com/en/developer/csharp-driver/3.14/faq/#how-can-i-enable-logging-in-the-driver

With the log output, it should be easier to track the issue down.

Can you query system.local and system.peers on a seed node / contact point to check that the node list and token info are what you expect? You can compare the result with the output of nodetool ring.

Not directly related: 256 vnodes is probably more than you want.

Thanks,
Jorge

On Thu, Apr 30, 2020 at 9:48 AM Gediminas Blazys <gediminas.bla...@microsoft.com.invalid> wrote:

> Hello,
>
> We have run into a very interesting issue and maybe some of you have
> encountered it or just have an idea where to look.
>
> We are working towards adding new dcs into our cluster, here's the current
> topology:
>
> DC1 - 18 nodes
> DC2 - 18 nodes
> DC3 - 18 nodes
> DC4 - 18 nodes
> DC5 - 18 nodes
>
> Recently we introduced a new DC6 (60 nodes) into our cluster. The joining
> and rebuilding of DC6 went smoothly, clients are using it without issue.
> This is how it looked after joining DC6:
>
> DC1 - 18 nodes
> DC2 - 18 nodes
> DC3 - 18 nodes
> DC4 - 18 nodes
> DC5 - 18 nodes
> DC6 - 60 nodes
>
> Next we wanted to add another DC7 (also 60 nodes) making it a total of 210
> nodes in the cluster, and while joining new nodes went smoothly, once we
> changed the replication of user defined keyspaces to include DC7, no
> clients were able to connect to Cassandra (regardless of which DC is being
> addressed). They would throw an exception that I have provided at the end
> of the email.
>
> Cassandra version 3.11.4.
> C# driver version 3.12.0. Also tested with 3.14.0. We use dc round robin
> policy and update ring metadata for connecting clients.
> Amount of vnodes per node: 256
>
> The stack trace starts with an exception 'The source argument contains
> duplicate keys.'. Maybe you know what kind of data is in this dictionary?
> What data can be duplicated here?
>
> Clients are unable to connect until the moment we remove DC7 from
> replication. Once replication is adjusted to exclude DC7, clients can
> connect normally.
>
> Cassandra.NoHostAvailableException: All hosts tried for query failed
> (tried <<IPaddress>>:9042: ArgumentException 'The source argument contains
> duplicate keys.') 2020/04/29 10:19:27.51410636
> at Cassandra.Connections.ControlConnection.<Connect>d__39.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Connections.ControlConnection.<InitAsync>d__36.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Tasks.TaskHelper.<WaitToCompleteAsync>d__10.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Cluster.<Cassandra-SessionManagement-IInternalCluster-OnInitializeAsync>d__50.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.ClusterLifecycleManager.<InitializeAsync>d__3.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Cluster.<Cassandra-SessionManagement-IInternalCluster-ConnectAsync>d__47`1.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Cluster.<ConnectAsync>d__46.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> Cassandra.Tasks.TaskHelper.WaitToComplete(Task task, Int32 timeout) 2020/04/29 10:19:27.51410636
> Cassandra.Cluster.Connect() 2020/04/29 10:19:27.51410636
>
> We would really appreciate your input, big thanks in advance.
>
> Gediminas
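
For the system.local / system.peers comparison suggested at the top of this reply, one thing that may be worth ruling out is a token that shows up under more than one host. The following is only a minimal sketch of that check, not a diagnosis of the exception: it assumes the host address and the `tokens` column of each row have already been exported into plain Python data, and the function name and sample values are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_tokens(rows):
    """Return {token: sorted hosts} for every token listed under more than one host.

    `rows` is an iterable of (host, tokens) pairs, e.g. one pair per row
    exported from system.local and system.peers (assumed input shape).
    """
    owners = defaultdict(set)
    for host, tokens in rows:
        for token in tokens:
            owners[token].add(host)
    # Keep only tokens that more than one distinct host claims.
    return {token: sorted(hosts) for token, hosts in owners.items() if len(hosts) > 1}


# Hypothetical sample data, not taken from the cluster in this thread.
rows = [
    ("10.0.0.1", {"-9100", "42"}),
    ("10.0.0.2", {"17", "42"}),
    ("10.0.0.3", {"99"}),
]

duplicates = find_duplicate_tokens(rows)
print(duplicates)  # {'42': ['10.0.0.1', '10.0.0.2']}
```

An empty result means no token is shared between hosts in the exported rows; any entry it does return can then be compared against the nodetool ring output.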