[ https://issues.apache.org/jira/browse/CASSANDRA-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310164#comment-14310164 ]

Tyler Hobbs commented on CASSANDRA-8715:
----------------------------------------

It's not a deadlock. The problem is that a handful of the requests never complete (usually three or four, in my testing). I verified this by dumping {{session.get_pool_state()}}, which shows the number of in-flight requests per connection. These counts were consistent with the number of pending chains (on the countdown latch). Since the Python driver doesn't currently have a way to time out execute_async() + callback requests, these hang indefinitely.

I also observed that with {{async_insert.CONCURRENCY}} set to 1, there were no problems. This leads me to believe that there may be some sort of concurrency problem with Kerberos, either in the driver, the Python kerberos library, or in DSE/Cassandra.

It's worth noting that CASSANDRA-8225 will switch COPY FROM to a single-thread-per-process model, which might consequently avoid this bug.

Also, regarding the double-statement issue, I'm not sure what might be causing that in your environment, but I haven't been able to reproduce it (with or without Kerberos).

> Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using "COPY ... FROM ..."
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8715
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8715
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.1.2.160, cqlsh 5.0.1, Native protocol v3
>            Reporter: Eduard Tudenhoefner
>            Assignee: Tyler Hobbs
>            Priority: Minor
>              Labels: cqlsh
>             Fix For: 2.1.3
>
>
> When running a COPY ... FROM ... command in a Kerberos environment, I see the
> number of rows processed, but eventually, Cqlsh never returns. I can verify,
> that all the data was copied, but the progress bar shows me the last shown
> info and cqlsh hangs there and never returns.
> Please note that this issue did *not* occur in the exact same environment
> with *Cassandra 2.0.12.156*.
> With the help of Tyler Hobbs, I investigated the problem a little bit further
> and added some debug statements at specific points. For example, in the
> CountdownLatch class at
> https://github.com/apache/cassandra/blob/a323a1a6d5f28ced1a51ba559055283f3eb356ff/pylib/cqlshlib/async_insert.py#L35-L36
> I can see that the counter always stays above zero and therefore never
> returns (even when the data to be copied is already copied).
> I've also seen that somehow when I type in one cqlsh command, there will be
> actually two commands. Let me give you an example:
> I added a debug statement just before
> https://github.com/apache/cassandra/blob/d76450c7986202141f3a917b3623a4c3138c1094/bin/cqlsh#L920
> {code}
> cqlsh> use libdata ;
> 2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))]
> 2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))]
> {code}
> and saw that all commands I enter, they end up being executed twice (same
> goes for the COPY command).
> If I can provide any other input for debugging purposes, please let me know.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
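
To illustrate the hang described in the comment above (a countdown latch whose counter never reaches zero because a few requests never complete), here is a minimal sketch. It is not the CountdownLatch from cqlshlib/async_insert.py and it does not use the driver. The class, the simulate() helper, and the timeout parameter are all hypothetical, and it assumes Python 3 for threading.Condition.wait_for(). It only shows that a wait with a timeout returns with a pending count instead of blocking forever.

```python
import threading


class CountdownLatch:
    """Hypothetical latch for illustration; not the cqlshlib implementation."""

    def __init__(self, count):
        self._count = count
        self._cond = threading.Condition()

    def count_down(self):
        # Called once per completed request (e.g. from a success/error callback).
        with self._cond:
            if self._count > 0:
                self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait(self, timeout=None):
        # Returns True if the count reached zero, False if the timeout expired.
        # With timeout=None this blocks forever when a request is lost.
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)

    @property
    def pending(self):
        with self._cond:
            return self._count


def simulate(total, lost, timeout):
    """Complete (total - lost) requests, then wait on the latch with a timeout."""
    latch = CountdownLatch(total)
    workers = [threading.Thread(target=latch.count_down)
               for _ in range(total - lost)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    finished = latch.wait(timeout=timeout)
    return finished, latch.pending


if __name__ == "__main__":
    # Three of ten simulated requests never call count_down().
    print(simulate(total=10, lost=3, timeout=0.1))
```

With lost requests the wait gives up after the timeout and reports how many are still pending, which matches the kind of count seen via the in-flight numbers mentioned in the comment. Whether a timeout is the right fix for cqlsh is a separate question; this only demonstrates the failure mode.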