Hi,

I have test setup where clients randomly make a controlled number of cas()
requests (among other requests) at a cluster of cassandra 2.0 servers.
After one point, I'm seeing that all requests are pending and my client's
throughput has reduced to 0.0 for all kinds of requests. For this specific
case I had 10 clients each making around 30 cas() requests per second at a
cluster of 72 instances of cassandra.

Clients are set up to register a request as a success after the cas() call
returns with CASResult.success = true, else an exception is thrown. Since I
see that no client requests were actually registered and no exceptions were
thrown, which indicates that the cas() call itself is hung.

On the server side, I see Paxos logs as follows - they go on for 50 log
files for each of the servers involved, and they span at least an hour. I
have marked a particular instance where the prepare response is true but
the propose response is false from all the involved servers:

*At the Paxos Initiator: * None of the files among the 50 system logs have
the phrase 'Propose response true', these logs just go on and on.
*
*
DEBUG [RequestResponseStage:110] 2013-07-25 15:09:05,332
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d2093120-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))) from /
17.163.7.195
*
*
DEBUG [RequestResponseStage:92] 2013-07-25 15:09:05,346
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d2093120-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))) from /
17.163.7.184

DEBUG [RequestResponseStage:98] 2013-07-25 15:09:05,347
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d2093120-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))) from /
17.163.7.20

DEBUG [RequestResponseStage:93] 2013-07-25 15:09:05,350
ProposeCallback.java (line 44) Propose response false from /17.163.7.20
DEBUG [RequestResponseStage:100] 2013-07-25 15:09:05,350
ProposeCallback.java (line 44) Propose response false from /17.163.7.184
DEBUG [RequestResponseStage:111] 2013-07-25 15:09:05,350
ProposeCallback.java (line 44) Propose response false from /17.163.7.195

DEBUG [RequestResponseStage:102] 2013-07-25 15:09:05,351
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))) from /
17.163.7.195

DEBUG [RequestResponseStage:107] 2013-07-25 15:09:05,352
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))) from /
17.163.7.20

DEBUG [RequestResponseStage:108] 2013-07-25 15:09:05,352
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))) from /
17.163.7.184

DEBUG [RequestResponseStage:104] 2013-07-25 15:09:05,352
ProposeCallback.java (line 44) Propose response false from /17.163.7.20
DEBUG [RequestResponseStage:99] 2013-07-25 15:09:05,353
ProposeCallback.java (line 44) Propose response false from /17.163.7.195
DEBUG [RequestResponseStage:105] 2013-07-25 15:09:05,353
ProposeCallback.java (line 44) Propose response false from /17.163.7.184

*At 17.163.7.20:*
*
*
DEBUG [MutationStage:58] 2013-07-25 15:09:05,347 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20b05e0-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20b7b10-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))

DEBUG [MutationStage:40] 2013-07-25 15:09:05,349 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20b7b10-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20bc930-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))

DEBUG [MutationStage:42] 2013-07-25 15:09:05,351 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20bc930-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c6570-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))

*DEBUG [MutationStage:43] 2013-07-25 15:09:05,352 PaxosState.java (line
100) accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c6570-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))*

At 17.163.7.195:

DEBUG [MutationStage:33] 2013-07-25 15:09:05,352 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20bc930-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c6570-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))

DEBUG [RequestResponseStage:38] 2013-07-25 15:09:05,352
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(658db3eababc5629b8bcb77dc84db81a,
0193c330-f561-11e2-ad78-7bbb6a42087c, ColumnFamily(P
[62caa268bd7e90050000000000000001:false:21@1374780776163000,])),
Commit(658db3eababc5629b8bcb77dc84db81a,
d20c3e60-f576-11e2-b3a3-edb9f71dce8f, ColumnFamily(P
[62caa268bd7e90050000000000000001:false:21@1374780776163000,]))) from /
17.163.7.162

DEBUG [RequestResponseStage:48] 2013-07-25 15:09:05,353
ProposeCallback.java (line 44) Propose response false from /17.163.7.194
DEBUG [RequestResponseStage:54] 2013-07-25 15:09:05,353
ProposeCallback.java (line 44) Propose response false from /17.163.7.162

*DEBUG [MutationStage:35] 2013-07-25 15:09:05,353 PaxosState.java (line
100) accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c6570-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))*

*At 17.163.7.184:*

DEBUG [MutationStage:53] 2013-07-25 15:09:05,347 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20b05e0-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20b7b10-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))

DEBUG [MutationStage:38] 2013-07-25 15:09:05,348 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20b7b10-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20bc930-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))

DEBUG [MutationStage:64] 2013-07-25 15:09:05,351 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20bc930-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c6570-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))

*DEBUG [MutationStage:33] 2013-07-25 15:09:05,351 PaxosState.java (line
100) accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,])) but
inProgress is now Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c6570-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0000000000000003:false:27@1374780817218000,]))*


Thanks,
Soumava

Reply via email to