arloliu opened a new pull request, #1952:
URL: https://github.com/apache/cassandra-gocql-driver/pull/1952

   ## Summary
   
   `Conn.executeQuery` and `Conn.executeBatch` both respond to a 
`*RequestErrUnprepared` by evicting the prepared-statement cache entry and 
calling themselves again, with no upper bound on the recursion depth. When the 
server keeps reporting the same statement as unprepared after each re-prepare 
(for example, a coordinator thrashing its prepared-statement cache under high 
statement cardinality, or a misbehaving proxy/fork), the recursion never 
terminates. The goroutine stack eventually exceeds the limit set by 
`runtime/debug.SetMaxStack` (1 GB by default on 64-bit systems), at which point 
the Go runtime aborts the **entire process** with a fatal stack-overflow error. 
This is not a panic, so no `recover()` can intercept it.
   
   The failure mode is reachable from any prepared-statement workload, and the 
batch path is on the hot write path. No attacker is required: organic 
prepared-statement cache thrash on the Cassandra side is sufficient.
   
   This PR caps re-prepare retries on both paths at 5 and surfaces a 
descriptive error wrapping the underlying `*RequestErrUnprepared` so 
`errors.As` continues to work.
   
   ## Approach
   
   - New unexported helpers `executeQueryWithUnprepRetries` / 
`executeBatchWithUnprepRetries` take an `unprepAttempt int` and increment it on 
each retry.
   - Existing entry points (`executeQuery`, `executeBatch`) become thin 
wrappers that start the counter at 0.
   - When the counter reaches `maxUnprepRetries = 5`, the iter is returned with 
`err = fmt.Errorf("…after N re-prepare attempts: %w", serverErr)`.
   - Behavior is otherwise unchanged: queries and batches that succeed within 
5 attempts behave exactly as before. Only the pathological no-progress case 
changes.
   
   ## Tests
   
   The fake test server (`conn_test.go`) gains:
   - an `always-unprep` `opPrepare` case returning `id=99`,
   - an `opBatch` arm that replies `ErrCodeUnprepared` when any statement 
carries `id=99`,
   - case-insensitive verb-trimming in the `opPrepare` query-name parser 
(`select|insert|update|delete`) so DML in batches can reach the `always-unprep` 
case.
   
   `unprep_retry_test.go` (new):
   - `TestExecuteQuery_UnprepRetryIsCapped` drives the always-unprep path 
through `Query.Exec`.
   - `TestExecuteBatch_UnprepRetryIsCapped` drives it through 
`Session.ExecuteBatch`.
   - Each asserts: no infinite recursion, the error mentions 
`"re-prepare attempts"`, `errors.As` with a `*RequestErrUnprepared` target 
succeeds, and the server received exactly `maxUnprepRetries + 1` 
prepare/execute (or, for the batch test, prepare/batch) pairs.
   
   ## Test plan
   
   - [x] `make check` clean
   - [x] `make test-unit` green (both new tests pass)
   - [ ] CI on GitHub Actions
   
   ## Notes
   
   No CASSGO ticket attached; happy to file one and amend the commit / 
CHANGELOG entry if the committer prefers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

