>>   102. When there is a correlated hard failure (e.g., power outage), it's
>>   possible that an existing commit/abort marker is lost in all replicas.
>>   This may not be fixed by the transaction coordinator automatically and
>>   the consumer may get stuck on that incomplete transaction forever. Not
>>   sure what's the best way to address this. Perhaps, one way is to run a
>>   tool to add an abort marker for all pids in all affected partitions.
>
>

> There can be two types of tools, one for diagnosing the issue and another
> for fixing the issue. I think having at least a diagnostic tool in the
> first version could be helpful. For example, the tool can report things
> like which producer id is preventing the LSO from being advanced. That way,
> at least the users can try to fix this themselves.
>


That sounds reasonable. Will add a work item to track this so that such a
tool is available in the first version.
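
As a rough illustration of what the diagnostic half could report, here is a
minimal sketch of the check. It is not the actual tool: the Record type, its
field names, and find_lso_blocker are all hypothetical, and it assumes the
partition log has already been decoded into (offset, pid, kind) entries. It
scans the log, tracks each pid's open transaction, and returns the pid whose
earliest unterminated transaction is holding the LSO back.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical decoded log entry; not a real Kafka class."""
    offset: int
    pid: int
    kind: str  # "data", "commit", or "abort"


def find_lso_blocker(records: Iterable[Record]) -> Optional[Tuple[int, int]]:
    """Return (pid, first_offset) of the oldest open transaction, or None.

    Assumes records are in offset order and that every "data" record
    belongs to a transaction of its pid.
    """
    open_txns = {}  # pid -> first offset of its currently open transaction
    for rec in records:
        if rec.kind == "data":
            # Only the first data record of a transaction sets its start.
            open_txns.setdefault(rec.pid, rec.offset)
        elif rec.kind in ("commit", "abort"):
            # A marker closes whatever transaction this pid had open.
            open_txns.pop(rec.pid, None)
        else:
            raise ValueError(f"unknown record kind: {rec.kind!r}")
    if not open_txns:
        return None  # nothing is open, so nothing blocks the LSO
    pid = min(open_txns, key=open_txns.get)
    return pid, open_txns[pid]


if __name__ == "__main__":
    log = [
        Record(0, 1, "data"),
        Record(1, 2, "data"),    # pid 2 never gets a marker
        Record(2, 1, "commit"),
        Record(3, 3, "data"),
    ]
    print(find_lso_blocker(log))
```

In this example pid 2 is reported, since its transaction starting at offset 1
was never committed or aborted. That is the kind of output that would let a
user decide where an abort marker needs to be written.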
