Re: how to read parent_repair_history table?

Paulo Motta Thu, 25 Feb 2016 10:37:52 -0800

> Why will each repair job execution have 2 entries? I thought it would be
> one entry, beginning with the started_at column filled, and when it
> completes, the finished_at column will be filled.


That's correct, I was mistaken: each repair execution gets a single entry,
not two.

> Also, if my cluster has more than 1 keyspace, then given the way this
> table is structured, it will have multiple entries, one for each
> keyspace_name value, no? Thanks

Right, because repair sessions in different keyspaces will have different
repair session ids.
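
For anyone scripting this check, here is a minimal Python sketch of the rules
discussed in this thread: a run counts as good only if finished_at is set,
there is no exception, and every requested range also shows up in
successful_ranges. The function names are made up for illustration, and rows
are assumed to be plain dicts keyed by column name, however you fetch them
from system_distributed.parent_repair_history. Note that gc_grace_seconds is
really a per-table setting, so passing a single value is a simplification, and
this does not verify that every node was covered.

```python
from datetime import datetime, timedelta


def is_successful(row):
    """True if the row finished, has no exception, and all requested ranges succeeded."""
    if row.get("finished_at") is None:
        return False  # repair still running, or it never completed
    if row.get("exception_message") or row.get("exception_stacktrace"):
        return False  # repair finished with an error
    requested = set(row.get("requested_ranges") or ())
    successful = set(row.get("successful_ranges") or ())
    # Assumption: both columns hold the same textual form of each range.
    return requested <= successful


def last_good_repair(rows):
    """Map each keyspace_name to the finished_at of its most recent good run."""
    latest = {}
    for row in rows:
        if not is_successful(row):
            continue
        keyspace = row["keyspace_name"]
        finished = row["finished_at"]
        if keyspace not in latest or finished > latest[keyspace]:
            latest[keyspace] = finished
    return latest


def overdue_keyspaces(rows, keyspaces, gc_grace_seconds, now):
    """Keyspaces with no good run finished within the last gc_grace_seconds."""
    latest = last_good_repair(rows)
    cutoff = now - timedelta(seconds=gc_grace_seconds)
    return sorted(
        ks for ks in keyspaces
        if latest.get(ks) is None or latest[ks] < cutoff
    )
```

Calling overdue_keyspaces(rows, known_keyspaces, 864000, datetime.utcnow())
would list the keyspaces that have gone ten days without a good run, which are
the ones to troubleshoot first.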

2016-02-25 15:04 GMT-03:00 Jimmy Lin <y2k...@gmail.com>:

> hi Paulo,
>
> follow up on the # of entries question...
>
> Why will each repair job execution have 2 entries?
> I thought it would be one entry, beginning with the started_at column
> filled, and when it completes, the finished_at column will be filled.
>
> Also, if my cluster has more than 1 keyspace, then given the way this table
> is structured, it will have multiple entries, one for each keyspace_name
> value, no?
>
> thanks
>
>
> On Feb 25, 2016, at 5:48 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
>
> Hello Jimmy,
>
> The parent_repair_history table keeps track of start and finish
> information of a repair session. The other table, repair_history, keeps
> track of repair status as it progresses. So you should first query the
> parent_repair_history table to check whether a repair started and
> finished, as well as its duration, and then inspect the repair_history
> table to troubleshoot more specific details of a given repair session.
>
> Answering your questions below:
>
> > Will every invocation of nodetool repair be recorded as one entry in the
> > parent_repair_history CF, regardless of whether it is a cross-DC repair,
> > a local node repair, or uses other options?
>
> Actually two entries, one for start and one for finish.
>
> > A repair job is done only if the finished_at column contains a value?
> > And a repair job is successfully done only if there is no value in
> > exception_message or exception_stacktrace?
>
> correct
>
> > What is the purpose of the successful_ranges column? Do I have to check
> > that they all match requested_ranges to ensure a successful run?
>
> correct
>
> > Ultimately, how do I find out the overall repair health/status in a
> > given cluster?
>
> Check if repair is being executed on all nodes within gc_grace_seconds,
> and tune that value or troubleshoot problems otherwise.
>
> > By scanning through parent_repair_history and making sure all the known
> > keyspaces have had a good repair run in recent days?
>
> Sounds good.
>
> You can check https://issues.apache.org/jira/browse/CASSANDRA-5839 for
> more information.
>
>
> 2016-02-25 3:13 GMT-03:00 Jimmy Lin <y2klyf+w...@gmail.com>:
>
>>
>> hi all,
>> A few questions regarding how to read or digest the
>> system_distributed.parent_repair_history CF, which I am very interested
>> in using to find out our repair status...
>>
>> -
>> Will every invocation of nodetool repair be recorded as one entry in the
>> parent_repair_history CF, regardless of whether it is a cross-DC repair,
>> a local node repair, or uses other options?
>>
>> -
>> A repair job is done only if the finished_at column contains a value? And
>> a repair job is successfully done only if there is no value in
>> exception_message or exception_stacktrace?
>> What is the purpose of the successful_ranges column? Do I have to check
>> that they all match requested_ranges to ensure a successful run?
>>
>> -
>> Ultimately, how do I find out the overall repair health/status in a given
>> cluster?
>> By scanning through parent_repair_history and making sure all the known
>> keyspaces have had a good repair run in recent days?
>>
>> ---------------
>> CREATE TABLE system_distributed.parent_repair_history (
>>     parent_id timeuuid PRIMARY KEY,
>>     columnfamily_names set<text>,
>>     exception_message text,
>>     exception_stacktrace text,
>>     finished_at timestamp,
>>     keyspace_name text,
>>     requested_ranges set<text>,
>>     started_at timestamp,
>>     successful_ranges set<text>
>> )
>>
>
>
