> Why will each repair job execution have 2 entries? I thought it would be one entry, beginning with the started_at column filled, and when it completed, the finished_at column would be filled.
That's correct, I was mistaken!

> Also, if my cluster has more than 1 keyspace, and the way this table is structured, it will have multiple entries, one for each keyspace_name value, no? Thanks

Right, because repair sessions in different keyspaces will have different repair session ids.

2016-02-25 15:04 GMT-03:00 Jimmy Lin <y2k...@gmail.com>:

> Hi Paulo,
>
> Follow up on the # of entries question...
>
> Why will each repair job execution have 2 entries?
> I thought it would be one entry, beginning with the started_at column filled, and
> when it completed, the finished_at column would be filled.
>
> Also, if my cluster has more than 1 keyspace, and the way this table is
> structured, it will have multiple entries, one for each keyspace_name value,
> no?
>
> Thanks
>
> On Feb 25, 2016, at 5:48 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
>
> Hello Jimmy,
>
> The parent_repair_history table keeps track of start and finish
> information of a repair session. The other table, repair_history, keeps
> track of repair status as it progresses. So, you must first query the
> parent_repair_history table to check if a repair started and finished, as
> well as its duration, and inspect the repair_history table to troubleshoot
> more specific details of a given repair session.
>
> Answering your questions below:
>
> > Will every invocation of nodetool repair be recorded as one entry in the
> > parent_repair_history CF, regardless of whether it is across DCs, a local
> > node repair, or uses other options?
>
> Actually two entries, one for start and one for finish.
>
> > Is a repair job done only if the "finished_at" column contains a value? And
> > is a repair job successfully done only if there is no value in
> > exception_message or exception_stacktrace?
>
> Correct.
>
> > What is the purpose of the successful_ranges column? Do I have to check
> > that they all match requested_ranges to ensure a successful run?
>
> Correct.
>
> > Ultimately, how do I find out the overall repair health/status in a given
> > cluster?
>
> Check if repair is being executed on all nodes within gc_grace_seconds,
> and tune that value or troubleshoot problems otherwise.
>
> > Scanning through parent_repair_history and making sure all the known
> > keyspaces have a good repair run in recent days?
>
> Sounds good.
>
> You can check https://issues.apache.org/jira/browse/CASSANDRA-5839 for
> more information.
>
> 2016-02-25 3:13 GMT-03:00 Jimmy Lin <y2klyf+w...@gmail.com>:
>
>> Hi all,
>> A few questions regarding how to read or digest the
>> system_distributed.parent_repair_history CF, which I am very interested in
>> using to find out our repair status...
>>
>> - Will every invocation of nodetool repair be recorded as one entry in the
>>   parent_repair_history CF, regardless of whether it is across DCs, a local
>>   node repair, or uses other options?
>>
>> - Is a repair job done only if the "finished_at" column contains a value?
>>   And is a repair job successfully done only if there is no value in
>>   exception_message or exception_stacktrace?
>>   What is the purpose of the successful_ranges column? Do I have to check
>>   that they all match requested_ranges to ensure a successful run?
>>
>> - Ultimately, how do I find out the overall repair health/status in a given
>>   cluster?
>>   Scanning through parent_repair_history and making sure all the known
>>   keyspaces have a good repair run in recent days?
>>
>> ---------------
>> CREATE TABLE system_distributed.parent_repair_history (
>>     parent_id timeuuid PRIMARY KEY,
>>     columnfamily_names set<text>,
>>     exception_message text,
>>     exception_stacktrace text,
>>     finished_at timestamp,
>>     keyspace_name text,
>>     requested_ranges set<text>,
>>     started_at timestamp,
>>     successful_ranges set<text>
>> )
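
For reference, the checks agreed on in this thread (finished_at is set, no exception recorded, successful_ranges matches requested_ranges, and every keyspace has a recent good run) could be sketched roughly as below. This is only an illustration, not anything shipped with Cassandra: the function names is_successful and keyspaces_needing_repair are made up, the rows are assumed to have already been read from parent_repair_history into plain dicts keyed by column name, and max_age is an assumed threshold you would choose yourself (something below gc_grace_seconds).

```python
from datetime import datetime, timedelta


def is_successful(row):
    """True if the repair finished, recorded no exception, and every
    requested range also appears in successful_ranges."""
    return (
        row.get("finished_at") is not None
        and not row.get("exception_message")
        and not row.get("exception_stacktrace")
        and set(row.get("requested_ranges") or ())
        == set(row.get("successful_ranges") or ())
    )


def keyspaces_needing_repair(rows, keyspaces, now, max_age):
    """Return the keyspaces that have no successful repair which
    finished within max_age of now (hypothetical helper)."""
    last_good = {}
    for row in rows:
        if is_successful(row):
            ks = row["keyspace_name"]
            if ks not in last_good or row["finished_at"] > last_good[ks]:
                last_good[ks] = row["finished_at"]
    cutoff = now - max_age
    return sorted(
        ks for ks in keyspaces
        if ks not in last_good or last_good[ks] < cutoff
    )


# Made-up sample rows, one per keyspace, as the thread describes.
now = datetime(2016, 2, 25, 12, 0, 0)
sample_rows = [
    {   # finished yesterday, all ranges repaired
        "keyspace_name": "ks1",
        "started_at": now - timedelta(days=1, hours=1),
        "finished_at": now - timedelta(days=1),
        "exception_message": None,
        "exception_stacktrace": None,
        "requested_ranges": {"(0,10]"},
        "successful_ranges": {"(0,10]"},
    },
    {   # started but never finished
        "keyspace_name": "ks2",
        "started_at": now - timedelta(hours=2),
        "finished_at": None,
        "exception_message": None,
        "exception_stacktrace": None,
        "requested_ranges": {"(0,10]"},
        "successful_ranges": None,
    },
]
stale = keyspaces_needing_repair(
    sample_rows, ["ks1", "ks2"], now, timedelta(days=10)
)
print(stale)
```

Note that the schema above has no node column, so this only answers "did each keyspace get a good run recently", not "did every node run repair"; the latter would still need to be checked per node.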