Re: fsck output compatibility question with regard to HDFS-7281

2015-05-05 Thread Colin McCabe
How about just having a --json option for the fsck command? That's what we did in Ceph for some command line tools. It would make the output easier to consume and easier to provide compatibility for. Colin On Apr 28, 2015 12:32 PM, "Allen Wittenauer" wrote: > > A lot of the summary information

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-28 Thread Allen Wittenauer
A lot of the summary information… but the key parts of “yo, these files are busted and here’s why” is not, IIRC. That’s one of the key items where people are parsing fsck output (and worse, usually under duress.) On Apr 28, 2015, at 12:23 PM, Mai Haohui wrote: > In terms of the monitoring, w

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-28 Thread Mai Haohui
In terms of the monitoring, we have put a lot of information into the JMX output. It's relatively easy to use python / ruby / node.js to write your own tools to parse the information. In the longer term, it might also make sense to move some of our tools to be based on the JMX output instead of makin
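
As a rough illustration of the JMX approach mentioned in this message, here is a minimal Python sketch that pulls block-health counters out of a JMX-style JSON document. It assumes the payload has a top-level "beans" list, as the NameNode /jmx servlet returns. The bean name and the attribute names (MissingBlocks, CorruptBlocks, UnderReplicatedBlocks) are assumptions to check against your own cluster, and the sample values are made up.

```python
import json

# Illustrative payload only. The {"beans": [...]} shape follows the /jmx
# servlet; the bean name, attribute names and values are assumptions.
SAMPLE_JMX = """
{"beans": [{"name": "Hadoop:service=NameNode,name=FSNamesystem",
            "MissingBlocks": 29,
            "CorruptBlocks": 29,
            "UnderReplicatedBlocks": 0}]}
"""

FSNAMESYSTEM_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"  # assumed name


def read_bean(payload, bean_name):
    """Return the bean dict whose "name" equals bean_name, or None."""
    for bean in json.loads(payload).get("beans", []):
        if bean.get("name") == bean_name:
            return bean
    return None


def block_health(payload):
    """Extract only the counters we care about, ignoring everything else.

    Looking fields up by key means that new attributes added to the bean
    later do not affect this code.
    """
    bean = read_bean(payload, FSNAMESYSTEM_BEAN)
    if bean is None:
        return {}
    wanted = ("MissingBlocks", "CorruptBlocks", "UnderReplicatedBlocks")
    return {key: bean[key] for key in wanted if key in bean}


if __name__ == "__main__":
    print(block_health(SAMPLE_JMX))
```

In practice the payload would come from an HTTP GET against the NameNode's /jmx endpoint rather than from a string literal; that part is left out here so the sketch stays self-contained.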

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-28 Thread Andrew Wang
On Tue, Apr 28, 2015 at 11:25 AM, Allen Wittenauer wrote: > > On Apr 28, 2015, at 10:59 AM, Andrew Wang > wrote: > > > > This is also not something typically upheld by unix-y commands. BSD vs. > GNU > > already leads to incompatible flags and output. Most of these commands > > haven't been chang

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-28 Thread Andrew Wang
Hi Rich, You're essentially proposing feature flags. We've discussed this before wrt namenode metadata, and the complexity from supporting the combinations of all the flags is substantial. It's even harder in the domain of shell output, since we don't even have a standard way of parsing. Also ima

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-28 Thread Allen Wittenauer
On Apr 28, 2015, at 10:59 AM, Andrew Wang wrote: > > This is also not something typically upheld by unix-y commands. BSD vs. GNU > already leads to incompatible flags and output. Most of these commands > haven't been changed in 20 years, but that doesn't constitute a compat > guarantee.

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-28 Thread Rich Haase
I'm late to the discussion so I apologize if this has already been suggested. Can't we just add new options flags to include new cli output? Seems like that would work regardless of the cli being changed. If compatibility is broken it could be done as part of a major release. Eg. Add a

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-28 Thread Andrew Wang
I'm surprised by this compatibility requirement. It's quite onerous, since it means we can't evolve the output at all. There's no standardized way to parse CLI output, so who knows what might break user scripts. e.g. if we wanted to display a "+" for ACLs in ls output, that'd be incompatible. Same

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-24 Thread Yongjun Zhang
Thanks Chris, good clarification! --Yongjun On Fri, Apr 24, 2015 at 12:36 PM, Chris Nauroth wrote: > Metrics/JMX is covered by our compatibility guidelines: > > http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/Compatibility.html#MetricsJMX > > > Metrics/JMX is similar t

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-24 Thread Chris Nauroth
Metrics/JMX is covered by our compatibility guidelines: http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/Compatibility.html#MetricsJMX Metrics/JMX is similar to our usage of Protocol Buffers/JSON that I mentioned. It supports backwards-compatible evolution if the change i

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-24 Thread Yongjun Zhang
Thanks Allen and Chris! What about adding new entries to jmx report? Somehow I had the impression that if we add new entries to it, it's not considered incompatible. Often within the same minor release, we want to add new info to jmx report instead of waiting for a major release. For CLI like fsc

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-24 Thread Chris Nauroth
Allen, thank you for calling this out. I was not aware of this part of the compatibility guidelines. I committed one of those fsck changes in HDFS-7933. I see you flagged the issue as incompatible, which agrees with the compatibility guidelines. "Changing the path of a command, removing or rena

Re: fsck output compatibility question with regard to HDFS-7281

2015-04-24 Thread Allen Wittenauer
On Apr 24, 2015, at 5:53 AM, Yongjun Zhang wrote: > > Basically we are adding two additional lines to the report (as highlighted > above). > > Theoretically if a tool parses existing fsck report and expects the > "Corrupt blocks" entry to be right after the "Average block replication" > entry,
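
The order-dependence concern quoted here can be made concrete with a small Python sketch. It parses "Key: value" summary lines into a dictionary, so a consumer looks entries up by name rather than by position. The sample lines reuse entries quoted elsewhere in this thread; the extra line in NEW_REPORT is a hypothetical stand-in for an added entry, not the actual HDFS-7281 output.

```python
def parse_fsck_summary(text):
    """Parse 'Key: value' lines into a dict, splitting on the first colon.

    Lines without a colon, or with an empty key or value, are skipped.
    This is a sketch: it does not handle values that themselves begin a
    new section, or path lines that happen to contain colons.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key and value:
            fields[key] = value
    return fields


# Entries taken from the report excerpt quoted in this thread.
OLD_REPORT = """\
 CORRUPT FILES:29
 MISSING BLOCKS: 29
 MISSING SIZE: 576920501 B
 Default replication factor:3
 Average block replication: 2.7412367
"""

# Same report with one extra line in the middle. "Some new entry" is a
# hypothetical placeholder for whatever a later release might add.
NEW_REPORT = """\
 CORRUPT FILES:29
 MISSING BLOCKS: 29
 MISSING SIZE: 576920501 B
 Default replication factor:3
 Some new entry: 0
 Average block replication: 2.7412367
"""

if __name__ == "__main__":
    for report in (OLD_REPORT, NEW_REPORT):
        summary = parse_fsck_summary(report)
        print(summary["Average block replication"])
```

A tool written this way keeps working when lines are added, but it still breaks if an existing key is renamed or removed, which is the kind of change the compatibility guidelines discussed in this thread are concerned with.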

fsck output compatibility question with regard to HDFS-7281

2015-04-23 Thread Yongjun Zhang
Hi, For HDFS-7281, we are making a change in fsck report: Before the change: CORRUPT FILES:29 MISSING BLOCKS: 29 MISSING SIZE: 576920501 B CORRUPT BLOCKS: 29 ... Default replication factor:3 Average block replication: 2.7412367 Corrupt blocks: