On Tue, Oct 16, 2012 at 11:04 AM, Matt Benson <gudnabr...@gmail.com> wrote:
> Random thoughts--no real context here, so no way to inline: > > - "line separator" concept, while harmonizing with the line.separator > system property, might be better represented as "row separator" so as > not to imply that the parameter should be in any way limited to \r or > \n . I would think the default for this would be the line.separator > property, however, and thus should take a String or CharSequence > (perhaps it already does, but there's been so much talk about char > parameters...). > Now that you mention it, this should have been obvious as soon as we wrote the test cases where a record is split over more than one line. There is a difference between line number and record number which the API tracks. I propose to change "line separator" to "record separator". The default can be line.separator. Gary > > - with* methods: just something to think about here, but while we're > creating a fluent API, would e.g. #delimitedBy('\t') read more > fluently than #withDelimiter('\t') ? #escapingWith('\\') vs. > #withEscape('\\') ? > > $0.02, > Matt > > On Tue, Oct 16, 2012 at 8:53 AM, Jörg Schaible > <joerg.schai...@scalaris.com> wrote: > > Gary Gregory wrote: > > > >> On Tue, Oct 16, 2012 at 9:14 AM, Jörg Schaible > >> <joerg.schai...@scalaris.com>wrote: > >> > >>> Hi Gary, > >>> > >>> Gary Gregory wrote: > >>> > >>> > Hi All: > >>> > > >>> > The format object can configure various aspects of input and output > >>> > formatting. > >>> > > >>> > With my recent addition of the Quote enum for [CSV-53], there are now > >>> > two aspects of quoting to configure: the quote character and the > quote > >>> > policy (minimal, all, non-numeric, and none.) FYI, 'none' is > currently > >>> > not implemented. > >>> > > >>> > First, I changed (without consulting this list, and please accept my > >>> > apologies for this) the - IMO - cryptic and burdensome terminology of > >>> > "encapsulator" to "quote char", and added "quote policy": > >>> > > >>> > - withQuoteChar(char) > >>> > - withQuotePolicy(Quote) > >>> > > >>> > My intention here is that all Quote APIs start with "withQuote" > >>> > followed by what aspect of quoting is being configured. > >>> > > >>> > Alternatively, we could have: > >>> > > >>> > - withQuote(char) > >>> > - withQuotePolicy(Quote) > >>> > >>> or > >>> > >>> - withQuote(char) > >>> - withQuote(Quote) > >>> > >>> ;-) > >>> > >> > >> Darn, I wish I knew you better to know if you were joking! :) > >> > >> This would not be good IMO because you are configuring two different > >> aspects of the behavior. When I see the same API name with different > >> parameters, I think that they are the same and that the API just does > >> conversions. > >> > >> We could consider making Quote a class instead of an enum and have it > >> carry a char and an enum, such that one object defines all quoting > >> aspects. This might be too normalized a design for something so simple > >> though. > > > > Actually I did not had a closer look to the API. You're definitely right > to > > use different names for different aspects. It does not make sense to > > overload just for fun. > > > >> > >> > >>> > >>> > Which makes the API more consistent with the other char/Character > based > >>> > properties: > >>> > > >>> > - withEscape > >>> > - withDelimiter > >>> > - withLineSeparator > >>> > - withCommentStart > >>> > > >>> > none of the above are post-fixed with a "Char" in the name. > >>> > > >>> > As far as reading, for me, the "-r" names are OK because the they are > >>> > nouns (things): "a delimiter", "a line separator." But I do not talk > >>> about > >>> > "an escape" because that would be an act (think Alcatraz) as opposed > to > >>> > what we have here: a character used to /perform/ escapes. > >>> > > >>> > So I propose to change "escape" to "escape char" because "escaper" is > >>> > not a word. > >>> > > >>> > The name "comment start" is not great also because it implies (to me) > >>> that > >>> > there is a "comment end" missing. So plain "comment" or "comment > char" > >>> > would be better. > >>> > >>> Who said it has to be a single char? > >>> > >> > >> The current implementation does. ;) > >> > >> Are comments even in any RFC? > > > > Not that I am aware of. > > > >>> .withEOLComment("//") > >>> > >>> > >>> Same applies to the line separator: > >>> > >>> .withLineSeparator("\n\r") > >>> > >>> > Circling back to "quote char" which I have the way it is now for the > >>> > same reason as for the "escape" property. > >>> > > >>> > In summary, using *Char names is better IMO. > >>> > >>> Only if it can be a single char only. If it can either be a single char > >>> or a > >>> String, I normally tend to use overloaded methods: > >>> > >>> - withEOLComment(char) > >>> - withEOLComment(CharSequence) > >>> > >> > >> If you want to add // to the mix, please start a different thread. I'm > not > >> sure this is really needed. Do you have a real life use case? > > > > People come up with all kind of "solutions" they are used to. CSV is > brittle > > anyway, just because there is no "real" standard. > > > > Cheers, > > Jörg > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- E-Mail: garydgreg...@gmail.com | ggreg...@apache.org JUnit in Action, 2nd Ed: <http://goog_1249600977>http://bit.ly/ECvg0 Spring Batch in Action: <http://s.apache.org/HOq>http://bit.ly/bqpbCK Blog: http://garygregory.wordpress.com Home: http://garygregory.com/ Tweet! http://twitter.com/GaryGregory