On Tue, Oct 16, 2012 at 11:04 AM, Matt Benson <gudnabr...@gmail.com> wrote:

> Random thoughts--no real context here, so no way to inline:
>
> - "line separator" concept, while harmonizing with the line.separator
> system property, might be better represented as "row separator" so as
> not to imply that the parameter should be in any way limited to \r or
> \n .  I would think the default for this would be the line.separator
> property, however, and thus should take a String or CharSequence
> (perhaps it already does, but there's been so much talk about char
> parameters...).
>

Now that you mention it, this should have been obvious as soon as we wrote
the test cases where a record is split over more than one line.

There is a difference between line number and record number which the API
tracks.

I propose to change "line separator" to "record separator". The default can
be line.separator.
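
To make the line/record distinction concrete, here is a minimal sketch. It is not Commons CSV code: RecordSplitter, splitRecords and countLines are hypothetical names invented for this illustration, and I am assuming a double quote as the quote character and a String-valued record separator.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Commons CSV. */
final class RecordSplitter {

    /** Splits input on recordSeparator, ignoring separators inside double quotes. */
    static List<String> splitRecords(String input, String recordSeparator) {
        if (recordSeparator.isEmpty()) {
            throw new IllegalArgumentException("record separator must not be empty");
        }
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '"') {
                // Assumed quote char; a doubled quote toggles twice and stays quoted.
                inQuotes = !inQuotes;
            }
            if (!inQuotes && input.startsWith(recordSeparator, i)) {
                // End of record: the separator itself is not part of the record.
                records.add(current.toString());
                current.setLength(0);
                i += recordSeparator.length();
            } else {
                current.append(c);
                i++;
            }
        }
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    /** Counts physical lines; a trailing '\n' does not start a new line. */
    static int countLines(String input) {
        if (input.isEmpty()) {
            return 0;
        }
        int lines = 1;
        for (int i = 0; i < input.length() - 1; i++) {
            if (input.charAt(i) == '\n') {
                lines++;
            }
        }
        return lines;
    }
}
```

For the input "id,note\n1,\"first line\nsecond line\"\n2,plain\n" with "\n" as the separator, splitRecords returns 3 records while countLines returns 4. Since the separator is a String, "\r\n" or even "|" would work the same way, and System.lineSeparator() could supply the default.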

Gary


>
> - with* methods:  just something to think about here, but while we're
> creating a fluent API, would e.g. #delimitedBy('\t') read more
> fluently than #withDelimiter('\t') ?  #escapingWith('\\') vs.
> #withEscape('\\') ?
>
> $0.02,
> Matt
>
> On Tue, Oct 16, 2012 at 8:53 AM, Jörg Schaible
> <joerg.schai...@scalaris.com> wrote:
> > Gary Gregory wrote:
> >
> >> On Tue, Oct 16, 2012 at 9:14 AM, Jörg Schaible
> >> <joerg.schai...@scalaris.com> wrote:
> >>
> >>> Hi Gary,
> >>>
> >>> Gary Gregory wrote:
> >>>
> >>> > Hi All:
> >>> >
> >>> > The format object can configure various aspects of input and output
> >>> > formatting.
> >>> >
> >>> > With my recent addition of the Quote enum for [CSV-53], there are now
> >>> > two aspects of quoting to configure: the quote character and the
> >>> > quote policy (minimal, all, non-numeric, and none). FYI, 'none' is
> >>> > currently not implemented.
> >>> >
> >>> > First, I changed (without consulting this list, and please accept my
> >>> > apologies for this) the - IMO - cryptic and burdensome terminology of
> >>> > "encapsulator" to "quote char", and added "quote policy":
> >>> >
> >>> > - withQuoteChar(char)
> >>> > - withQuotePolicy(Quote)
> >>> >
> >>> > My intention here is that all Quote APIs start with "withQuote"
> >>> > followed by what aspect of quoting is being configured.
> >>> >
> >>> > Alternatively, we could have:
> >>> >
> >>> > - withQuote(char)
> >>> > - withQuotePolicy(Quote)
> >>>
> >>> or
> >>>
> >>> - withQuote(char)
> >>> - withQuote(Quote)
> >>>
> >>> ;-)
> >>>
> >>
> >> Darn, I wish I knew you better to know if you were joking! :)
> >>
> >> This would not be good IMO because you are configuring two different
> >> aspects of the behavior. When I see the same API name with different
> >> parameters, I think that they are the same and that the API just does
> >> conversions.
> >>
> >> We could consider making Quote a class instead of an enum and have it
> >> carry a char and an enum, such that one object defines all quoting
> >> aspects. This might be too normalized a design for something so simple
> >> though.
> >
> > Actually, I did not have a closer look at the API. You're definitely
> > right to use different names for different aspects. It does not make
> > sense to overload just for fun.
> >
> >>
> >>
> >>>
> >>> > Which makes the API more consistent with the other char/Character
> >>> > based properties:
> >>> >
> >>> > - withEscape
> >>> > - withDelimiter
> >>> > - withLineSeparator
> >>> > - withCommentStart
> >>> >
> >>> > none of the above are post-fixed with a "Char" in the name.
> >>> >
> >>> > As far as reading, for me, the "-r" names are OK because they are
> >>> > nouns (things): "a delimiter", "a line separator." But I do not talk
> >>> > about "an escape" because that would be an act (think Alcatraz) as
> >>> > opposed to what we have here: a character used to /perform/ escapes.
> >>> >
> >>> > So I propose to change "escape" to "escape char" because "escaper" is
> >>> > not a word.
> >>> >
> >>> > The name "comment start" is not great either because it implies (to
> >>> > me) that there is a "comment end" missing. So plain "comment" or
> >>> > "comment char" would be better.
> >>>
> >>> Who said it has to be a single char?
> >>>
> >>
> >> The current implementation does. ;)
> >>
> >> Are comments even in any RFC?
> >
> > Not that I am aware of.
> >
> >>> .withEOLComment("//")
> >>>
> >>>
> >>> Same applies to the line separator:
> >>>
> >>> .withLineSeparator("\n\r")
> >>>
> >>> > Circling back to "quote char": I have it the way it is now for the
> >>> > same reason as for the "escape" property.
> >>> >
> >>> > In summary, using *Char names is better IMO.
> >>>
> >>> Only if it can be a single char only. If it can either be a single char
> >>> or a String, I normally tend to use overloaded methods:
> >>>
> >>> - withEOLComment(char)
> >>> - withEOLComment(CharSequence)
> >>>
> >>
> >> If you want to add // to the mix, please start a different thread. I'm
> >> not sure this is really needed. Do you have a real life use case?
> >
> > People come up with all kinds of "solutions" they are used to. CSV is
> > brittle anyway, just because there is no "real" standard.
> >
> > Cheers,
> > Jörg
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
>


-- 
E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
JUnit in Action, 2nd Ed: http://bit.ly/ECvg0
Spring Batch in Action: http://bit.ly/bqpbCK
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory
