On Tue, Oct 16, 2012 at 11:43 AM, sebb <seb...@gmail.com> wrote: > On 16 October 2012 16:34, Gary Gregory <garydgreg...@gmail.com> wrote: > > On Tue, Oct 16, 2012 at 11:29 AM, Gary Gregory <garydgreg...@gmail.com > >wrote: > > > >> On Tue, Oct 16, 2012 at 11:04 AM, Matt Benson <gudnabr...@gmail.com > >wrote: > >> > >>> Random thoughts--no real context here, so no way to inline: > >>> > >>> - "line separator" concept, while harmonizing with the line.separator > >>> system property, might be better represented as "row separator" so as > >>> not to imply that the parameter should be in any way limited to \r or > >>> \n . I would think the default for this would be the line.separator > >>> property, however, and thus should take a String or CharSequence > >>> (perhaps it already does, but there's been so much talk about char > >>> parameters...). > >>> > >> > >> Now that you mention it, this should have been obvious as soon as we > wrote > >> the test cases where a record is split over more than one line. > >> > >> There is a difference between line number and record number which the > API > >> tracks. > >> > >> I propose to change "line separator" to "record separator".
Folks seem to like this one, it is now in SVN. > The default > >> can be line.separator. > I did not do this one as is it seems RFC4180 defines CR+LF as the record separator as noted in the Javadoc for org.apache.commons.csv.CSVFormat.DEFAULT. Gary > >> > >> Gary > >> > >> > >>> > >>> - with* methods: just something to think about here, but while we're > >>> creating a fluent API, would e.g. #delimitedBy('\t') read more > >>> fluently than #withDelimiter('\t') ? #escapingWith('\\') vs. > >>> #withEscape('\\') ? > >>> > >> > > I find that the combination of the fluent API style AND immutability of > the > > format class ugly because of the PRISTINE & DISABLED internal crud. > > > > Why not just have DEFAULT and dump PRISTINE? Other formats should be > based > > on DEFAULT. > > No, because DEFAULT includes several settings that may not be required. > > > With PRISTINE, the door is open for a future format to not override > > DISABLED and create a bug, as unlikely as it is. > > With DEFAULT, the door is *already* open for bugs due to failure to > reset the unwanted settings. > > It's not possible currently to create an instance without overriding > the DISABLED delimiter. > > > Gary > > > > > > > > > >>> $0.02, > >>> Matt > >>> > >>> On Tue, Oct 16, 2012 at 8:53 AM, Jörg Schaible > >>> <joerg.schai...@scalaris.com> wrote: > >>> > Gary Gregory wrote: > >>> > > >>> >> On Tue, Oct 16, 2012 at 9:14 AM, Jörg Schaible > >>> >> <joerg.schai...@scalaris.com>wrote: > >>> >> > >>> >>> Hi Gary, > >>> >>> > >>> >>> Gary Gregory wrote: > >>> >>> > >>> >>> > Hi All: > >>> >>> > > >>> >>> > The format object can configure various aspects of input and > output > >>> >>> > formatting. > >>> >>> > > >>> >>> > With my recent addition of the Quote enum for [CSV-53], there are > >>> now > >>> >>> > two aspects of quoting to configure: the quote character and the > >>> quote > >>> >>> > policy (minimal, all, non-numeric, and none.) FYI, 'none' is > >>> currently > >>> >>> > not implemented. > >>> >>> > > >>> >>> > First, I changed (without consulting this list, and please > accept my > >>> >>> > apologies for this) the - IMO - cryptic and burdensome > terminology > >>> of > >>> >>> > "encapsulator" to "quote char", and added "quote policy": > >>> >>> > > >>> >>> > - withQuoteChar(char) > >>> >>> > - withQuotePolicy(Quote) > >>> >>> > > >>> >>> > My intention here is that all Quote APIs start with "withQuote" > >>> >>> > followed by what aspect of quoting is being configured. > >>> >>> > > >>> >>> > Alternatively, we could have: > >>> >>> > > >>> >>> > - withQuote(char) > >>> >>> > - withQuotePolicy(Quote) > >>> >>> > >>> >>> or > >>> >>> > >>> >>> - withQuote(char) > >>> >>> - withQuote(Quote) > >>> >>> > >>> >>> ;-) > >>> >>> > >>> >> > >>> >> Darn, I wish I knew you better to know if you were joking! :) > >>> >> > >>> >> This would not be good IMO because you are configuring two different > >>> >> aspects of the behavior. When I see the same API name with different > >>> >> parameters, I think that they are the same and that the API just > does > >>> >> conversions. > >>> >> > >>> >> We could consider making Quote a class instead of an enum and have > it > >>> >> carry a char and an enum, such that one object defines all quoting > >>> >> aspects. This might be too normalized a design for something so > simple > >>> >> though. > >>> > > >>> > Actually I did not had a closer look to the API. You're definitely > >>> right to > >>> > use different names for different aspects. It does not make sense to > >>> > overload just for fun. > >>> > > >>> >> > >>> >> > >>> >>> > >>> >>> > Which makes the API more consistent with the other char/Character > >>> based > >>> >>> > properties: > >>> >>> > > >>> >>> > - withEscape > >>> >>> > - withDelimiter > >>> >>> > - withLineSeparator > >>> >>> > - withCommentStart > >>> >>> > > >>> >>> > none of the above are post-fixed with a "Char" in the name. > >>> >>> > > >>> >>> > As far as reading, for me, the "-r" names are OK because the they > >>> are > >>> >>> > nouns (things): "a delimiter", "a line separator." But I do not > talk > >>> >>> about > >>> >>> > "an escape" because that would be an act (think Alcatraz) as > >>> opposed to > >>> >>> > what we have here: a character used to /perform/ escapes. > >>> >>> > > >>> >>> > So I propose to change "escape" to "escape char" because > "escaper" > >>> is > >>> >>> > not a word. > >>> >>> > > >>> >>> > The name "comment start" is not great also because it implies (to > >>> me) > >>> >>> that > >>> >>> > there is a "comment end" missing. So plain "comment" or "comment > >>> char" > >>> >>> > would be better. > >>> >>> > >>> >>> Who said it has to be a single char? > >>> >>> > >>> >> > >>> >> The current implementation does. ;) > >>> >> > >>> >> Are comments even in any RFC? > >>> > > >>> > Not that I am aware of. > >>> > > >>> >>> .withEOLComment("//") > >>> >>> > >>> >>> > >>> >>> Same applies to the line separator: > >>> >>> > >>> >>> .withLineSeparator("\n\r") > >>> >>> > >>> >>> > Circling back to "quote char" which I have the way it is now for > the > >>> >>> > same reason as for the "escape" property. > >>> >>> > > >>> >>> > In summary, using *Char names is better IMO. > >>> >>> > >>> >>> Only if it can be a single char only. If it can either be a single > >>> char > >>> >>> or a > >>> >>> String, I normally tend to use overloaded methods: > >>> >>> > >>> >>> - withEOLComment(char) > >>> >>> - withEOLComment(CharSequence) > >>> >>> > >>> >> > >>> >> If you want to add // to the mix, please start a different thread. > I'm > >>> not > >>> >> sure this is really needed. Do you have a real life use case? > >>> > > >>> > People come up with all kind of "solutions" they are used to. CSV is > >>> brittle > >>> > anyway, just because there is no "real" standard. > >>> > > >>> > Cheers, > >>> > Jörg > >>> > > >>> > > >>> > --------------------------------------------------------------------- > >>> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > >>> > For additional commands, e-mail: dev-h...@commons.apache.org > >>> > > >>> > >>> --------------------------------------------------------------------- > >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > >>> For additional commands, e-mail: dev-h...@commons.apache.org > >>> > >>> > >> > >> > >> -- > >> E-Mail: garydgreg...@gmail.com | ggreg...@apache.org > >> JUnit in Action, 2nd Ed: <http://goog_1249600977>http://bit.ly/ECvg0 > >> Spring Batch in Action: <http://s.apache.org/HOq>http://bit.ly/bqpbCK > >> Blog: http://garygregory.wordpress.com > >> Home: http://garygregory.com/ > >> Tweet! http://twitter.com/GaryGregory > >> > > > > > > > > -- > > E-Mail: garydgreg...@gmail.com | ggreg...@apache.org > > JUnit in Action, 2nd Ed: <http://goog_1249600977>http://bit.ly/ECvg0 > > Spring Batch in Action: <http://s.apache.org/HOq>http://bit.ly/bqpbCK > > Blog: http://garygregory.wordpress.com > > Home: http://garygregory.com/ > > Tweet! http://twitter.com/GaryGregory > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- E-Mail: garydgreg...@gmail.com | ggreg...@apache.org JUnit in Action, 2nd Ed: <http://goog_1249600977>http://bit.ly/ECvg0 Spring Batch in Action: <http://s.apache.org/HOq>http://bit.ly/bqpbCK Blog: http://garygregory.wordpress.com Home: http://garygregory.com/ Tweet! http://twitter.com/GaryGregory