Hi Sven, my use case was hand-edited CSVs (therefore the quotes are extra clutter), which imples that I would be hand-viewing/editing only small CSVs (no performance issues).
I agree that we should err on the safe side with CR & LF (e.g. tools may sometimes autoconvert line endings). Regarding performance: #findFirstInString:inSet:startingAt: didn't work for me (not sure if bug, or I don't know how to use), so I've trried with inCharacterSet: Tested on worst-case scenario - strings don't contain tested symbols. s10 := 'a' repeat: 10. s100 := 'a' repeat: 100. [ {'"'. ','. String cr. String lf } anySatisfy: [ :each | s10 includesSubstring: each ] ] bench. "'1,200,046 per second'" [ {'"'. ','. String cr. String lf } anySatisfy: [ :each | s100 includesSubstring: each ] ] bench. "'495,482 per second'" [ s10 includesAny: { $,. $". Character cr. Character lf } ] bench. "'2,819,416 per second'" [ s100 includesAny: { $,. $". Character cr. Character lf } ] bench. "'2,200,668 per second'" [ ByteString findFirstInString: s10 inCharacterSet: ',"', String crlf startingAt: 1 ] bench. "'1,187,324 per second'" [ ByteString findFirstInString: s100 inCharacterSet: ',"', String crlf startingAt: 1 ] bench. "'165,526 per second'" #includesAny: seems to be the best by far. Storing the tested characters didn't improve it by much. Peter On Sat, Jul 22, 2017 at 06:51:31PM +0200, Sven Van Caekenberghe wrote: > Hi Peter, > > > On 22 Jul 2017, at 14:12, Peter Uhnak <i.uh...@gmail.com> wrote: > > > > Hi, > > > > this is a continuation of an older thread about quoting fields only when > > necessary. ( > > http://forum.world.st/NeoCSVWriter-automatic-quotes-td4924781.html ) > > > > I've attached fileouts of NeoCSV packages with the addition (I don't know > > if Metacello can file-out only changesets). > > > > The change doesn't affect the default behavior, only when explicitly > > requested. > > > > Peter > > I accepted your changes as such, the .mcz's were copied over to the main > repositories. This is a pure & clean extension, so that is good. Thank you. > > This option is always going to be slower, but the current implementation > might be improved, I think. > > The key test in #writeOptionalQuotedField: > > { > lineEnd asString. > separator asString. > '"' } anySatisfy: [ :each | string includesSubstring: each ] > > will be quite slow. > > If we accept a little bit of (over safe) error on EOL and use any occurrence > of CR or LF as needing a quote, we could switch to characters to test for. > There exists a fast (primitive) test, #findFirstInString:inSet:startingAt: > that can do all the testing in one go. If your version turns out to be slow, > we could try that, if measurements show a difference. > > Regards, > > Sven > >