Thanks for this, Alex! For list members, I am indebted to Alex for his original CSV parsing code which I used, with his permission, in my SQLiteAdmin application.
I will check out this code and see how it compares to the code currently embedded in SQLiteAdmin.

Pete
lcSQL Software <http://www.lcsql.com>

On Mon, May 7, 2012 at 4:30 PM, Alex Tweedly <a...@tweedly.net> wrote:

> Some years ago, this list discussed the difficulties of parsing
> comma-separated-value file format; Richard Gaskin has a great article about
> it at
> http://www.fourthworld.com/embassy/articles/csv-must-die.html
>
> Following that discussion, I came up with some code to parse CSV in
> Livecode which was significantly faster than the straightforward methods
> (quoted in the above article). At the time, I put that speed gain down to
> two factors
>
> 1. a way of looking at the problem "sideways" that enables a different
>    approach
> 2. a 'clever' use of split + array access
>
> Recently the topic came up again, and I looked at the code again; I now
> realize that in fact the speed gain came entirely from the first of those
> two factors, and using split + arrays was not helpful. Livecode's chunk
> handling is (in this case) faster than using arrays (my only excuse is that
> I was new to Livecode, and so I was using techniques I was familiar with
> from other languages). So I revised the code to use chunk handling rather
> than split+arrays, and the resulting code runs about 40% faster, with the
> added benefit of being slightly easier to read and understand. The only
> slightly mind-bending feature of the new code is the use of
>
>     set the lineDelimiter to quote
>     repeat for each line k in pData ....
>
> I find it hard to think about "lines" that aren't actually lines :-)
>
> So - for anyone who needs or wants more speed, here's the code
>
> function CSV3Tab pData,pcoldelim
>>    local tNuData -- contains tabbed copy of data
>>    local tReturnPlaceholder -- replaces cr in field data to avoid line
>>    -- breaks which would be misread as records;
>>    -- replaced later during display
>>    local tEscapedQuotePlaceholder -- used for keeping track of quotes
>>    -- in data
>>    local tInQuotedText -- flag set while reading data between quotes
>>    local tInsideQuoted, k
>>    --
>>    put numtochar(11) into tReturnPlaceholder -- vertical tab as
>>    -- placeholder
>>    put numtochar(2) into tEscapedQuotePlaceholder -- used to simplify
>>    -- distinction between quotes in data and those
>>    -- used in delimiters
>>    --
>>    if pcoldelim is empty then put comma into pcoldelim
>>    -- Normalize line endings:
>>    replace crlf with cr in pData -- Win to UNIX
>>    replace numtochar(13) with cr in pData -- Mac to UNIX
>>    --
>>    -- Put placeholder in escaped quote (non-delimiter) chars:
>>    replace ("\" & quote) with tEscapedQuotePlaceholder in pData
>>    replace quote & quote with tEscapedQuotePlaceholder in pData
>>    --
>>    put space before pData -- to avoid ambiguity of starting context
>>    put False into tInsideQuoted
>>    set the linedel to quote
>>    repeat for each line k in pData
>>       if (tInsideQuoted) then
>>          replace cr with tReturnPlaceholder in k
>>          put k after tNuData
>>          put False into tInsideQuoted
>>       else
>>          replace pcoldelim with numtochar(29) in k
>>          put k after tNuData
>>          put true into tInsideQuoted
>>       end if
>>    end repeat
>>    --
>>    delete char 1 of tNuData -- remove the leading space
>>    replace tEscapedQuotePlaceholder with quote in tNuData
>>    return tNuData
>> end CSV3Tab
>>
>
> -- Alex.
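
For anyone who wants to try the function quoted above, here is a minimal usage sketch. It is only an illustration under a few assumptions: CSV3Tab is available in the message path, and the handler name, the sample data and the variables tCSV, tParsed, tRecord, tField, tValue and tOut are all made up for this example. It relies on two things visible in the quoted code: columns come back separated by numToChar(29), and returns inside quoted fields come back as numToChar(11).

```livecode
-- Hypothetical caller; every name here except CSV3Tab is illustrative only.
on mouseUp
   local tCSV, tParsed, tRecord, tField, tValue, tOut
   -- Two records; the second starts with a quoted field that contains a comma.
   put "name,comment" & cr & quote & "Smith, J" & quote & ",ok" into tCSV
   put CSV3Tab(tCSV, comma) into tParsed
   -- CSV3Tab marks column breaks with numToChar(29), so read items on that.
   set the itemDelimiter to numToChar(29)
   repeat for each line tRecord in tParsed
      repeat for each item tField in tRecord
         -- Work on a copy, and restore any returns that were held as
         -- numToChar(11) inside quoted fields.
         put tField into tValue
         replace numToChar(11) with cr in tValue
         put "[" & tValue & "]" & space after tOut
      end repeat
      put cr after tOut
   end repeat
   put tOut -- shows the bracketed fields in the message box
end mouseUp
```

With this sample data the second record should come out as two fields, "Smith, J" and "ok", because the comma inside the quotes is never treated as a column break. If a genuinely tab-delimited result is wanted, one option is to replace numToChar(29) with tab in tParsed after the call.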
_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode