John Machin wrote:
> John Machin wrote:
> > Fredrik Lundh wrote:
> > > robert wrote:
> > >
> > > > What is a most simple expression for splitting a CSV line
> > > > with "-protected fields?
> > > >
> > > > s='"123","a,b,\"c\"",5.640'
> > >
> > > import csv
> > >
> > > the preferred way is to read the file using that module. if you insist
> > > on processing a single line, you can do
> > >
> > > cols = list(csv.reader([string]))
> > >
> > > </F>
> >
> > Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit
> > (Intel)] on win32
> > | >>> import csv
> > | >>> s='"123","a,b,\"c\"",5.640'
> > | >>> cols = list(csv.reader([s]))
> > | >>> cols
> > [['123', 'a,b,c""', '5.640']]
> > # maybe we need a bit more:
> > | >>> cols = list(csv.reader([s]))[0]
> > | >>> cols
> > ['123', 'a,b,c""', '5.640']
> >
> > I'd guess that the OP is expecting 'a,b,"c"' for the second field.
> >
> > Twiddling with the knobs doesn't appear to help:
> >
> > | >>> list(csv.reader([s], escapechar='\\'))[0]
> > ['123', 'a,b,c""', '5.640']
> > | >>> list(csv.reader([s], escapechar='\\', doublequote=False))[0]
> > ['123', 'a,b,c""', '5.640']
> >
> > Looks like a bug to me; AFAICT from the docs, the last attempt should
> > have worked.
>
> Given Peter Otten's post, looks like
> (1) there's a bug in the "fmtparam" mechanism -- it's ignoring the
> escapechar in my first twiddle, which should give the same result as
> Peter's.
> (2)
> | >>> csv.excel.doublequote
> True
> According to my reading of the docs:
> """
> doublequote
> Controls how instances of quotechar appearing inside a field should be
> themselves be quoted. When True, the character is doubled. When False,
> the escapechar is used as a prefix to the quotechar. It defaults to
> True.
> """
> Peter's example should not have worked.

Doh. The OP's string was a raw string. I need some sleep.
Scrap bug #1!

| >>> s=r'"123","a,b,\"c\"",5.640'
| >>> list(csv.reader([s]))[0]
['123', 'a,b,\\c\\""', '5.640']
# What's that???
| >>> list(csv.reader([s], escapechar='\\'))[0]
['123', 'a,b,"c"', '5.640']
| >>> list(csv.reader([s], escapechar='\\', doublequote=False))[0]
['123', 'a,b,"c"', '5.640']

And there's still the problem with doublequote ....

Goodnight ...
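
For anyone replaying this later, here is a minimal sketch of the raw-string mix-up, assuming a current Python 3 interpreter rather than the 2.5 session quoted above. The helper name parse_line is hypothetical and not part of the csv module. It only shows that the non-raw literal never contains a backslash, so escapechar has nothing to act on, while the raw literal does. It does not settle the doublequote question.

```python
import csv


def parse_line(line, **fmtparams):
    """Hypothetical helper: parse a single CSV line and return its fields."""
    return next(csv.reader([line], **fmtparams))


# Non-raw literal: Python itself turns \" into a bare " before csv sees it.
plain = '"123","a,b,\"c\"",5.640'
# Raw literal: the two backslashes survive and reach the csv reader.
raw = r'"123","a,b,\"c\"",5.640'

print('\\' in plain)   # no backslash left for escapechar to match
print('\\' in raw)     # backslashes are present

# Default dialect, no escapechar: the backslashes are treated as ordinary data.
print(parse_line(raw))

# With escapechar set, the second field comes back as a,b,"c"
# (the same result as in the session above), with or without doublequote.
print(parse_line(raw, escapechar='\\'))
print(parse_line(raw, escapechar='\\', doublequote=False))
```

On this input the last two calls return the same fields, which matches the observation above that changing doublequote made no visible difference here.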