Re: Odd csv column-name truncation with only one column

Hans Mulder Fri, 20 Jul 2012 10:03:11 -0700

On 19/07/12 23:10:04, Dennis Lee Bieber wrote:
> On Thu, 19 Jul 2012 13:01:37 -0500, Tim Chase
> <[email protected]> declaimed the following in
> gmane.comp.python.general:
> 
>>  It just seems unfortunate that the sniffer would ever consider
>> [a-zA-Z0-9] as a valid delimiter.


+1

>       I'd suspect the sniffer logic does not do any special casing
> -- any /byte value/ is a candidate for the delimiter.

The sniffer prefers [',', '\t', ';', ' ', ':'] (in that order).
If none of those is found, it goes to the other extreme and considers
all characters equally likely.

> This would allow for usage of some old ASCII control characters --
> things like  x1F (unit separator)

If the Sniffer excluded [a-zA-Z0-9] (or all alphanumerics) as
potential delimiters, then control characters such as "\x1F" would
still be possible.
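
For what it's worth, you can get that behaviour today by passing
your own candidate set through the optional "delimiters" argument of
Sniffer.sniff().  A minimal sketch; the sample data and the
"candidates" name are made up for illustration:

```python
import csv
import string

# Assumed candidate set: every ASCII character except letters and digits.
alphanumerics = string.ascii_letters + string.digits
candidates = "".join(chr(i) for i in range(128) if chr(i) not in alphanumerics)

# Made-up sample that uses the ASCII unit separator between fields.
sample = "alpha\x1fbeta\x1fgamma\ndelta\x1fepsilon\x1fzeta\n"

# Only characters listed in `delimiters` are considered by the sniffer.
dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
print(repr(dialect.delimiter))
```

With letters and digits out of the running, the only character that
occurs the same number of times on every line is "\x1F".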

> {Next is to rig the sniffer to identify x1F for fields, and x1E
> for records <G>}

The sniffer will always guess '\r\n' as the line terminator.

That should not stop you from creating a dialect with '\x1E' as
the line terminator.  Just don't expect the sniffer to recognize
that dialect.
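
A rough sketch of such a dialect, assuming you only need it for a
small in-memory round trip.  The dialect name "ascii_sep" and the
rows are invented for the example.  Note that the line terminator
only affects the writer: the csv reader always splits records on
'\r' or '\n', so when reading I split on '\x1E' by hand and give
the reader one record at a time.

```python
import csv
import io

# Hypothetical dialect name; x1F separates fields, x1E terminates records.
csv.register_dialect("ascii_sep", delimiter="\x1f", lineterminator="\x1e")

rows = [["name", "age"], ["alice", "30"]]

# Writing honours the dialect's lineterminator.
buf = io.StringIO(newline="")
csv.writer(buf, dialect="ascii_sep").writerows(rows)
data = buf.getvalue()

# The reader ignores lineterminator, so split the records ourselves
# and let csv.reader handle only the field splitting.
records = [rec for rec in data.split("\x1e") if rec]
parsed = list(csv.reader(records, dialect="ascii_sep"))
print(parsed)
```

This naive split breaks if a field itself contains "\x1E", so treat
it as an illustration rather than a robust parser.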

-- HansM


-- 
http://mail.python.org/mailman/listinfo/python-list
