[issue24787] csv.Sniffer guesses "M" instead of \t or , as the delimiter

Tiago Wright Wed, 05 Aug 2015 18:41:14 -0700

Tiago Wright added the comment:

I've run the Sniffer against 1614 csv files on my computer and compared the
delimiter it detects to what I have set manually. Here are the results:


            Sniffer
Human          ,   ;   \t   \ (blank) Error  :  )  c  e  M   p Grand Total Error rate
,            498            2       1    10        1                   512       2.7%
;                  1                                                     1       0.0%
\t             3      922           6    91  2  1        2  27        1054      12.5%
|                          33                                           33       0.0%
space                               9     1           4                 14      35.7%
Grand Total  501   1  922  35      16   102  2  1  1  4  2  27        1614

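A comparison like this can be repeated with a short script. The following
is only a sketch, not the script used for the numbers above: the helper
names (sniff_delimiter, tally, error_rate) are made up for illustration,
and it assumes the hand-set delimiter for each sample is already known.
Only csv.Sniffer().sniff(sample, delimiters=...) and csv.Error come from
the standard library.

```python
import csv
from collections import Counter


def sniff_delimiter(sample, candidates=None):
    """Return the delimiter csv.Sniffer detects, or "Error" if it gives up.

    `candidates` (hypothetical parameter name) is passed through as the
    `delimiters` argument, which restricts the characters Sniffer may pick.
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return "Error"


def tally(samples):
    """Count (human, sniffed) pairs.

    `samples` is an iterable of (text, human_delimiter) tuples.
    """
    return Counter((human, sniff_delimiter(text)) for text, human in samples)


def error_rate(counts, human):
    """Fraction of samples with this hand-set delimiter that Sniffer missed."""
    total = sum(n for (h, _), n in counts.items() if h == human)
    wrong = sum(n for (h, s), n in counts.items() if h == human and s != h)
    return wrong / total if total else 0.0
```

Passing candidates=",;\t| " would limit Sniffer to the delimiters set by
hand above, which may be a useful second run to compare against.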
-Tiago

On Tue, Aug 4, 2015 at 3:51 PM R. David Murray <[email protected]>
wrote:

>
> R. David Murray added the comment:
>
> If you look at the algorithm it is doing some fancy things with metrics,
> but does have a 'preferred delimiters' list that it checks.  It is possible
> things could be improved either by tweaking the threshold or by somehow
> giving added weight to the metrics when the candidate character is in the
> preferred delimiter list.
>
> We might have to do this with a feature flag to turn it on, though, since
> it could change the results for programs that happen to work with the
> current algorithm.
>
> ----------
> nosy: +r.david.murray
>
> _______________________________________
> Python tracker <[email protected]>
> <http://bugs.python.org/issue24787>
> _______________________________________
>

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue24787>
_______________________________________
