Antoon Pardon added the comment: I had a look at this and have the following remarks.
1) The file csv_sniffing_excel_tab.py no longer works with Python 3.3. It now produces the following traceback:

    Traceback (most recent call last):
      File "csv_sniffing_excel_tab.py", line 36, in <module>
        create_file()
      File "csv_sniffing_excel_tab.py", line 23, in create_file
        writer.writerows(test_data)
    TypeError: 'str' does not support the buffer interface

2) The problem seems to be in the _guess_quote_and_delimiter method. If you always call _guess_delimiter, the sniffer gives the correct result.

3) As far as I understand, the problem is the first regular expression:

    (?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?P=delim)

Now suppose we have a line like the following:

    273:MVREGR1:ByEuPo:"Baryton ""Euphonium"" populaire"

The delim group will match the space after Baryton, the space group will match nothing, and the quote group will match ". The ungrouped .*? part will then match "Euphonium" (quotes included), followed by the quote backreference matching " again and the delim backreference matching the space before populaire. And so we get the wrong delimiter.

----------
nosy: +Antoon.Pardon

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17829>
_______________________________________
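
The match described in point 3 can be reproduced outside the csv module with a few lines of Python. This is only a minimal sketch: the pattern and the sample line are copied from the comment above, while the names FIRST_PATTERN, line and match are made up for illustration, and the re.DOTALL | re.MULTILINE flags are an assumption about how the sniffer compiles its patterns (they do not change the outcome for a single line).

```python
import re

# First regular expression quoted in point 3.
# Assumption: the flags mirror what the sniffer uses; for this
# one-line input they make no difference to the result.
FIRST_PATTERN = re.compile(
    r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?P=delim)',
    re.DOTALL | re.MULTILINE,
)

# Sample line from the report; the real delimiter is ':'.
line = '273:MVREGR1:ByEuPo:"Baryton ""Euphonium"" populaire"'

match = FIRST_PATTERN.search(line)

# The colon before the quoted field cannot match, because no later
# quote is followed by ':'. The first successful match therefore
# starts at the space after "Baryton".
print(repr(match.group("delim")))  # ' '
print(repr(match.group("quote")))  # '"'
print(repr(match.group(0)))        # ' ""Euphonium"" '
```

The whole match is the text between Baryton and populaire, so the pattern reports a space as the delimiter even though every field on the line is separated by a colon.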