> On 21 Nov 2014, at 02:14, Christoph Becker <cmbecke...@gmx.de> wrote:
> 
> Tjerk Meesters wrote:
> 
>>> On 20 Nov 2014, at 00:26, Christoph Becker <cmbecke...@gmx.de> wrote:
>>> 
>>> Are you aware of <https://bugs.php.net/bug.php?id=50686>?  It seems this
>>> very inconsistency has been reported a few years ago, but has been
>>> tagged as "Wont fix" back then.
>> 
>> Actually that bug report seems to suggest that fputcsv() uses backslash to 
>> encode enclosure characters, but AFAICT it doesn’t.
> 
> Apparently, there is a somewhat hidden bug, see <http://3v4l.org/El5Xs>
> for a simplified test script.  The expected result is
> 
>  string(14) ""a""b","a\""b""
> 
> or maybe
> 
>  string(14) ""a\"b","a\\"b""
> 
> The actual result makes no sense to me, even though str_getcsv() parses
> it "correctly".

That works, but for exactly the wrong reasons:
1) upon seeing an escape character, fgetcsv() passes it and the following 
character through unchanged
2) fputcsv() actually accepts an escape character too (despite what the 
documentation says) but handles it the wrong way: it leaves the escape 
character and the character following it unescaped
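
To illustrate 1), here is a rough userland sketch of the reader behaviour as I describe it above. It is not the actual C implementation, read_enclosed_field() is a made-up name, and it only covers a single enclosed field with the default enclosure and a distinct escape character:

```php
<?php
// Hypothetical sketch of point 1, NOT PHP's real parser: decode one
// enclosed field, copying an escape character and the character after
// it to the output verbatim.
function read_enclosed_field($raw, $enclosure = '"', $escape = '\\')
{
    $out = '';
    $len = strlen($raw);
    // Index 0 is assumed to be the opening enclosure, so start at 1.
    for ($i = 1; $i < $len; $i++) {
        $c = $raw[$i];
        if ($c === $escape && $i + 1 < $len) {
            // Escape character: keep it AND the next character as-is.
            $out .= $c . $raw[++$i];
        } elseif ($c === $enclosure) {
            if ($i + 1 < $len && $raw[$i + 1] === $enclosure) {
                // A doubled enclosure collapses to a single one.
                $out .= $enclosure;
                $i++;
            } else {
                break; // closing enclosure
            }
        } else {
            $out .= $c;
        }
    }
    return $out;
}

var_dump(read_enclosed_field('"a\\"b"')); // string(4) "a\"b"
var_dump(read_enclosed_field('"a""b"'));  // string(3) "a"b"
```

So if the writer leaves the sequence \" untouched, as in 2), the reader hands back the original value, even though neither side ever really escaped anything.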

The expected output, based on the given code, should (imo) be:

string(15) ""a\"b","a\\\"b""

Or: if the escape character is a double quote:

string(15) ""a""b","a\""b""
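
To make the two candidates concrete, here is a minimal sketch in plain PHP. csv_field() and csv_row() are made-up helpers, not part of PHP, and this is not what fputcsv() does today. I'm also assuming the two test fields are a"b and a\"b, which is what the expected strings imply:

```php
<?php
// Hypothetical helper (not a PHP built-in): encode one field under
// either convention, depending on whether escape equals enclosure.
function csv_field($value, $enclosure = '"', $escape = '\\')
{
    if ($escape === $enclosure) {
        // Doubling convention: only the enclosure is special.
        $value = str_replace($enclosure, $enclosure . $enclosure, $value);
    } else {
        // Escape convention: escape the escape character first,
        // then the enclosure, so nothing gets escaped twice.
        $value = str_replace($escape, $escape . $escape, $value);
        $value = str_replace($enclosure, $escape . $enclosure, $value);
    }
    return $enclosure . $value . $enclosure;
}

// Hypothetical helper: join encoded fields with commas (no line ending).
function csv_row($fields, $enclosure = '"', $escape = '\\')
{
    $out = array();
    foreach ($fields as $field) {
        $out[] = csv_field($field, $enclosure, $escape);
    }
    return implode(',', $out);
}

$row = array('a"b', 'a\\"b'); // assumed input; second field is a, \, ", b

echo csv_row($row, '"', '\\'), "\n"; // "a\"b","a\\\"b"
echo csv_row($row, '"', '"'), "\n";  // "a""b","a\""b"
```

Either variant is unambiguous on its own; the trouble only starts when the writer follows one convention and the reader the other.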

Unfortunately I can’t satisfy all the related bug reports; a decision on 
“correctness” needs to be made in the form of an RFC.

> 
>> And then there are bug reports like https://bugs.php.net/bug.php?id=67566 
>> which were fixed but really just made the situation worse =(
> 
> ACK.
> 
>>> <https://bugs.php.net/bug.php?id=38929> also seems to deal with this
>>> inconsistency, and had been tagged as "Not a bug".
>>> 
>>> So maybe an RFC is appropriate?
>> 
>> Yeah, I didn’t realise the can of worms until I opened it; I’ll round up all 
>> the bug reports and run them against whatever RFC I can get my hands on.
>> 
>> PS: Favourite quote from the semi-authoritative spec of Perl_CSV: 
>> http://rath.ca/Misc/Perl_CSV/CSV-2.0.html#csv:
>> 
>>> Given that the essence of CSV files is simplicity, I have decided to reject 
>>> all escape and escaped characters with the exception of quotation marks 
>>> appearing within quotation marks …
>> 
>> Good times :)
> 
> One might argue that the essence of CSV files is being a data exchange
> format, so applying Postel's law would be reasonable. :)
> 
> -- 
> Christoph M. Becker
> 


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
