How about we add mb_fgetcsv(), which would have full multi-byte support
(including delimiters). I'd imagine for people who need to parse multi-byte
csv files, full functionality is more important than speed. As for the
fgetcsv() in ext/standard/, we can port the 4.3.X code (copy & paste really)
and let PHP 5 users benefit from a faster fgetcsv() for common applications.
What do you think?
I disagree, for the following reasons:
1) Quite a few people *actually* use fgetcsv() with multibyte characters.
   Applications written by people who don't use such characters don't (and
   won't) call multibyte-specific functions, and that's the problem: it
   greatly prevents those applications from being portable.
2) IMO speed is not the key factor here. People would rather have
   trustworthy behaviour.
3) The fgetcsv() implementation in the stable branch is now too complicated
   to add a new feature to, and it is also hard to maintain.

We should be able to eliminate the mblen() calls to get acceptable
performance. See the attached result.
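To illustrate point 1, here is a minimal sketch (not the actual ext/standard
code) of why a byte-wise scan can misparse multibyte input. The names
charlen_fn, single_byte_len, sjis_len and count_fields are hypothetical, and
the Shift_JIS lead-byte rule is a simplified assumption that stands in for
what mblen() would report under a Shift_JIS locale.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: number of bytes occupied by the character that
 * starts at p (always at least 1). */
typedef size_t (*charlen_fn)(const unsigned char *p, size_t remaining);

/* Treats every byte as one character, like a php_mblen() that always
 * evaluates to 1. */
static size_t single_byte_len(const unsigned char *p, size_t remaining)
{
    (void)p;
    (void)remaining;
    return 1;
}

/* Simplified Shift_JIS rule (assumption): bytes 0x81-0x9F and 0xE0-0xFC
 * start a two-byte character; anything else is a single byte. */
static size_t sjis_len(const unsigned char *p, size_t remaining)
{
    int lead = (*p >= 0x81 && *p <= 0x9F) || (*p >= 0xE0 && *p <= 0xFC);
    return (lead && remaining >= 2) ? 2 : 1;
}

/* Counts fields by stepping one character at a time; only single-byte
 * characters are compared with the delimiter. */
static size_t count_fields(const char *line, char delim, charlen_fn len_of)
{
    const unsigned char *p = (const unsigned char *)line;
    size_t remaining = strlen(line);
    size_t fields = 1;

    while (remaining > 0) {
        size_t n = len_of(p, remaining);
        if (n == 1 && *p == (unsigned char)delim) {
            fields++;
        }
        p += n;
        remaining -= n;
    }
    return fields;
}
```

With '|' as the delimiter and the line "a|" "\x83\x7C" "|b" (the middle two
bytes form one Shift_JIS character whose second byte is 0x7C, the same value
as '|'), count_fields() returns 4 with single_byte_len but 3 with sjis_len.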
Moriyoshi
P.S. fgetcsv() in the stable branch still seems to segfault with the attached test case (segfault.php.txt).
[The benchmark result]

My code with mblen() (on php5-csv):
real    0m1.389s
user    0m1.330s
sys     0m0.060s

Ditto without mblen():
real    0m0.396s
user    0m0.350s
sys     0m0.040s

Your code (on php4-csv):
real    0m0.332s
user    0m0.270s
sys     0m0.060s

<?php
$file = '/tmp/test.csv';

// Write a single CSV line with four long fields.
$fp = fopen($file, 'w');
fwrite($fp, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb, cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc,ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd\n");
fclose($fp);

// Re-open and parse the same line 4000 times.
for ($i = 0; $i < 4000; $i++) {
    $fp = fopen($file, 'r');
    fseek($fp, 0, SEEK_SET); // offset first, then whence
    fgetcsv($fp, filesize($file));
    fclose($fp);
}
?>

Index: ext/standard/php_string.h
===================================================================
RCS file: /repository/php-src/ext/standard/php_string.h,v
retrieving revision 1.83
diff -u -r1.83 php_string.h
--- ext/standard/php_string.h 10 Dec 2003 21:23:35 -0000 1.83
+++ ext/standard/php_string.h 12 Dec 2003 21:16:09 -0000
@@ -144,15 +144,7 @@
 #define strerror php_strerror
 #endif
 
-#ifndef HAVE_MBLEN
-# define php_mblen(ptr, len) 1
-#else
-# if defined(_REENTRANT) && defined(HAVE_MBRLEN) && defined(HAVE_MBSTATE_T)
-# define php_mblen(ptr, len) ((ptr) == NULL ? mbsinit(&BG(mblen_state)): (int)mbrlen(ptr, len, &BG(mblen_state)))
-# else
-# define php_mblen(ptr, len) mblen(ptr, len)
-# endif
-#endif
+#define php_mblen(ptr, len) 1
 
 void register_string_constants(INIT_FUNC_ARGS);
 

<?php
$file = '/tmp/test.csv';

// Write a CSV line whose last field is quoted and contains an escaped quote.
$fp = fopen($file, 'w');
fwrite($fp, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb, cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc, \"ddddddddddddd\\\"dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd\"\n");
fclose($fp);

// Parse the line once.
$fp = fopen($file, 'r');
fseek($fp, 0, SEEK_SET); // offset first, then whence
fgetcsv($fp, filesize($file));
fclose($fp);
?>

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php