[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-13 Thread Antoine Pitrou
Antoine Pitrou added the comment: Committed in r77461 (trunk), r77462 (py3k). Thank you very much! -- resolution: -> fixed status: open -> closed

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-05 Thread Florent Xicluna
Florent Xicluna added the comment: And the Py3k patch. (Note: the previous update, v4b -> v4c, minimizes the differences between the Py2 and Py3 implementations.) -- versions: +Python 3.2 Added file: http://bugs.python.org/file15750/stringlib_split_replace_py3k.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-05 Thread Florent Xicluna
Florent Xicluna added the comment: Slight update:
* Objects/unicodeobject.c
  - moved STRINGLIB_ISLINEBREAK to unicodedefs.h
  - removed FROM_UNICODE: use STRINGLIB_IS_UNICODE instead
* Objects/stringlib/find.h
  - use STRINGLIB_WANT_CONTAINS_OBJ in find.h (similar to current py3k impl

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-05 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15737/stringlib_split_replace_v4b.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: This looks generally good. Can you produce a separate patch for py3k? stringobject.c has been replaced with bytesobject.c there.

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15741/issue7622_test_splitlines.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Changes by Florent Xicluna: Added file: http://bugs.python.org/file15744/issue7622_test_splitlines.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment: The test case for the previous issue. -- Added file: http://bugs.python.org/file15741/issue7622_test_splitlines.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15735/stringlib_split_replace_v4.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment: Fixed a problem with the splitlines optimization: use PyList_Append instead of PyList_SET_ITEM because there's no preallocated list in this case. -- Added file: http://bugs.python.org/file15737/stringlib_split_replace_v4b.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Eric Smith wrote:
> Eric Smith added the comment:
>
> I think we should use whatever style is currently being used in the code.
> If we want to go back through this code (or any other code) and PEP7-ify
> it, that should be a separate task

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Eric Smith
Eric Smith added the comment: I think we should use whatever style is currently being used in the code. If we want to go back through this code (or any other code) and PEP7-ify it, that should be a separate task. Alternately, we could PEP7-ify it first, then apply these changes.

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Florent Xicluna wrote:
> Florent Xicluna added the comment:
>
>> * function declarations should not put parameters on new lines:
>>
>> +stringlib_splitlines(
>> +PyObject* str_obj, const STRINGLIB_CHAR* str, Py_ssize_t str_len,

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: I don't think you should remove such blocks:
-/*
- Local variables:
- c-basic-offset: 4
- indent-tabs-mode: nil
- End:
-*/
There probably are people relying on them :-)

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment: And now, the figures. There's no gain for the string methods. Some unicode methods are faster: split/rsplit/replace.
Most significant results:
--- bench_slow.log Trunk
+++ bench_fast.log Patched
string unicode (ms) (ms) comment == late m
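For readers who want to take comparable measurements themselves, a rough timeit sketch is given here. It does not reproduce the figures from this message: the sample string, the repeat counts, and the `time_method` helper are all assumptions made for illustration.

```python
import timeit


def time_method(stmt: str, setup: str, number: int = 10000) -> float:
    """Return the best-of-3 time in milliseconds for `number` runs of `stmt`."""
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=number))
    return best * 1000.0


# Assumed sample data; the real benchmark used its own inputs.
setup = "s = 'spam and eggs ' * 100"
for stmt in ("s.split()", "s.rsplit(None, 5)", "s.replace('eggs', 'ham')"):
    print("%-28s %8.3f ms" % (stmt, time_method(stmt, setup)))
```

Running the same script against an unpatched and a patched build would give one column per build, which is roughly the shape of the comparison described above.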

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15734/stringlib_split_replace_v3c.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment: Patch updated:
* coding style
* added macro BLOOM_ADD to unicodeobject.c and fastsearch.h (and removed LONG_BITMASK)
-- Added file: http://bugs.python.org/file15735/stringlib_split_replace_v4.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> I copied the style of "stringlib/partition.h" for this part.
> Should I update style of "partition.h" too?
No, it's ok for stringlib to have its own consistent style and there's no reason to change it IMO. More interesting would be benchmark results showing

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment:
> * function declarations should not put parameters on new lines:
>
> +stringlib_splitlines(
> +PyObject* str_obj, const STRINGLIB_CHAR* str, Py_ssize_t str_len,
> +int keepends
> +)
> +{
I copied the style of "stringlib/partition.h" for this

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment:
> A few comments on coding style:
Thank you for your remarks. I will update the patch accordingly.
> * make sure that the name of a symbol matches the value, e.g.
>
> #define LONG_BITMASK (LONG_BIT-1)
> #define BLOOM(mask, ch) ((mask & (1 << ((ch) & LONG_
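The BLOOM macro under discussion is a one-word Bloom filter: each character sets a single bit chosen by its low bits, so a clear bit proves a character is absent while a set bit only means it may be present. A minimal Python sketch of that idea follows. It is not the patch's C code: `BLOOM_WIDTH` is an assumed stand-in for the platform's LONG_BIT, and the function names are hypothetical.

```python
BLOOM_WIDTH = 32  # assumption: stands in for the platform's LONG_BIT


def bloom_add(mask: int, ch: str) -> int:
    """Set the bit selected by the low bits of the character's code point."""
    return mask | (1 << (ord(ch) & (BLOOM_WIDTH - 1)))


def bloom_maybe_contains(mask: int, ch: str) -> bool:
    """False means definitely absent; True means possibly present."""
    return bool(mask & (1 << (ord(ch) & (BLOOM_WIDTH - 1))))


def make_mask(chars: str) -> int:
    """Build a mask covering every character in `chars`."""
    mask = 0
    for ch in chars:
        mask = bloom_add(mask, ch)
    return mask


mask = make_mask("\n\r")
print(bloom_maybe_contains(mask, "\n"))  # True: the bit for "\n" was set
print(bloom_maybe_contains(mask, "a"))   # False: "a" maps to a clear bit
print(bloom_maybe_contains(mask, "*"))   # True: false positive, "*" shares a bit with "\n"
```

The last line shows why a hit must still be confirmed with an exact comparison: the mask is only a cheap pre-filter.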

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Eric Smith
Changes by Eric Smith: nosy: +eric.smith

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: A few comments on coding style:
* please keep the existing argument formats as they are, e.g.
  count = countstring(self_s, self_len, from_s, from_len, 0, self_len, FORWARD, maxcount);
  or
  /* helper

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15732/stringlib_split_replace_v3b.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment: Refleak fixed in PyUnicode_Splitlines. -- stage: -> patch review Added file: http://bugs.python.org/file15734/stringlib_split_replace_v3c.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-04 Thread Florent Xicluna
Florent Xicluna added the comment: There's some reference leaking somewhere... Will investigate.
~ $ ./python Lib/test/regrtest.py -R 2:3: test_unicode
test_unicode leaked [7, 7, 7] references, sum=21

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15730/stringlib_split_replace_v3.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Florent Xicluna
Florent Xicluna added the comment: Added "Makefile.pre.in". -- Added file: http://bugs.python.org/file15732/stringlib_split_replace_v3b.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15727/stringlib_split_replace_v2.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Florent Xicluna
Florent Xicluna added the comment: Fixed the split(), splitlines() and partition() methods of mutable types, and added the optimization for all methods of immutable types. -- Added file: http://bugs.python.org/file15730/stringlib_split_replace_v3.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Ezio Melotti
Changes by Ezio Melotti: nosy: +ezio.melotti priority: -> normal

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Antoine Pitrou
Antoine Pitrou added the comment: The patch looks wrong for bytearrays. They are mutable, so you shouldn't return the original object as an optimization. Here is the current (unpatched) behaviour:
>>> a = bytearray(b"abc")
>>> b, = a.split()
>>> b is a
False
On the other hand, you aren't doi
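The aliasing concern can be checked from Python. Below is a minimal sketch, assuming a CPython build that provides bytearray; the helper name `split_returns_copy` is hypothetical and not part of the patch.

```python
def split_returns_copy(data: bytearray) -> bool:
    """Return True if split() hands back a new object, not `data` itself."""
    original = bytes(data)  # immutable snapshot for later comparison
    # With no whitespace in `data`, split() yields exactly one piece.
    (piece,) = data.split()
    # Mutating the piece must never change the original buffer.
    piece.extend(b"!")
    return piece is not data and bytes(data) == original


print(split_returns_copy(bytearray(b"abc")))  # True
```

For immutable types such as str, handing back the original object is safe because no caller can modify it, which is why the shortcut is only acceptable there.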

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Florent Xicluna
Florent Xicluna added the comment: You're right. Oops. -- Added file: http://bugs.python.org/file15727/stringlib_split_replace_v2.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Florent Xicluna
Changes by Florent Xicluna: Removed file: http://bugs.python.org/file15726/stringlib_split_replace.diff

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Antoine Pitrou
Antoine Pitrou added the comment: The "split.h" file is missing from your patch. -- nosy: +pitrou

[issue7622] [patch] improve unicode methods: split() rsplit() and replace()

2010-01-03 Thread Florent Xicluna
New submission from Florent Xicluna: Content of the patch:
- removed code duplication between bytearray/string/unicode
- new header "stringlib/split.h" with common methods: stringlib_split/_rsplit/_splitlines
- added "maxcount" argument to "stringlib_count"
- better performance for split
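A "maxcount" argument is useful because a caller such as split() or replace() with a limit only cares about the first N occurrences, so counting can stop early. The Python sketch below illustrates the idea only; `count_with_max` is a hypothetical helper and does not mirror the C signature of stringlib_count.

```python
def count_with_max(text: str, sub: str, maxcount: int) -> int:
    """Count non-overlapping occurrences of `sub`, stopping at `maxcount`."""
    if not sub:
        raise ValueError("empty separator")
    count = 0
    pos = 0
    while count < maxcount:
        pos = text.find(sub, pos)
        if pos < 0:
            break  # no further occurrence
        count += 1
        pos += len(sub)  # skip past this match; matches do not overlap
    return count


print(count_with_max("a,b,c,d", ",", 2))   # 2: stops after two matches
print(count_with_max("a,b,c,d", ",", 10))  # 3: only three commas exist
```

With the early stop, the cost is bounded by the position of the N-th match rather than by the full length of the input.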