STINNER Victor <victor.stin...@haypocalc.com> added the comment: > > It looks like your patch fixes #10829: you should add tests for that, you > > can just reuse the tests of my patch (attached to #10829). > > Sorry, but I think my patch doesn't fix #10829.
Ah ok, so don't add failing tests :-) > > You should also avoid the creation of a temporary unicode object (it can be > > slow if precision is large) using PySequence_GetSlice(). Py_UNICODE_COPY() > > does already truncate the string because you can pass an arbitrary length. > > In order to use Py_UNICODE_COPY, I have to create a unicode object with > required length first. No you don't. You can copy a substring of the input string with Py_UNICODE_COPY: just pass a smaller length. > > With your patch, "%.200s" truncates the input string to 200 *characters*, > > but I think that it should truncate to 200 *bytes*, as printf does. > > Sorry, I don't understand. The result of PyUnicode_FromFormatV() is a unicode > object. Then how to truncate to 200 *bytes*? You can truncate the input char* on the call to PyUnicode_DecodeUTF8: pass a size smaller than strlen(s). case 's': { /* UTF-8 */ const char *s = va_arg(count, const char*); PyObject *str = PyUnicode_DecodeUTF8(s, strlen(s), "replace"); if (!str) goto fail; n += PyUnicode_GET_SIZE(str); /* Remember the str and switch to the next slot */ *callresult++ = str; break; } I don't know if we should truncate to a number of bytes, or a number of characters. > > I don't like this change because I hate having to compute manually strings > > length. It should that it would be easier if you format directly strings > > with width and precision at step 3, instead of doing it at step 4: so you > > can just read the length of the formatted string, and it avoids having to > > handle width/precision in two steps (which may be inconsistent :-/). > > Do you mean combine step 3 and step 4 together? Currently step 3 is just to > compute the biggest width value and step 4 is to compute exact width and do > the real format work. Only by doing real format we can get the exact width of > a string. So I have to compute each width twice in both step 3 and step 4. Is > combining the two steps in to one a good idea? "Do you mean combine step 3 and step 4 together?" Yes, but I am no more sure that it is the right thing to do. > > Can you add tests for "%.s"? I would like to know if "%.s" is different > > than "%s" :-) > > Err, '%.s' causes unexpected result both with and without my patch. Maybe > it's still another bug? If the fix (always have the same behaviour) is short, it would be nice to include it in your patch. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue7330> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com