[issue7330] PyUnicode_FromFormat segfault

Ray.Allen Thu, 17 Feb 2011 06:08:52 -0800

Ray.Allen <ysj....@gmail.com> added the comment:

Thanks hyapo!


> It looks like your patch fixes #10829: you should add tests for that, you can 
> just reuse the tests of my patch (attached to #10829).

Sorry, but I think my patch doesn't fix #10829. It seems link another issue. 
And by applying my patch and add tests from #10829's patch, the tests cannot 
passed. Or did I missed something?


> You should also avoid the creation of a temporary unicode object (it can be 
> slow if precision is large) using PySequence_GetSlice(). Py_UNICODE_COPY() 
> does already truncate the string because you can pass an arbitrary length.

In order to use Py_UNICODE_COPY, I have to create a unicode object with 
required length first. I feel this have the same cost as using 
PySequence_GetSlice(). If I understand correctly?


> With your patch, "%.200s" truncates the input string to 200 *characters*, but 
> I think that it should truncate to 200 *bytes*, as printf does.

Sorry, I don't understand. The result of PyUnicode_FromFormatV() is a unicode 
object. Then how to truncate to 200 *bytes*? I think the %s formatter just 
indicate that the argument is c-style chars, the result is always unicode 
string, and the width and precision formatters are to applied after converting 
c-style chars to string. 


> I don't like this change because I hate having to compute manually strings 
> length. It should that it would be easier if you format directly strings with 
> width and precision at step 3, instead of doing it at step 4: so you can just 
> read the length of the formatted string, and it avoids having to handle 
> width/precision in two steps (which may be inconsistent :-/).

Do you mean combine step 3 and step 4 together? Currently step 3 is just to 
compute the biggest width value and step 4 is to compute exact width and do the 
real format work. Only by doing real format we can get the exact width of a 
string. So I have to compute each width twice in both step 3 and step 4. Is 
combining the two steps in to one a good idea?


> In my opinion, the patch is a little bit too big. We may first commit the fix 
> on the code parsing the width and precision: fix #10829?

Again, I guess #10829 need another its own patch to fix. 


> Can you add tests for "%.s"? I would like to know if "%.s" is different than 
> "%s" :-)

Err, '%.s' causes unexpected result both with and without my patch. Maybe it's 
still another bug?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7330>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue7330] PyUnicode_FromFormat segfault

Reply via email to