[issue13538] Docstring of str() and/or behavior

2011-12-06 Thread Guillaume Bouchard

New submission from Guillaume Bouchard:

The docstring associated with str() says:

  str(string[, encoding[, errors]]) -> str
  
  Create a new string object from the given encoded string.
  encoding defaults to the current default string encoding.
  errors can be 'strict', 'replace' or 'ignore' and defaults to 'strict'.

Whereas the online documentation states::

  When only object is given, this returns its nicely printable representation.

My issue arose when I tried to convert bytes to str.

As stated in the documentation, and to avoid implicit behavior, converting str 
to bytes cannot be done without giving an encoding: you must use bytes(my_str, 
encoding=...) or my_str.encode(...), and bytes(my_str) raises a TypeError. But 
if you try to convert bytes to str using str(my_bytes), Python returns the 
so-called nicely printable representation of the bytes object.

For example::

  >>> bytes("foo")
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: string argument without an encoding
  >>> str(b"foo")
  "b'foo'"

For the sake of consistency and to avoid silent errors, I suggest that str() of 
a bytes object without an encoding raise an exception. I think that is usually 
what people want. If one wants a *nicely printable representation* of a bytes 
object, one can call repr() explicitly and will quickly see that what was just 
printed is wrong. But someone who wants to convert a bytes object to its 
unicode representation will prefer an exception to a silently failing 
conversion that yields a unicode string starting with "b'" and ending with "'".
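
For reference, the asymmetry described above can be reproduced with a short script. This is only a sketch for Python 3; the variable names are illustrative and not part of any API.

```python
# Sketch of the asymmetry between bytes() and str() in Python 3.
data = b"foo"

# str -> bytes without an encoding is refused with a TypeError.
try:
    bytes("foo")
    raised = False
except TypeError:
    raised = True

# bytes -> str without an encoding silently falls back to repr().
implicit = str(data)

# Explicit decoding is what is usually intended.
explicit_str = str(data, "ascii")
explicit_decode = data.decode("ascii")

print(raised)
print(repr(implicit))
print(repr(explicit_str))
```

Both explicit forms give the same text; only the call without an encoding produces the repr-style string.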

--
components: Interpreter Core
messages: 148914
nosy: Guillaume.Bouchard
priority: normal
severity: normal
status: open
title: Docstring of str() and/or behavior
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue13538>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13538] Docstring of str() and/or behavior

2011-12-06 Thread Guillaume Bouchard

Guillaume Bouchard added the comment:

> str always falls back to the repr; in general str(obj) should always return 
> some value, otherwise the assumptions of a *lot* of Python code would be 
> broken.

Perhaps it could raise a warning?

That is, the only reason the encoding argument exists is for the conversion of 
bytes (or something that looks like bytes) to str. Do you think it would be 
possible to special-case str() for bytes (and bytearray) with something like this:

  import warnings

  def str(object, encoding=None, errors=None):
      if encoding is not None:
          ...  # usual work (decode with the given encoding)
      else:
          if isinstance(object, (bytes, bytearray)):
              # Warn, then fall through to the usual repr-style result.
              warnings.warn('Converting bytes/bytearray to str without '
                            'encoding, it may not be what you expect')
          return object.__str__()

That said, adding warnings and special cases everywhere does not seem very 
Pythonic.
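
A runnable variant of the idea above, as a sketch only: the wrapper name `str_with_warning` and the choice of `RuntimeWarning` are assumptions made for illustration, and the real built-in is reached through the `builtins` module rather than being replaced.

```python
import builtins
import warnings


def str_with_warning(obj, encoding=None, errors="strict"):
    """Hypothetical wrapper illustrating the proposal; not the real str()."""
    if encoding is not None:
        # Usual work: decode the bytes-like object explicitly.
        return builtins.str(obj, encoding, errors)
    if isinstance(obj, (bytes, bytearray)):
        warnings.warn(
            "Converting bytes/bytearray to str without encoding, "
            "it may not be what you expect",
            RuntimeWarning,
            stacklevel=2,
        )
    # Fall back to the printable representation, as str() does today.
    return builtins.str(obj)


# Record warnings so the effect can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = str_with_warning(b"foo")

print(result)
print([w.category.__name__ for w in caught])
```

As far as I know, CPython's -b command-line option already emits a BytesWarning for str() on a bytes instance (and -bb turns it into an error), so an opt-in form of this check exists without any wrapper.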

> Do you want to propose a doc patch?

The docstring for str() should look something like this, in my Frenglish way 
of writing English::

  Create a new string object from the given encoded string.

  If object is a bytes, bytearray or buffer-like object, encoding and errors
  can be set. errors can be 'strict', 'replace' or 'ignore' and defaults to
  'strict'.

  WARNING: if encoding is not set, the object is converted to its nicely
  printable representation, which is totally different from what you may expect.
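
To illustrate what the errors argument mentioned in the docstring does, here is a small sketch. The sample bytes are an assumption chosen for the example: `b"caf\xe9"` is Latin-1 for "café" and is not valid UTF-8.

```python
# b"caf\xe9" is Latin-1 encoded text and is not valid UTF-8.
raw = b"caf\xe9"

# errors defaults to 'strict', so an invalid byte raises an exception.
try:
    str(raw, "utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for the invalid byte.
replaced = str(raw, "utf-8", "replace")

# 'ignore' drops the invalid byte.
ignored = str(raw, "utf-8", "ignore")

# Without an encoding nothing is decoded; the result is the repr().
no_encoding = str(raw)

print(strict_failed, repr(replaced), repr(ignored), repr(no_encoding))
```

Only the call without an encoding avoids decoding altogether, which is the case the proposed warning is about.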

Perhaps a warning could be added to the online documentation, such as::

  .. warning::
     When str() converts a bytes, bytearray or buffer-like object and
     *encoding* is not specified, the result is the nicely printable
     representation of the object, which is totally different from the
     unicode string obtained by decoding the object with a specified encoding.

Would you like a .diff against the current Mercurial repository?
