Re: "convert" string to bytes without changing data (encoding)

2012-08-29 Thread Nobody
On Wed, 29 Aug 2012 19:39:15 -0400, Piet van Oostrum wrote: >> Reading from stdin/a file gets you bytes, and not a string, because >> Python cannot automagically guess what format the input is in. >> > Huh? Oh, it can certainly guess (in the absence of any other information, it uses the current l

Re: "convert" string to bytes without changing data (encoding)

2012-08-29 Thread Piet van Oostrum
Heiko Wundram writes: > Reading from stdin/a file gets you bytes, and > not a string, because Python cannot automagically guess what format the > input is in. > Huh? Python 3.3.0rc1 (v3.3.0rc1:8bb5c7bc46ba, Aug 25 2012, 10:09:29) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help"

Re: "convert" string to bytes without changing data (encoding)

2012-08-29 Thread Piet van Oostrum
Ross Ridge writes: > > But it is in fact only stored in one particular way, as a series of bytes. > No, it can be stored in different ways. Certainly in Python 3.3 and beyond. And in 3.2 also, depending on wide/narrow build. -- Piet van Oostrum WWW: http://pietvanoostrum.com/ PGP key: [8DAE142B

Re: "convert" string to bytes without changing data (encoding)

2012-03-30 Thread Chris Angelico
On Sat, Mar 31, 2012 at 6:06 AM, Serhiy Storchaka wrote: > 28.03.12 21:13, Heiko Wundram wrote: > >> Reading from stdin/a file gets you bytes, and >> not a string, because Python cannot automagically guess what format the >> input is in. > > > In Python3 reading from stdin gets you a string. U

Re: "convert" string to bytes without changing data (encoding)

2012-03-30 Thread Serhiy Storchaka
28.03.12 21:13, Heiko Wundram wrote: Reading from stdin/a file gets you bytes, and not a string, because Python cannot automagically guess what format the input is in. In Python3 reading from stdin gets you a string. Use sys.stdin.buffer.raw for access to the byte stream. And reading from file
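For illustration, a minimal sketch of the text layer versus binary layer described in this message. It uses in-memory streams as stand-ins so that it runs without real input; with actual standard input the binary layer would be sys.stdin.buffer. The sample bytes and the choice of latin-1 are assumptions made for the example, not something taken from the thread.

```python
import io

# Stand-in for a binary input stream. With real standard input this would be
# sys.stdin.buffer, the binary layer underneath the text-mode sys.stdin.
raw_stream = io.BytesIO(b"caf\xe9\n")  # Latin-1 bytes, not valid UTF-8

# Reading the binary layer yields bytes, untouched.
raw = raw_stream.read()

# A text layer on top of the same bytes has to be told which encoding to use.
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="latin-1", newline="")
text = text_stream.read()

print(type(raw).__name__, type(text).__name__)
```

The text layer only produces a str because an encoding was named; the bytes underneath are the same in both reads.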

Re: "convert" string to bytes without changing data (encoding)

2012-03-30 Thread Michael Ströder
Steven D'Aprano wrote: > On Thu, 29 Mar 2012 17:36:34 +, Prasad, Ramit wrote: > Technically, ASCII goes up to 256 but they are not A-z letters. >>> Technically, ASCII is 7-bit, so it goes up to 127. >> >>> No, ASCII only defines 0-127. Values >=128 are not ASCII. >>> >>> From https

Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Steven D'Aprano
On Thu, 29 Mar 2012 11:30:19 -0400, Ross Ridge wrote: > Steven D'Aprano wrote: >>Your reaction is to make an equally unjustified estimate of Evan's >>mindset, namely that he is not just wrong about you, but *deliberately >>and maliciously* lying about you in the full knowledge that he is wrong.

Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Steven D'Aprano
On Thu, 29 Mar 2012 17:36:34 +, Prasad, Ramit wrote: >> > Technically, ASCII goes up to 256 but they are not A-z letters. >> > >> Technically, ASCII is 7-bit, so it goes up to 127. > >> No, ASCII only defines 0-127. Values >=128 are not ASCII. >> >> From https://en.wikipedia.org/wiki/ASCII

Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Chris Angelico
On Fri, Mar 30, 2012 at 5:00 AM, Ross Ridge wrote: > Sorry, it would've been more accurate to label the flavour of kool-aid > Chris Angelico was trying to push as "it's impossible ... without > encoding": > >        What is a string? It's not a series of bytes. You can't convert >        it withou

Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Ross Ridge
Ross Ridge wrote: > Just because I refuse to drink the > "it's impossible to represent strings as a series of bytes" kool-aid Terry Reedy wrote: >I do not believe *anyone* has made that claim. Is this meant to be a >wild exaggeration? As wild as Evan's? Sorry, it would've been more accurate to

RE: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Prasad, Ramit
> On 28/03/2012 20:02, Prasad, Ramit wrote: > >&

Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Terry Reedy
On 3/29/2012 11:30 AM, Ross Ridge wrote: No, Evan in his own words admitted that his post was meant to be harsh, I agree that he should have restrained and censored his writing. Just because I refuse to drink the > "it's impossible to represent strings as a series of bytes" kool-aid I do no

Re: Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Evan Driscoll
On 01/-10/-28163 01:59 PM, Ross Ridge wrote: Evan Driscoll wrote: People like you -- who write to assumptions which are not even remotely guaranteed by the spec -- are part of the reason software sucks. ... This email is a bit harsher than it deserves -- but I feel not by much. I don't see

Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Ross Ridge
Steven D'Aprano wrote: >Your reaction is to make an equally unjustified estimate of Evan's >mindset, namely that he is not just wrong about you, but *deliberately >and maliciously* lying about you in the full knowledge that he is wrong. No, Evan in his own words admitted that his post was men

Re: "convert" string to bytes without changing data (encoding)

2012-03-29 Thread Peter Daum
On 2012-03-28 23:37, Terry Reedy wrote: > 2. Decode as if the text were latin-1 and ignore the non-ascii 'latin-1' > chars. When done, encode back to 'latin-1' and the non-ascii chars will > be as they originally were. ... actually, in the beginning of my quest, I ran into a decoding exception tr
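The latin-1 round trip suggested in the quoted text can be sketched as follows. Latin-1 maps each byte value 0-255 to the code point with the same number, so decoding never fails and encoding back restores the original bytes. The sample data and the ASCII-only edit are assumptions chosen for illustration.

```python
# Every possible byte value, to show that nothing is rejected or altered.
data = bytes(range(256))

# Decode "as if" the data were Latin-1: byte N becomes code point N.
text = data.decode("latin-1")

# Work only on the ASCII portion of the text (here: lowercase every 'A').
edited = text.replace("A", "a")

# Encode back; the non-ASCII bytes come out exactly as they went in.
result = edited.encode("latin-1")

print(result == data.replace(b"A", b"a"))
```

This only preserves the bytes; it does not make the non-ASCII characters meaningful unless the data really was Latin-1.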

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 23:58:53 -0400, Ross Ridge wrote: > How does that in any way justify Evan Driscoll maliciously lying about > code he's never seen? You are perfectly justified to complain about Evan making sweeping generalisations about your code when he has not seen it; you are NOT justified

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Mark Lawrence
On 29/03/2012 04:58, Ross Ridge wrote: Chris Angelico wrote: Actually, he is justified. It's one thing to work in C or assembly and write code that depends on certain bit-pattern representations of data (although even that causes trouble - assuming that sizeof(int)==sizeof(int*) isn't good

Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Chris Angelico wrote: >Actually, he is justified. It's one thing to work in C or assembly and >write code that depends on certain bit-pattern representations of data >(although even that causes trouble - assuming that >sizeof(int)==sizeof(int*) isn't good for portability), but in a high >leve

Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Chris Angelico
On Thu, Mar 29, 2012 at 2:04 PM, Ross Ridge wrote: > Evan Driscoll   wrote: >>People like you -- who write to assumptions which are not even remotely >>guaranteed by the spec -- are part of the reason software sucks. > ... >>This email is a bit harsher than it deserves -- but I feel not by much. >

Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Evan Driscoll wrote: >People like you -- who write to assumptions which are not even remotely >guaranteed by the spec -- are part of the reason software sucks. ... >This email is a bit harsher than it deserves -- but I feel not by much. I don't see how you could feel the least bit justified. We

Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Evan Driscoll
On 3/28/2012 14:43, Ross Ridge wrote: > Evan Driscoll wrote: >> So yes, you can say that pretending there's not a mapping of strings to >> internal representation is silly, because there is. However, there's >> nothing you can say about that mapping. > > I'm not the one labeling anything as be

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 15:43:31 -0400, Ross Ridge wrote: > I can in > fact say what the internal byte string representation of strings is in any > given build of Python 3. Don't keep us in suspense! Given: Python 3.2.2 (default, Mar 4 2012, 10:50:33) [GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Terry Reedy
On 3/28/2012 1:43 PM, Peter Daum wrote: The longer story of my question is: I am new to python (obviously), and since I am not familiar with either one, I thought it would be advisable to go for python 3.x. I strongly agree with that unless you have reason to use 2.7. Python 3.3 (.0a1 in nearl

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Neil Cerutti
On 2012-03-28, Ross Ridge wrote: > Evan Driscoll wrote: >> So yes, you can say that pretending there's not a mapping of >> strings to internal representation is silly, because there is. >> However, there's nothing you can say about that mapping. > > I'm not the one labeling anything as being sil

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Mark Lawrence
On 28/03/2012 20:43, Ross Ridge wrote: Evan Driscoll wrote: So yes, you can say that pretending there's not a mapping of strings to internal representation is silly, because there is. However, there's nothing you can say about that mapping. I'm not the one labeling anything as being silly. I

Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Evan Driscoll wrote: >So yes, you can say that pretending there's not a mapping of strings to >internal representation is silly, because there is. However, there's >nothing you can say about that mapping. I'm not the one labeling anything as being silly. I'm the one labeling the things as bul

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Grant Edwards
On 2012-03-28, Prasad, Ramit wrote: > >>You can't generally just "deal with the ascii portions" without >>knowing something about the encoding. Say you encounter a byte >>greater than 127. Is it a single non-ASCII character, or is it the >>leading byte of a multi-byte character? If the next ch

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread MRAB
On 28/03/2012 20:02, Prasad, Ramit wrote: >The right way to convert bytes to strings, and vice versa, is via >encoding and decoding operations. If you want to dictate to the original poster the correct way to do things then you don't need to do anything more than that. You don't need to pretend

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Grant Edwards
On 2012-03-28, Steven D'Aprano wrote: > On Wed, 28 Mar 2012 19:43:36 +0200, Peter Daum wrote: > >> The longer story of my question is: I am new to python (obviously), and >> since I am not familiar with either one, I thought it would be advisable >> to go for python 3.x. The biggest problem that I

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread John Nagle
On 3/28/2012 10:43 AM, Peter Daum wrote: On 2012-03-28 12:42, Heiko Wundram wrote: On 28.03.2012 11:43, Peter Daum wrote: The longer story of my question is: I am new to python (obviously), and since I am not familiar with either one, I thought it would be advisable to go for python 3.x. The

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ethan Furman
Prasad, Ramit wrote: You can read as bytes and decode as ASCII but ignoring the troublesome non-text characters: print(open('text.txt', 'br').read().decode('ascii', 'ignore')) Das fr ASCII nicht benutzte Bit kann auch fr Fehlerkorrekturzwecke (Parittsbit) auf den Kommunikationsleitungen oder f
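As the stripped umlauts in the quoted output show, decoding with the 'ignore' error handler silently drops every non-ASCII byte. A hedged sketch of the difference between that and the 'surrogateescape' error handler (available since Python 3.1), which keeps the undecodable bytes so they can be written back unchanged. The sample sentence and its Latin-1 encoding are assumptions for the example; the thread does not say how text.txt was encoded.

```python
# Sample bytes with one non-ASCII byte (0xFC, u-umlaut in Latin-1).
raw = "Das f\u00fcr ASCII nicht benutzte Bit".encode("latin-1")

# 'ignore' discards the byte for good: the result cannot be turned back.
lossy = raw.decode("ascii", "ignore")

# 'surrogateescape' smuggles each undecodable byte through as a lone
# surrogate code point, so the same handler can restore it on encode.
kept = raw.decode("ascii", "surrogateescape")
restored = kept.encode("ascii", "surrogateescape")

print(lossy)
print(restored == raw)
```

The escaped string is fine for ASCII-level processing, but printing it or encoding it without the same error handler will fail on the surrogate.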

RE: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Prasad, Ramit
> >The right way to convert bytes to strings, and vice versa, is via > >encoding and decoding operations. > > If you want to dictate to the original poster the correct way to do > things then you don't need to do anything more than that. You don't need to > pretend like Chris Angelico that there's isn

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Albert W. Hopkins
On Wed, 2012-03-28 at 14:05 -0400, Ross Ridge wrote: > Ross Ridge wrote: > > Of course it is. Conceptually you're not supposed to think of it that > > way, but a string is stored in memory as a series of bytes. > > Chris Angelico wrote: > >Note that distinction. I said that a string "is not" a

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Tim Chase wrote: >Internally, they're a series of bytes, but they are MEANINGLESS >bytes unless you know how they are encoded internally. Those >bytes could be UTF-8, UTF-16, UTF-32, or any of a number of other >possible encodings[1]. If you get the internal byte stream, >there's no way to

Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Evan Driscoll
On 01/-10/-28163 01:59 PM, Ross Ridge wrote: Steven D'Aprano wrote: The right way to convert bytes to strings, and vice versa, is via encoding and decoding operations. If you want to dictate to the original poster the correct way to do things then you don't need to do anything more than that. You

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Tim Chase
On 03/28/12 13:05, Ross Ridge wrote: Ross Ridge wrote: But a Python Unicode string might be stored in several ways; for all you know, it might actually be stored as a sequence of apples in a refrigerator, just as long as they can be referenced correctly. But it is in fact only stored in one part

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ethan Furman
Peter Daum wrote: On 2012-03-28 12:42, Heiko Wundram wrote: On 28.03.2012 11:43, Peter Daum wrote: ... in my example, the variable s points to a "string", i.e. a series of bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters. No; a string contains a series of codepoints from the un

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Steven D'Aprano wrote: >The right way to convert bytes to strings, and vice versa, is via >encoding and decoding operations. If you want to dictate to the original poster the correct way to do things then you don't need to do anything more than that. You don't need to pretend like Chris Angelico th

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 19:43:36 +0200, Peter Daum wrote: > The longer story of my question is: I am new to python (obviously), and > since I am not familiar with either one, I thought it would be advisable > to go for python 3.x. The biggest problem that I am facing is, that I am > often dealing with

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Terry Reedy
On 3/28/2012 11:36 AM, Ross Ridge wrote: Chris Angelico wrote: What is a string? It's not a series of bytes. Of course it is. Conceptually you're not supposed to think of it that way, but a string is stored in memory as a series of bytes. *If* it is stored in byte memory. If you execute

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ian Kelly
On Wed, Mar 28, 2012 at 11:43 AM, Peter Daum wrote: > ... I was under the illusion, that python (like e.g. perl) stored > strings internally in utf-8. In this case the "conversion" would simply > mean to re-label the data. Unfortunately, as I meanwhile found out, this > is not the case (nor the "a

RE: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Prasad, Ramit
> You can read as bytes and decode as ASCII but ignoring the troublesome > non-text characters: > > >>> print(open('text.txt', 'br').read().decode('ascii', 'ignore')) > Das fr ASCII nicht benutzte Bit kann auch fr Fehlerkorrekturzwecke > (Parittsbit) auf den Kommunikationsleitungen oder fr andere

RE: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Prasad, Ramit
> As it seems, this would be far easier with python 2.x. With python 3 > and its strict distinction between "str" and "bytes", things get > syntactically pretty awkward and error-prone (something as innocent-looking as > "s=s+'/'" hidden in a rarely reached branch and a > seemingly correct pro

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Jussi Piitulainen
Peter Daum writes: > ... I was under the illusion, that python (like e.g. perl) stored > strings internally in utf-8. In this case the "conversion" would simply > mean to re-label the data. Unfortunately, as I meanwhile found out, this > is not the case (nor the "apple encoding" ;-), so it would i

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 11:43:52 +0200, Peter Daum wrote: > ... in my example, the variable s points to a "string", i.e. a series of > bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters. No. Strings are not sequences of bytes (except in the trivial sense that everything in computer memor
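A small sketch of the point being made here: a str is a sequence of code points, and the bytes only come into existence once an encoding is chosen. The sample word is an assumption for illustration; the encodings are standard codec names.

```python
s = "na\u00efve"  # five characters, one of them non-ASCII

# The string itself exposes code points, not bytes.
code_points = [ord(ch) for ch in s]

# The same five code points become different byte sequences,
# of different lengths, depending on the encoding chosen.
utf8 = s.encode("utf-8")
utf32 = s.encode("utf-32-le")

print(len(s), len(utf8), len(utf32))
```

Neither byte sequence is "the" raw data of the string; each is one possible serialization of it.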

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Ross Ridge wrote: > Of course it is. Conceptually you're not supposed to think of it that > way, but a string is stored in memory as a series of bytes. Chris Angelico wrote: >Note that distinction. I said that a string "is not" a series of >bytes; you say that it "is stored" as bytes. The dist

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Heiko Wundram
On 28.03.2012 19:43, Peter Daum wrote: As it seems, this would be far easier with python 2.x. With python 3 and its strict distinction between "str" and "bytes", things get syntactically pretty awkward and error-prone (something as innocent-looking as "s=s+'/'" hidden in a rarely reached b

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 11:36:10 -0400, Ross Ridge wrote: > Chris Angelico wrote: >>What is a string? It's not a series of bytes. > > Of course it is. Conceptually you're not supposed to think of it that > way, but a string is stored in memory as a series of bytes. You don't know that. They might

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Peter Daum
On 2012-03-28 12:42, Heiko Wundram wrote: > On 28.03.2012 11:43, Peter Daum wrote: >> ... in my example, the variable s points to a "string", i.e. a series of >> bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters. > > No; a string contains a series of codepoints from the unicode plan

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Dave Angel
On 03/28/2012 04:56 AM, Peter Daum wrote: Hi, is there any way to convert a string to bytes without interpreting the data in any way? Something like: s='abcde' b=bytes(s, "unchanged") Regards, Peter You needed to specify that you are using Python 3.x . In pyt

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Grant Edwards
On 2012-03-28, Chris Angelico wrote: > for all you know, it might actually be stored as a sequence of > apples in a refrigerator [...] > There's no logical Python way to turn that into a series of bytes. There's got to be a joke there somewhere about how to eat an apple... -- Grant Edwards

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Chris Angelico
On Thu, Mar 29, 2012 at 2:36 AM, Ross Ridge wrote: > Chris Angelico   wrote: >>What is a string? It's not a series of bytes. > > Of course it is.  Conceptually you're not supposed to think of it that > way, but a string is stored in memory as a series of bytes. Note that distinction. I said that

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Chris Angelico wrote: >What is a string? It's not a series of bytes. Of course it is. Conceptually you're not supposed to think of it that way, but a string is stored in memory as a series of bytes. What he's asking for may not be very useful or practical, but if that's your problem here than

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Stefan Behnel
Peter Daum, 28.03.2012 11:43: > What I am looking for is a general way to just copy the raw data > from a "string" object to a "byte" object without any attempt to > "decode" or "encode" anything ... That's why I asked about your use case - where does the data come from and why is it contained in

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Heiko Wundram
On 28.03.2012 11:43, Peter Daum wrote: ... in my example, the variable s points to a "string", i.e. a series of bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters. No; a string contains a series of codepoints from the unicode plane, representing natural language characters (at l

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Peter Daum
On 2012-03-28 11:02, Chris Angelico wrote: > On Wed, Mar 28, 2012 at 7:56 PM, Peter Daum wrote: >> is there any way to convert a string to bytes without >> interpreting the data in any way? Something like: >> >> s='abcde' >> b=bytes(s, "unchanged") > > What is a string? It's not a series of bytes

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Stefan Behnel
Peter Daum, 28.03.2012 10:56: > is there any way to convert a string to bytes without > interpreting the data in any way? Something like: > > s='abcde' > b=bytes(s, "unchanged") If you can tell us what you actually want to achieve, i.e. why you want to do this, we may be able to tell you how to d

Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Chris Angelico
On Wed, Mar 28, 2012 at 7:56 PM, Peter Daum wrote: > Hi, > > is there any way to convert a string to bytes without > interpreting the data in any way? Something like: > > s='abcde' > b=bytes(s, "unchanged") What is a string? It's not a series of bytes. You can't convert it without encoding those

"convert" string to bytes without changing data (encoding)

2012-03-28 Thread Peter Daum
Hi, is there any way to convert a string to bytes without interpreting the data in any way? Something like: s='abcde' b=bytes(s, "unchanged") Regards, Peter -- http://mail.python.org/mailman/listinfo/python-list
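For reference, a minimal sketch of what the question runs into in Python 3. There is no codec named "unchanged", so the second argument to bytes() has to be a real encoding name, and the resulting bytes depend on which one is picked. The specific encodings shown are assumptions chosen for illustration.

```python
s = "abcde"

# An encoding must be named; for pure ASCII text several encodings agree.
b_ascii = s.encode("ascii")
b_latin1 = bytes(s, "latin-1")

# A different encoding gives different bytes for the same string.
b_utf16 = s.encode("utf-16-le")

# An unknown encoding name is rejected with LookupError.
try:
    bytes(s, "unchanged")
    unknown_accepted = True
except LookupError:
    unknown_accepted = False

print(b_ascii, b_utf16, unknown_accepted)
```

For ASCII-only strings the choice rarely matters in practice, which is probably why the conversion can look like a pure re-labelling.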