Terry Reedy writes:
> You need what is called, at least with Windows, an IME -- Input Method
> Editor.
For a GNOME or KDE environment you want an input method framework; I
recommend IBus <http://code.google.com/p/ibus/> which comes with the
major GNU+Linux operating systems http://oswatershed.org
On 5/14/2011 3:41 AM, harrismh777 wrote:
Terry Reedy wrote:
Easy, practical use of unicode is still a work in progress.
Apparently... the good news for me is that SBL provides their unicode
font here:
http://www.sbl-site.org/educational/biblicalfonts.aspx
I'm getting much closer here, but
On 14 mai, 09:41, harrismh777 wrote:
> ...
> I'm getting much closer here,
> ...
You should really understand that Unicode is a domain per
se. It is independent of any OS, programming language,
or application. It is up to these tools to be "unicode"
compliant.
Working in a full unicode mo
On Fri, 13 May 2011 14:53:50 -0500, harrismh777 wrote:
> The unicode consortium is very careful to make sure that thousands
> of symbols have a unique code point (that's great !) but how do these
> thousands of symbols actually get displayed if there is no font
> consortium? Are there collec
Terry Reedy wrote:
Is there a unix linux package that can be installed that
drops at least 'one' default standard font that will be able to render
all or 'most' (whatever I mean by that) code points in unicode? Is this
a Python issue at all?
Easy, practical use of unicode is still a work in pro
On 5/13/2011 3:53 PM, harrismh777 wrote:
The unicode consortium is very careful to make sure that thousands of
symbols have a unique code point (that's great !) but how do these
thousands of symbols actually get displayed if there is no font
consortium? Are there collections of 'standard' fonts
On 5/13/11 2:53 PM, harrismh777 wrote:
The unicode consortium is very careful to make sure that thousands of symbols
have a unique code point (that's great !) but how do these thousands of symbols
actually get displayed if there is no font consortium? Are there collections of
'standard' fonts fo
jmfauth wrote:
> to worry about encodings are when you're encoding unicode characters
> to byte strings, or decoding bytes to unicode characters
A small but important correction/clarification:
In Unicode, "unicode" does not encode a *character*. It
encodes a *code point*, a number, the integer as
On 12 mai, 18:17, Ian Kelly wrote:
> ...
> to worry about encodings are when you're encoding unicode characters
> to byte strings, or decoding bytes to unicode characters
A small but important correction/clarification:
In Unicode, "unicode" does not encode a *character*. It
encodes a *code poi
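The code point versus character distinction can be seen directly in Python 3. The following is only a minimal sketch using the standard unicodedata module; the sample strings are picked here for illustration and do not come from the thread.

```python
import unicodedata

pound = "\u00A3"                    # a single code point, U+00A3
code_point = ord(pound)             # the integer behind it
name = unicodedata.name(pound)      # its official Unicode name

# One user-perceived character may be made of two code points:
decomposed = "e\u0301"              # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single precomposed code point

print(hex(code_point), name)
print(len(decomposed), len(composed))
```

Here len() counts code points, which is why the decomposed and composed forms report different lengths even though they usually render the same.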
On Thu, May 12, 2011 at 2:42 PM, Terry Reedy wrote:
> On 5/12/2011 12:17 PM, Ian Kelly wrote:
>> Right. *Under the hood* Python uses UCS-2 (which is not exactly the
>> same thing as UTF-16, by the way) to represent Unicode strings.
>
> I know some people say that, but according to the definitions
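The practical difference between UCS-2 and UTF-16 shows up for code points above U+FFFF: UCS-2 has no way to represent them, while UTF-16 uses a pair of surrogate code units. A small sketch, assuming Python 3.3 or later (where the internal string representation was changed by PEP 393, after this thread was written); the sample code point is chosen here for illustration.

```python
ch = "\U0001F600"   # a code point outside the Basic Multilingual Plane

utf16 = ch.encode("utf-16-le")      # little-endian, no byte order mark

# Split the bytes into 16-bit code units.
units = [
    int.from_bytes(utf16[i:i + 2], "little")
    for i in range(0, len(utf16), 2)
]

print(len(ch))                      # number of code points in the string
print([hex(u) for u in units])      # the surrogate pair used by UTF-16
```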
On 5/12/2011 12:17 PM, Ian Kelly wrote:
On Thu, May 12, 2011 at 1:58 AM, John Machin wrote:
On Thu, May 12, 2011 4:31 pm, harrismh777 wrote:
So, the UTF-16 UTF-32 is INTERNAL only, for Python
NO. See one of my previous messages. UTF-16 and UTF-32, like UTF-8 are
encodings for the EXTERNAL
On Thu, May 12, 2011 at 1:58 AM, John Machin wrote:
> On Thu, May 12, 2011 4:31 pm, harrismh777 wrote:
>
>>
>> So, the UTF-16 UTF-32 is INTERNAL only, for Python
>
> NO. See one of my previous messages. UTF-16 and UTF-32, like UTF-8 are
> encodings for the EXTERNAL representation of Unicode charac
John Machin wrote:
> On Thu, May 12, 2011 2:14 pm, Benjamin Kaplan wrote:
>>
>> If the file you're writing to doesn't specify an encoding, Python will
>> default to locale.getdefaultencoding(),
>
> No such attribute. Perhaps you mean locale.getpreferredencoding()
what about sys.getfilesystemenco
On Thu, May 12, 2011 4:31 pm, harrismh777 wrote:
>
> So, the UTF-16 UTF-32 is INTERNAL only, for Python
NO. See one of my previous messages. UTF-16 and UTF-32, like UTF-8 are
encodings for the EXTERNAL representation of Unicode characters in byte
streams.
> I also was not aware that UTF-8 chars
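To illustrate the point about external representation: the same code point turns into different byte sequences depending on the encoding chosen at the boundary. A minimal sketch, assuming Python 3; the list of encodings is just a sample.

```python
pound = "\u00A3"    # POUND SIGN, one code point

# Encode the same one-character string with several encodings.
encoded = {
    enc: pound.encode(enc)
    for enc in ("utf-8", "utf-16-le", "utf-32-le", "latin-1")
}

for enc, data in encoded.items():
    print(enc, data, len(data))

# Decoding with the matching encoding gives back the original string.
roundtrip = all(data.decode(enc) == pound for enc, data in encoded.items())
print(roundtrip)
```

The string itself never changes; only the bytes written to a file or socket differ.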
Terry Reedy wrote:
It does not matter how Python stored the unicode internally. Does this
help? Your intent is signalled by how you open the file.
Very much, actually, thanks. I was missing the 'internal' piece, and
did not realize that if I didn't specify the encoding on the open that
pytho
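Signalling intent on open() might look like the sketch below. It assumes Python 3 and writes to a temporary directory; the file name "myfile" echoes the one used elsewhere in the thread, everything else is illustrative.

```python
import os
import tempfile

text = "pound sign \u00A3"
path = os.path.join(tempfile.mkdtemp(), "myfile")

# Text mode with an explicit encoding: str in, bytes on disk.
with open(path, mode="w", encoding="utf-8") as f:
    f.write(text)

# Binary mode shows the bytes that were actually written.
with open(path, mode="rb") as f:
    raw = f.read()

# Reading back in text mode with the same encoding restores the str.
with open(path, mode="r", encoding="utf-8") as f:
    decoded = f.read()

print(raw)
print(decoded == text)
```

Leaving out the encoding argument makes open() fall back to a platform-dependent default, so naming it explicitly is usually the safer choice.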
Ben Finney wrote:
I'd phrase that as:
* Text is a sequence of characters. Most inputs to the program,
including files, sockets, etc., contain a sequence of bytes.
* Always know whether you're dealing with text or with bytes. No object
can be both.
* In Python 2, ‘str’ is the type f
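In Python 3 terms, the rule "no object can be both" is enforced by the language itself. A short sketch, assuming Python 3; the sample word is chosen here for illustration.

```python
text = "caf\u00e9"              # str: a sequence of code points
data = text.encode("utf-8")     # bytes: a sequence of integers 0..255

# Mixing the two types is an error rather than a silent conversion.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(type(text).__name__, len(text))
print(type(data).__name__, len(data))
print(mixed_ok)
```

Converting between the two always goes through an explicit encode() or decode() call, which is where the encoding has to be named or assumed.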
John Machin wrote:
On Thu, May 12, 2011 2:14 pm, Benjamin Kaplan wrote:
If the file you're writing to doesn't specify an encoding, Python will
default to locale.getdefaultencoding(),
No such attribute. Perhaps you mean locale.getpreferredencoding()
>>> import locale
>>> locale.getpreferred
On Thu, May 12, 2011 2:14 pm, Benjamin Kaplan wrote:
>
> If the file you're writing to doesn't specify an encoding, Python will
> default to locale.getdefaultencoding(),
No such attribute. Perhaps you mean locale.getpreferredencoding()
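The functions mentioned here can be checked interactively. A minimal sketch, assuming Python 3; the values of the locale and filesystem encodings depend on the platform and its settings, so no particular result is shown for them.

```python
import locale
import sys

# locale has no getdefaultencoding(); that name lives in sys.
has_getdefault = hasattr(locale, "getdefaultencoding")

preferred = locale.getpreferredencoding()   # locale-based, typically the default for text files
fs_encoding = sys.getfilesystemencoding()   # used for file names
default = sys.getdefaultencoding()          # default for str.encode() / bytes.decode()

print(has_getdefault)
print(preferred, fs_encoding, default)
```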
On Wed, May 11, 2011 at 8:44 PM, harrismh777 wrote:
> Steven D'Aprano wrote:
>>>
>>> You need to understand the difference between characters and bytes.
>>
>> http://www.joelonsoftware.com/articles/Unicode.html
>>
>> is also a good resource.
>
> Thanks for being patient guys, here's what I've done
On Thu, May 12, 2011 1:44 pm, harrismh777 wrote:
> By
> default it looks like Python3 is writing output with UTF-8 as default...
> and I thought that by default Python3 was using either UTF-16 or UTF-32.
> So, I'm confused here... also, I used the character sequence \u00A3
> which I thought was UT
On 5/11/2011 11:44 PM, harrismh777 wrote:
Steven D'Aprano wrote:
You need to understand the difference between characters and bytes.
http://www.joelonsoftware.com/articles/Unicode.html
is also a good resource.
Thanks for being patient guys, here's what I've done:
astr="pound sign"
asym="
MRAB writes:
> You need to understand the difference between characters and bytes.
Yep. Those who don't need to join us in the third millennium, and the
resources pointed out in this thread are good to help that.
> A string contains characters, a file contains bytes.
That's not true for Python
On Thu, May 12, 2011 11:22 am, harrismh777 wrote:
> John Machin wrote:
>> (1) You cannot work without using bytes sequences. Files are byte
>> sequences. Web communication is in bytes. You need to (know / assume / be
>> able to extract / guess) the input encoding. You need to encode your
>> outp
Steven D'Aprano wrote:
You need to understand the difference between characters and bytes.
http://www.joelonsoftware.com/articles/Unicode.html
is also a good resource.
Thanks for being patient guys, here's what I've done:
astr="pound sign"
asym=" \u00A3"
afile=open("myfile", mode='w')
afil
On Thu, 12 May 2011 03:31:18 +0100, MRAB wrote:
>> Another question... in mail I'm receiving many small blocks that look
>> like sprites with four small hex codes, scattered about the mail...
>> mostly punctuation, maybe? ... guessing, are these unicode code points,
>> and if so what is the best w
On 12/05/2011 02:22, harrismh777 wrote:
John Machin wrote:
(1) You cannot work without using bytes sequences. Files are byte
sequences. Web communication is in bytes. You need to (know / assume / be
able to extract / guess) the input encoding. You need to encode your
output using an encoding tha
John Machin wrote:
(1) You cannot work without using bytes sequences. Files are byte
sequences. Web communication is in bytes. You need to (know / assume / be
able to extract / guess) the input encoding. You need to encode your
output using an encoding that is expected by the consumer (or use an
On Thu, May 12, 2011 8:51 am, harrismh777 wrote:
> Is it true that if I am
> working without using bytes sequences that I will not need to care about
> the encoding anyway, unless of course I need to specify a unicode code
> point?
Quite the contrary.
(1) You cannot work without using bytes seque
Ian Kelly wrote:
Ian, Benjamin, thanks much.
The `unicode' class was renamed to `str', and a stripped-down version
of the 2.X `str' class was renamed to `bytes'.
... thank you, this is very helpful.
> If I do not specify any code points above ascii 0xFF does any of this
> matter
On Wed, May 11, 2011 at 2:37 PM, harrismh777 wrote:
> hi folks,
> I am puzzled by unicode generally, and within the context of python
> specifically. For one thing, what do we mean that unicode is used in python
> 3.x by default. (I know what default means, I mean, what changed?)
>
> I think p
On Wed, May 11, 2011 at 3:37 PM, harrismh777 wrote:
> hi folks,
> I am puzzled by unicode generally, and within the context of python
> specifically. For one thing, what do we mean that unicode is used in python
> 3.x by default. (I know what default means, I mean, what changed?)
The `unicode'
hi folks,
I am puzzled by unicode generally, and within the context of python
specifically. For one thing, what do we mean that unicode is used in
python 3.x by default. (I know what default means, I mean, what changed?)
I think part of my problem is that I'm spoiled (American, ascii
he