> >>>> s = 'α' 
> >>>> s.encode('utf-8') 
> > b'\xce\xb1' 

Does 'b' stand for binary?
Is b'\xce\xb1' a sequence of bytes shown in hexadecimal notation?
If so, how can we see it in binary and decimal representation?
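
For reference, a minimal sketch of one way to view the same bytes in decimal, hexadecimal and binary. In Python 3 the b prefix marks a bytes literal, and iterating over a bytes object yields integers from 0 to 255. The variable names below are only illustrative.

```python
s = 'α'
encoded = s.encode('utf-8')   # a bytes object holding two bytes

# Iterating over bytes gives plain integers, i.e. the decimal values.
decimal_values = list(encoded)

# format() renders each integer in the requested base, zero-padded.
hex_values = [format(byte, '02x') for byte in encoded]
binary_values = [format(byte, '08b') for byte in encoded]

print(decimal_values)   # [206, 177]
print(hex_values)       # ['ce', 'b1']
print(binary_values)    # ['11001110', '10110001']
```

So \xce and \xb1 are just the hexadecimal spelling of the byte values 206 and 177.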
  
> > I see that the encoding of this char takes 2 bytes. But why two exactly? 
> > How do i calculate how many bits are needed to store this char into bytes? 
  
> Because utf-8 takes 1 to 4 bytes to encode characters 

Since 2^8 = 256, utf-8 should store the first 256 chars of the Unicode
charset using 1 byte.

Also, since 2^16 = 65536, utf-8 should store the first 65536 chars of the
Unicode charset using 2 bytes, and so on.

But I know that this is not the case, and I don't understand why.
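
A sketch that may help with the byte-count question. In UTF-8 not every bit of a byte carries character data: the leading bits mark whether a byte is a single-byte character (0xxxxxxx), the start of a longer sequence (110xxxxx, 1110xxxx, 11110xxx) or a continuation byte (10xxxxxx). That leaves 7 payload bits in 1 byte, 11 in 2 bytes, 16 in 3 bytes and 21 in 4 bytes. The names `boundaries`, `lengths`, `first` and `second` are only illustrative.

```python
# Code points on either side of each UTF-8 length boundary.
boundaries = [0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]
lengths = {cp: len(chr(cp).encode('utf-8')) for cp in boundaries}

for cp, n in lengths.items():
    print(f'U+{cp:04X} -> {n} byte(s)')

# 'α' is code point 945 (U+03B1), which needs more than 7 bits
# but fits in the 11 payload bits of a 2-byte sequence.
cp = ord('α')
first = 0b11000000 | (cp >> 6)           # 110xxxxx: marker + top 5 bits
second = 0b10000000 | (cp & 0b00111111)  # 10xxxxxx: marker + low 6 bits

print(bytes([first, second]))            # b'\xce\xb1'
```

The hand-built pair matches what s.encode('utf-8') returned for 'α' above.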


> >>>> s = 'a' 
> >>>> s.encode('utf-8') 
> > b'a' 
> utf-8 takes ASCII as it is, as 1 byte. They are the same 

EBCDIC, ASCII and Unicode are character sets, correct?

iso-8859-1, iso-8859-7, utf-8, utf-16, utf-32 and so on are encoding methods, 
right?
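
One way to see the difference between the two ideas, as a rough sketch: the code point of a character is a single number in the Unicode character set, while each encoding turns that number into a different byte sequence (or fails if it cannot represent the character). The variable names are only illustrative, and the little-endian variants are used so that no byte order mark is added.

```python
s = 'α'
code_point = ord(s)   # the character's number in Unicode, independent of encoding

as_utf8 = s.encode('utf-8')
as_greek = s.encode('iso-8859-7')    # single-byte Greek encoding
as_utf16 = s.encode('utf-16-le')
as_utf32 = s.encode('utf-32-le')

print(code_point)   # 945
print(as_utf8)      # b'\xce\xb1'
print(as_greek)     # b'\xe1'
print(as_utf16)     # b'\xb1\x03'
print(as_utf32)     # b'\xb1\x03\x00\x00'

# iso-8859-1 has no slot for this character, so encoding raises an error.
try:
    s.encode('iso-8859-1')
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)    # False
```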
-- 
http://mail.python.org/mailman/listinfo/python-list
