On Wednesday, 5 June 2013 at 9:03:41 AM UTC+3, Steven D'Aprano wrote:
Nikos wrote:
> > and the filename displayed by 'ls -l' was:
> > -rw-r--r-- 1 nikos nikos 3511233 Jun 4 14:11 \305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3

> > Is there no way at all to check the charset used to store it on the hard
> > disk? It should be UTF-8, but it doesn't look like it. Is there some Linux
> > command or some Python command that will print out the actual encoding
> > of '\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3' ?

> The Linux file system does not track encodings. It just stores bytes.
> There is no *reliable* way to guess the encoding that a bunch of bytes  
> came from. If your bytes look like 

> 0x48 0x65 0x6c 0x6c 0x6f 0x20 0x77 0x6f 0x72 0x6c 0x64 0x21

> (ASCII "Hello world!") then you might *guess* that the encoding is ASCII, 
> or UTF-8, or Latin-1. But in general, you can't go from the bytes to the 
> encoding. Encodings are out-of-band information.
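
To see the point about guessing for myself, here is a minimal sketch that decodes those same twelve bytes with three different charsets. The variable names are my own; only the byte values come from the example above.

```python
# The twelve bytes from the example above, written as one hex string.
data = bytes.fromhex("48656c6c6f20776f726c6421")

# Decode the same bytes with three different charsets.
decoded = {name: data.decode(name) for name in ("ascii", "utf-8", "latin-1")}

for name, text in decoded.items():
    print(name, "->", text)
```

All three charsets give the same text, so the bytes alone cannot say which one was used.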


Your explanation of encoding/decoding is excellent, Steven, and I am saving it!
So what I understand now is:

encoding = string -> (some charset used) -> charset bytes
decoding = bytes -> (have to know what charset has been used) -> original string

Have I understood correctly that the *key* to the whole encode/decode process 
is the charset that was used (or has to be used)?
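
A minimal sketch of that round trip, as I understand it. UTF-8 and Latin-1 are picked here only for illustration; they are not necessarily the charsets involved in my problem.

```python
text = 'Ευχή του Ιησού.mp3'

# encoding: string -> (charset) -> bytes
data = text.encode('utf-8')

# decoding with the same charset gives the original string back
round_trip = data.decode('utf-8')

# decoding with a different charset "works" but gives garbage (mojibake),
# because Latin-1 maps every byte to some character
mojibake = data.decode('latin-1')

print(round_trip == text)
print(mojibake == text)
```

So the bytes are only meaningful together with the charset that produced them.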

string = 'Ευχή του Ιησού.mp3'
the above string as bytes in an unknown charset = 
'\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3'

We don't know the key (charset) used, but we do know the original form of the 
string, so it occurred to me that if we write a Python script to decode the 
mojibake byte stream with every available charset, then at some point the 
original string will appear again!


Wouldn't you agree, Steven? Of course, even if that is likely to work, I don't 
know how to write it.
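
A rough sketch of what such a script might look like, under some assumptions: the function name find_encodings is made up, the list of candidate codecs is taken from the standard library's encodings.aliases table (which may not cover every codec), and the escaped spaces from the ls output are written as plain spaces in the bytes literal.

```python
import unicodedata
from encodings.aliases import aliases

# The filename bytes as shown by ls, with "\ " written as a plain space.
raw = b'\305\365\367\336 \364\357\365 \311\347\363\357\375.mp3'
expected = 'Ευχή του Ιησού.mp3'

def find_encodings(raw, expected):
    """Return the codec names that decode raw to the expected string."""
    target = unicodedata.normalize('NFC', expected)
    matches = []
    for name in sorted(set(aliases.values())):
        try:
            text = raw.decode(name)
        except (LookupError, ValueError):
            # LookupError: codec missing on this platform or not a text codec.
            # ValueError covers UnicodeDecodeError: bytes invalid in this charset.
            continue
        if unicodedata.normalize('NFC', text) == target:
            matches.append(name)
    return matches

print(find_encodings(raw, expected))
```

More than one name may come back, since several charsets share the same byte values for these letters. A match only shows that a charset is consistent with this one filename, not that it is the one that was really used.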


Here are the commands you asked for.
-----------------------------------------
ni...@superhost.gr [~/www/data/apps]# printf %q\n\n *
100\ Mythoi\ tou\ Aiswpou.pdfnn
Anekdotologio.exenn
Battleship.exenn
$'\323\352\335\370\357\365 \335\355\341\355 \341\361\351\350\354\374.exe'nn
Kosmas\ o\ Aitwlos\ -\ Profiteies.pdfnn
Luxor\ Evolved.exenn
Monopoly.exenn
$'\305\365\367\336 \364\357\365 \311\347\363\357\375.mp3'nn
Online\ Movie\ Player.zipnn
O\ Nomos\ tou\ Merfy\ v1-2-3.zipnn
Orthodoxo\ Imerologio.exenn
Pac-Man.exenn
Scrabble.exenn
To\ 1o\ mou\ vivlio\ gia\ to\ skaki.pdfnn
Vivlos\ gia\ Atheofovous.pdfnn
V-Radio\ v2.4.msinn
ni...@superhost.gr [~/www/data/apps]# ls -b *
100\ Mythoi\ tou\ Aiswpou.pdf*
Anekdotologio.exe*
Battleship.exe
\323\352\335\370\357\365\ \335\355\341\355\ \341\361\351\350\354\374.exe
Kosmas\ o\ Aitwlos\ -\ Profiteies.pdf*
Luxor\ Evolved.exe
Monopoly.exe
\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3
Online\ Movie\ Player.zip*
O\ Nomos\ tou\ Merfy\ v1-2-3.zip
Orthodoxo\ Imerologio.exe*
Pac-Man.exe
Scrabble.exe
To\ 1o\ mou\ vivlio\ gia\ to\ skaki.pdf*
Vivlos\ gia\ Atheofovous.pdf*
V-Radio\ v2.4.msi
ni...@superhost.gr [~/www/data/apps]#
-------------------------------
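
For the Python side, something like the following might give a comparable listing. It is only a sketch: raw_filenames is a name I made up, and it relies on os.listdir returning undecoded bytes when it is given a bytes path.

```python
import os

def raw_filenames(directory="."):
    """Return the directory's filenames as undecoded bytes, sorted."""
    # A bytes path makes os.listdir return bytes names, with no decoding step.
    return sorted(os.listdir(os.fsencode(directory)))

if __name__ == "__main__":
    for name in raw_filenames("."):
        # repr() shows non-ASCII bytes as \x.. escapes instead of guessing a charset
        print(repr(name))
```

The escapes come out in hex rather than octal, but they are the same bytes that ls -b shows.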

I uploaded the files via FileZilla with English-character names and then 
renamed them from CentOS. I did that to avoid renaming them from within 
Windows 8; I thought it was better to rename them from within Linux itself.
-- 
http://mail.python.org/mailman/listinfo/python-list
