On Thursday, 6 June 2013 10:42:25 PM UTC+3, MRAB wrote:
> On 06/06/2013 19:13, [name unreadable] wrote:
>
> On Thursday, 6 June 2013 3:50:52 UTC+3, MRAB wrote:
> 
> > If you're happy for that change to happen, then go ahead.
> 
> I have made some modifications to the code you provided, but I think
> something that doesn't occur to me needs fixing.
>
>
> For example, I switched:
> 
> # Give the path as a bytestring so that we'll get the filenames as bytestrings
> path = b"/home/nikos/public_html/data/apps/" 
> 
> # Walk through the files. 
> for root, dirs, files in os.walk( path ): 
>         for filename in files: 
> 
> to:
> 
> # Give the path as a bytestring so that we'll get the filenames as bytestrings
> path = os.listdir( b'/home/nikos/public_html/data/apps/' )
> 
>     
>     os.listdir returns a list of the names of the objects in the given
>     directory.
> 
>     
> 
>     
>       # iterate over all filenames in the apps directory
> 
>     
>     Exactly, all the names.
> 
>     
> 
>     
>       for fullpath in path
>       # Grabbing just the filename from path
> 
>     
>     The name is a bytestring. Note, name, NOT full path.
> 
>     
> 
>     The following line will fail because the name is a bytestring,
>     and you can't mix bytestrings with Unicode strings:
> 
>     
>       filename = fullpath.replace( '/home/nikos/public_html/data/apps/', '' )
>                  ^ bytestring      ^ Unicode string                      ^ Unicode string
> 
>     
>       I don't know if it has the same effect.
> Here is the whole snippet:
> 
> 
> =============================================
> # Give the path as a bytestring so that we'll get the filenames as bytestrings
> path = os.listdir( b'/home/nikos/public_html/data/apps/' )
> 
> # iterate over all filenames in the apps directory
> for fullpath in path
>       # Grabbing just the filename from path
>       filename = fullpath.replace( '/home/nikos/public_html/data/apps/', '' )
>       try: 
>               # Is this name encoded in utf-8? 
>               filename.decode('utf-8') 
>       except UnicodeDecodeError: 
>               # Decoding from UTF-8 failed, which means that the name is not valid utf-8
>
>               # It appears that this filename is encoded in greek-iso, so decode from that and re-encode to utf-8
>               new_filename = filename.decode('iso-8859-7').encode('utf-8')
>
>               # rename filename from greek bytestream -> utf-8 bytestream
>               old_path = os.path.join(root, filename) 
>               new_path = os.path.join(root, new_filename)
>               os.rename( old_path, new_path )
> 
> 
> #============================================================
> # Compute a set of current fullpaths 
> path = os.listdir( '/home/nikos/public_html/data/apps/' )
> 
> # Load'em
> for fullpath in path:
>       try:
>               # Check the presence of a file against the database and insert if it doesn't exist
>               cur.execute('''SELECT url FROM files WHERE url = %s''', (fullpath,) )
>               data = cur.fetchone()        #URL is unique, so should only be one
>
>               if not data:
>                       # First time for file; primary key is automatic, hit is defaulted
>                       cur.execute('''INSERT INTO files (url, host, lastvisit) VALUES (%s, %s, %s)''', (fullpath, host, lastvisit) )
>       except pymysql.ProgrammingError as e:
>               print( repr(e) )
> ==================================================================
> 
> The error is:
> [Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173]   File "files.py", line 64
> [Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173]     for fullpath in path
> [Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173]                         ^
> [Thu Jun 06 21:10:23 2013] [error] [client 79.103.41.173] SyntaxError: invalid syntax
> 
> 
> Doesn't os.listdir(...) return a list of all filenames?
>
> But then again, when the replacement takes place to shorten the full path to
> just the filename, I think it doesn't work because os.listdir was called with
> a bytestring and not with a string...
>
> What am I doing wrong?
> 
>     
>     You're changing things without checking what they do!

Ah yes, it returns filenames, not path/to/filenames.
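
To make that point concrete, here is a minimal sketch (not the code from the thread) of what os.listdir() hands back. The helper name list_names_and_paths and the throwaway temporary directory are assumptions made for illustration; only os.listdir(), os.path.join() and os.fsencode() are real library calls.

```python
import os
import tempfile

def list_names_and_paths(directory):
    """Return (names, full_paths) for the entries of `directory`.

    os.listdir() yields bare names of the same type as its argument
    (bytes in, bytes out; str in, str out). os.path.join() rebuilds
    the full path from the directory and each name.
    """
    names = sorted(os.listdir(directory))
    full_paths = [os.path.join(directory, name) for name in names]
    return names, full_paths

# Hypothetical demo directory, used instead of the real apps folder.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.txt"), "w").close()

    # str path in, str names out
    str_names, str_paths = list_names_and_paths(tmp)
    # bytes path in, bytes names out
    byte_names, byte_paths = list_names_and_paths(os.fsencode(tmp))
```

So there is nothing to strip with replace(): the names arrive without the directory prefix, and the prefix has to be joined back on whenever a full path is needed.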



#========================================================
# Give the path as a bytestring so that we'll get the filenames as bytestrings
path = os.listdir( b'/home/nikos/public_html/data/apps/' )

# iterate over all filenames in the apps directory
for filename in path:
        # Grabbing just the filename from path
        try: 
                # Is this name encoded in utf-8? 
                filename.decode('utf-8') 
        except UnicodeDecodeError: 
                # Decoding from UTF-8 failed, which means that the name is not valid utf-8

                # It appears that this filename is encoded in greek-iso, so decode from that and re-encode to utf-8
                new_filename = filename.decode('iso-8859-7').encode('utf-8')

                # rename filename from greek bytestream -> utf-8 bytestream
                old_path = os.path.join(root, filename) 
                new_path = os.path.join(root, new_filename)
                os.rename( old_path, new_path )
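
One thing to watch in the loop above: `root` came from os.walk() and is no longer defined once os.listdir() is used. A rough sketch of the same renaming step with the directory passed in explicitly might look like the following. The function names to_utf8_name and rename_to_utf8 are made up for illustration, and ISO-8859-7 as the fallback encoding is the assumption already made in the comments above.

```python
import os

def to_utf8_name(name, fallback_encoding="iso-8859-7"):
    """Return `name` (bytes) re-encoded as UTF-8 bytes.

    Names that already decode as UTF-8 are returned unchanged; anything
    else is assumed to be in `fallback_encoding`. A byte that is undefined
    in the fallback encoding still raises UnicodeDecodeError.
    """
    try:
        name.decode("utf-8")
        return name
    except UnicodeDecodeError:
        return name.decode(fallback_encoding).encode("utf-8")

def rename_to_utf8(directory):
    """Rename non-UTF-8 entries of `directory` (a bytes path) in place.

    Returns the list of (old_name, new_name) pairs that were renamed.
    """
    renamed = []
    for name in os.listdir(directory):
        new_name = to_utf8_name(name)
        if new_name != name:
            # Join with the directory, because listdir gives bare names.
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
            renamed.append((name, new_name))
    return renamed
```

It would be called with the bytes path, for example rename_to_utf8(b'/home/nikos/public_html/data/apps/'), so that both the directory and the names stay bytestrings throughout.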


#========================================================
# Compute a set of current fullpaths 
path = os.listdir( '/home/nikos/public_html/data/apps/' )

# Load'em
for filename in path:
        try:
                # Check the presence of a file against the database and insert if it doesn't exist
                cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )
                data = cur.fetchone()        #URL is unique, so should only be one

                if not data:
                        # First time for file; primary key is automatic, hit is defaulted
                        cur.execute('''INSERT INTO files (url, host, lastvisit) VALUES (%s, %s, %s)''', (filename, host, lastvisit) )
        except pymysql.ProgrammingError as e:
                print( repr(e) )


# Delete spurious 
cur.execute('''SELECT url FROM files''')
data = cur.fetchall()

for fullpath in data:
        if fullpath not in "What should be written here in place of ditched set"
                cur.execute('''DELETE FROM files WHERE url = %s''', (fullpath,) )

=============================

a) Is it correct that the first time I call os.listdir() with a bytestring
path to get the filenames as bytestrings, and the second time with a normal
string to get the filenames as Unicode strings?

b) My spurious-entry cleanup is broken now that I have ditched the fullpaths set.
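
For (b), one possible way to bring the set back is sketched below. It is only a sketch: find_spurious is a hypothetical helper, the sample rows and names are invented stand-ins, and it assumes the cursor returns each row as a 1-tuple for 'SELECT url FROM files', so the url is row[0] rather than the row itself.

```python
def find_spurious(db_rows, current_names):
    """Return the urls present in the database but missing on disk.

    `db_rows` stands for the result of cur.fetchall() after
    'SELECT url FROM files': a sequence of 1-tuples.
    `current_names` stands for the str names from os.listdir().
    """
    on_disk = set(current_names)
    return sorted(row[0] for row in db_rows if row[0] not in on_disk)

# Stand-in data; in the script these would come from cur.fetchall()
# and os.listdir() respectively.
rows = [("a.zip",), ("gone.zip",), ("b.zip",)]
names = ["a.zip", "b.zip", "new.zip"]
spurious = find_spurious(rows, names)
```

Each url in `spurious` could then be passed to the existing DELETE statement. Building the set from the second os.listdir() call keeps both sides of the comparison as Unicode strings.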
-- 
http://mail.python.org/mailman/listinfo/python-list
