Dan Guido wrote:
Hi Diez,

The source of the string literals is ConfigParser, so I can't just
mark them with an 'r'.

config = ConfigParser.RawConfigParser()
config.read(filename)
crazyfilepath = config.get(name, "ImagePath")
normalfilepath = normalize_path(crazyfilepath)

The ultimate origin of the strings is the _winreg function. Here I
also can't mark them with an 'r'.

regkey = OpenKey(HKEY_LOCAL_MACHINE,
                 "SYSTEM\\CurrentControlSet\\Services\\" + name)
crazyimagepath = QueryValueEx(regkey, "ImagePath")[0]
CloseKey(regkey)

--
Dan Guido



On Wed, Oct 21, 2009 at 2:34 PM, Diez B. Roggisch <de...@nospam.web.de> wrote:
Dan Guido wrote:

I'm trying to write a few methods that normalize Windows file paths.
I've gotten it to work in 99% of the cases, but it seems like my code
still chokes on '\x'. I've pasted my code below, can someone help me
figure out a better way to write this? This seems overly complicated
for such a simple problem...


import os
import re

# returns normalized filepath with arguments removed
def remove_arguments(filepath):
    #print "removing args from: " + filepath
    (head, tail) = os.path.split(filepath)
    pathext = os.environ['PATHEXT'].split(";")

    while(tail != ''):
        #print "trying: " + os.path.join(head,tail)

        # does it just work?
        if os.path.isfile(os.path.join(head, tail)):
            #print "it just worked"
            return os.path.join(head, tail)

        # try every extension
        for ext in pathext:
            if os.path.isfile(os.path.join(head, tail) + ext):
                return os.path.join(head, tail) + ext

        # remove the last word, try again
        tail = tail.split()[:-1]
        tail = " ".join(tail)

    return None

escape_dict={'\a':r'\a',
           '\b':r'\b',
           '\c':r'\c',
           '\f':r'\f',
           '\n':r'\n',
           '\r':r'\r',
           '\t':r'\t',
           '\v':r'\v',
           '\'':r'\'',
           #'\"':r'\"',
           '\0':r'\0',
           '\1':r'\1',
           '\2':r'\2',
           '\3':r'\3',
           '\4':r'\4',
           '\5':r'\5',
           '\6':r'\6',
           '\7':r'\a', #i have no idea
           '\8':r'\8',
           '\9':r'\9'}

def raw(text):
    """Returns a raw string representation of text"""
    new_string=''
    for char in text:
        try:
            new_string+=escape_dict[char]
            #print "escaped"
        except KeyError:
            new_string+=char
            #print "keyerror"
    #print new_string
    return new_string

# returns the normalized path to a file if it exists
# returns None if it doesn't exist
def normalize_path(path):
    #print "not normal: " + path

    # make sure it's not blank
    if(path == ""):
        return None

    # get rid of mistakenly escaped bytes
    path = raw(path)
    #print "step1: " + path

    # remove quotes
    path = path.replace('"', '')
    #print "step2: " + path

    #convert to lowercase
    lower = path.lower()
    #print "step3: " + lower

    # expand all the normally formed environ variables
    expanded = os.path.expandvars(lower)
    #print "step4: " + expanded

    # chop off \??\
    if expanded[:4] == "\\??\\":
        expanded = expanded[4:]
    #print "step5: " + expanded

    # strip a leading '\'
    if expanded[:1] == "\\":
        expanded = expanded[1:]
    #print "step7: " + expanded

    systemroot = os.environ['SYSTEMROOT']

    # sometimes systemroot won't have %
    r = re.compile('systemroot', re.IGNORECASE)
    expanded = r.sub(systemroot, expanded)
    #print "step8: " + expanded

    # prepend the %systemroot% if it's missing
    if expanded[:8] in ("system32", "syswow64"):
        expanded = os.path.join(systemroot, expanded)
    #print "step9: " + expanded

    stripped = remove_arguments(expanded.lower())

    # just in case you're running as LUA
    # this is a race condition but you can suck it
    if(stripped):
        if os.access(stripped, os.R_OK):
            return stripped

    return None

def test_normalize():
    test1 = "\??\C:\WINDOWS\system32\Drivers\CVPNDRVA.sys"
    test2 = "C:\WINDOWS\system32\msdtc.exe"
    test3 = "%SystemRoot%\system32\svchost.exe -k netsvcs"
    test4 = "\SystemRoot\System32\drivers\vga.sys"
    test5 = "system32\DRIVERS\compbatt.sys"
    test6 = "C:\Program Files\ABC\DEC Windows Services\Client Services.exe"
    test7 = "c:\Program Files\Common Files\Symantec Shared\SNDSrvc.exe"
    test8 = "C:\WINDOWS\system32\svchost -k dcomlaunch"
    test9 = ""
    test10 = "SysWow64\drivers\AsIO.sys"
    test11 = "\SystemRoot\system32\DRIVERS\amdsbs.sys"
    test12 = "C:\windows\system32\xeuwhatever.sys" #this breaks everything

If I'm getting this right, what you try to do is to convert characters that
come from string-literal escape-codes to their literal representation. Why?

A simple

 test12 = r"C:\windows\system32\xeuwhatever.sys"

is all you need - note the leading r. Then

 test12[2] == "\\"  # escape needed on the right because of the rule about
                    # backslashes at the end of raw string literals

holds.
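
A minimal sketch of that point, written in Python 3 syntax (the name `escaped` is only there for illustration). It shows that the raw literal keeps the backslash before `x`, and that it equals the ordinary literal with every backslash doubled:

```python
# Raw literal: backslashes are kept exactly as typed, including the one before "x".
test12 = r"C:\windows\system32\xeuwhatever.sys"

# The third character is a single backslash. The right-hand side is spelled "\\"
# because a raw literal cannot end in an odd number of backslashes.
assert test12[2] == "\\"
assert len("\\") == 1

# The same path as an ordinary literal with every backslash doubled.
escaped = "C:\\windows\\system32\\xeuwhatever.sys"
assert test12 == escaped
```

Without the leading r and without doubling, the literal is rejected outright, since `\x` must be followed by two hex digits.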

Diez
--
http://mail.python.org/mailman/listinfo/python-list


Your first problem is that you're mixing tabs and spaces in your source code. That is dangerous and confusing, not to mention an error in Python 3.x.

The second problem is that your test_normalize() is called with a bunch of invalid literals. Backslashes in quoted literals need to be escaped, or you need to use the raw form of the literal. Now this may have nothing to do with the data you get from ConfigParser or QueryValueEx(), but it sure makes testing confusing.
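
To illustrate how a non-raw literal can silently change a path, here is a small sketch. The path is hypothetical, chosen so that both backslash sequences happen to be valid escapes:

```python
# Hypothetical path: "\t" and "\n" are both recognized escape sequences.
plain = "C:\temp\new"    # becomes C: <tab> emp <newline> ew
raw = r"C:\temp\new"     # backslashes kept as typed

# The plain literal has lost both backslashes and is two characters shorter.
print(len(plain), len(raw))
print("\\" in plain, "\\" in raw)
```

Test data written this way no longer resembles what the registry or a config file would hand back, which is why the literals in the tests should be raw or fully escaped.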

The third problem is your raw() function. It seems like you're trying to somehow build a version of the string that would pass muster as a literal string. Unless you're trying to generate Python source code, I can't see where this can possibly help. Perhaps you're just trying to compensate for the second problem? If the actual strings are coming from the registry, you won't need any of this complexity.
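
As a rough check that data read at run time needs no such treatment, here is a sketch using the Python 3 spelling of the module (`configparser`; it is `ConfigParser` in 2.x). The section name `SomeService` and the file content are made up. The sample is a raw literal only because it is embedded in source code here; text read from a real file goes through no escape processing at all.

```python
import configparser  # named ConfigParser in Python 2

# Made-up config content standing in for a real file on disk.
sample = r"""
[SomeService]
ImagePath = C:\windows\system32\xeuwhatever.sys
"""

config = configparser.RawConfigParser()
config.read_string(sample)
value = config.get("SomeService", "ImagePath")

# The value comes back with its backslashes intact, "\x" included.
print(repr(value))
```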

I don't see what your original problem is. Is it to take a registry entry that contains both filepath and some other data, and separate out just the filepath portion?
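
If that is the goal, one possible sketch follows. It is not your remove_arguments(): it splits the whole command on spaces instead of only the tail, and it takes the existence check as a parameter so it can be tried without a Windows file system. The names `strip_arguments`, `exists`, `extensions` and `known_files` are all made up for this example.

```python
def strip_arguments(command, exists, extensions=(".exe", ".sys")):
    """Return the longest leading part of command that exists, or None.

    exists is any callable taking a path and returning True or False;
    os.path.isfile would be the natural choice on a real system.
    """
    words = command.split(" ")
    while words:
        candidate = " ".join(words)
        if exists(candidate):
            return candidate
        # Also try the candidate with each known extension appended.
        for ext in extensions:
            if exists(candidate + ext):
                return candidate + ext
        words.pop()  # drop the last word and try again
    return None


# Stand-in for the file system so the sketch runs anywhere.
known_files = {
    r"C:\WINDOWS\system32\svchost.exe",
    r"C:\Program Files\ABC\DEC Windows Services\Client Services.exe",
}

print(strip_arguments(r"C:\WINDOWS\system32\svchost -k dcomlaunch",
                      known_files.__contains__))
```

Trying the longest candidate first keeps paths with embedded spaces, such as the Client Services.exe one, from being cut short.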

Maybe it'd be best if you could show us your config file, or at least the ImagePath portion of it (with some context). Then let's look at the actual value of
crazyfilepath:

print crazyfilepath
print repr(crazyfilepath)
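
The reason for printing both forms: repr() makes invisible characters visible. A small sketch in Python 3 syntax, with a made-up value that has a tab where a backslash and a "t" were presumably intended:

```python
# Made-up value: the path contains a real tab character after "windows".
crazyfilepath = "C:\\windows\tsystem32"

print(crazyfilepath)        # the tab shows up only as whitespace
print(repr(crazyfilepath))  # the tab shows up as the two characters \t

shown = repr(crazyfilepath)
```

If the repr() output of the real value shows doubled backslashes and no \t, \n or \x.. sequences, the string is fine and the problem is elsewhere.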


Or you could tell us what registry entry is giving you grief.  And maybe 
somebody could see what to do about it.

DaveA




