Jorgen Grahn wrote:
> On Mon, 24 Nov 2008 00:44:45 -0500, r0g <[EMAIL PROTECTED]> wrote:
>> Hi there,
>>
>> I'm trying to validate some user input which is for the most part simple
>> regexery however I would like to check filenames and I would like this
>> code to be multiplatform.
>>
>> I had hoped the os module would have a function that would tell me if a
>> proposed filename would be valid on the host system but it seems not. I
>> have considered whitelisting but it seems a bit unfair to make the rest
>> of the world suffer the naming restrictions of windows. Moreover it
>> seems both inelegant and hard work to research the valid file/directory
>> naming conventions of every platform that this app could conceivably run
>> on and write regex's for all of them so...
>>
>> I'm tempted to go the witch dunking route, stick it in an open() between
>> a Try: & Except: and see if it floats. However...
>>
>> Although it's a desktop (not internet facing) app I'm a little squeamish
>> piping raw user input into a filesystem function like that and this app
>> will be dealing with some particularly sensitive data so I want to be
>> careful and minimize exposure where practical.
>
> Take the Unix 'ls' command (or MS-DOS 'dir'). That's two programs
> which let users pipe raw input into the filesystem functions, and they
> certainly have handled some very sensitive data over the years.
>
>> Has programming PHP and Web stuff for years made me overly paranoid
>> about this [...]
>
> Yes. ;-)
>
> Please explain one thing: what are you looking for? It's not
> "accesses a file outside the user's home directory", "accesses an
> infinite file like /dev/zero" or something like that, or you would
> have said so. Nor seems the "user" input come from some other user
> than the one your program is running as, nor from some input source
> which the user cannot be held responsible for.
>
> Seems to me you simply want to know beforehand that the reading will
> work. But you can never check that! You can stat(2) the file, or
> open-and-close it -- and then a microsecond later, someone deletes the
> file, or replaces it with another one, or write-protects it, or mounts
> a file system on top of its directory, or drops a nuke over the city,
> or ...
>
> Two more notes:
>
> - os.open is not like os.system. If os.open ends up doing
>   anything other than trying to open the file corresponding to the
>   string you feed it, it's Python's fault, not yours.
>
>   Compare with a language (does Perl allow this?) where if the string
>   is "rm -rf /|", open will run "rm -rf /" and start reading its output.
>   *That* interface would have been
>
> - if the OS ends up doing something different when calling open(2) or
>   creat(2) or whatever using that string, it's the OSes fault, not
>   yours.
>
> Or am I missing something?
>
> /Jorgen
>
No Jorgen, that's exactly what I needed to know, i.e. that sending
unfiltered text to open() is not negligent or likely to allow any
badness to occur.

As far as what I was looking for: I was not looking for anything in
particular, as I couldn't think of any specific cases where this could
be a problem. However, my background is websites (where input
sanitization is rule number one), and some of the web exploits I've
learned to mitigate over the years aren't ones I would necessarily have
figured out for myself, e.g. CSRF. So I thought I'd ask you guys in case
there's anything I haven't considered that I should consider!
Thankfully it seems I don't have too much to worry about :-)

The only situation where I can foresee potential for mischief is if the
program, or part thereof, is running as a more privileged user than the
user it is accepting input from. Thankfully I don't think that will be
necessary in the prog I'm working on right now, as I don't need packet
capture / low numbered ports etc.

Thanks for your answer and thanks to everybody else for all their
comments too.

Roger.

--
http://mail.python.org/mailman/listinfo/python-list
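
For the archives, here is a minimal sketch of the "stick it in an open() and see if it floats" approach discussed in this thread. It is only an illustration under a few assumptions: it uses Python 3's built-in open() rather than os.open, the function name try_create is hypothetical, and the "x" (exclusive create) mode is my own choice so that probing a name never overwrites an existing file. As noted above, a successful result only tells you the call worked at that moment, not that later accesses will.

```python
def try_create(path):
    """Try to create `path`; return (ok, reason).

    Hypothetical helper: lets the OS decide whether the name is
    acceptable instead of guessing per-platform naming rules.
    """
    try:
        # "x" = exclusive creation: fails if the file already exists,
        # so probing a name cannot clobber existing data.
        with open(path, "x"):
            pass
    except (OSError, ValueError) as exc:
        # OSError covers invalid names, missing directories, permission
        # problems and existing files; ValueError covers embedded NULs.
        return False, str(exc)
    return True, ""
```

Usage is along the lines of `ok, why = try_create(user_supplied_name)`; if `ok` is false, `why` holds the platform's own error message, which can be shown to the user as-is. Note that a successful call leaves an empty file behind.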