Re: Making safe file names

2013-05-28 Thread Grant Edwards
On 2013-05-28, Albert van der Horst wrote: >> There's also the Windows device name hole. There may be trouble with >> artists named 'COM4', 'CLOCK$', 'Con', or similar. >> >>http://support.microsoft.com/kb/74496 > > That applies to MS-DOS names. God forbid that this still holds on > more modern M

Re: Making safe file names

2013-05-28 Thread Chris Angelico
On Tue, May 28, 2013 at 11:44 PM, Albert van der Horst wrote: > In article , > Neil Hodgson wrote: >>There's also the Windows device name hole. There may be trouble with >>artists named 'COM4', 'CLOCK$', 'Con', or similar. >> >>http://support.microsoft.com/kb/74496 > > That applies to MS-DOS

Re: Making safe file names

2013-05-28 Thread Albert van der Horst
In article , Neil Hodgson wrote: >Andrew Berg: > >> This is not a Unicode issue since (modern) file systems will happily >accept it. The issue is that certain characters (which are ASCII) are >> not allowed on some file systems: >> \ / : * ? " < > | @ and the NUL character >> The first 9 are n

Re: Making safe file names

2013-05-11 Thread Chris Angelico
On Thu, May 9, 2013 at 1:08 PM, Steven D'Aprano wrote: > I suspect that the only way to be completely ungoogleable would be to > name yourself something common, not something obscure. Say, if you called > yourself "Hard Rock Band", and did hard rock. But then, googling for > "Heavy Metal" alone br

Re: Making safe file names

2013-05-11 Thread Andrew Berg
On 2013.05.08 18:37, Dennis Lee Bieber wrote: > And now you've seen why music players don't show the user the > physical file name, but maintain a database mapping the internal data > (name, artist, track#, album, etc.) to whatever mangled name was needed > to satisfy the file system. Tags ar

Re: Making safe file names

2013-05-09 Thread Tim Chase
On 2013-05-10 12:04, Gregory Ewing wrote: > Roy Smith wrote: > > http://en.wikipedia.org/wiki/The_band > > Nope... googling for "the band" brings that up as the > very first result. > > The Google knows all. You cannot escape The Google... That does it. I'm naming my band "Google". :-) -tkc

Re: Making safe file names

2013-05-09 Thread Gregory Ewing
Roy Smith wrote: > In article <518b133b$0$29997$c3e8da3$54964...@news.astraweb.com>, > Steven D'Aprano wrote: >> I suspect that the only way to be completely ungoogleable would be to >> name yourself something common, not something obscure. > http://en.wikipedia.org/wiki/The_band Nope... googling for

Re: Making safe file names

2013-05-09 Thread Roy Smith
In article <518b133b$0$29997$c3e8da3$54964...@news.astraweb.com>, Steven D'Aprano wrote: > I suspect that the only way to be completely ungoogleable would be to > name yourself something common, not something obscure. http://en.wikipedia.org/wiki/The_band -- http://mail.python.org/mailman/lis

Re: Making safe file names

2013-05-08 Thread Steven D'Aprano
On Wed, 08 May 2013 21:11:28 -0500, Andrew Berg wrote: > It's a thing (especially in witch house) to make names with odd glyphs > in order to be harder to find and be more "underground". Very silly. Try > doing searches for these artists with names like these: Challenge accepted. > http://www.la

Re: Making safe file names

2013-05-08 Thread Andrew Berg
On 2013.05.08 19:16, Roy Smith wrote: > Yup. At Songza, we deal with this crap every day. It usually bites us > the worst when trying to do keyword searches. When somebody types in > "Blue Oyster Cult", they really mean "Blue Oyster Cult", and our search > results need to reflect that. Likew

Re: Making safe file names

2013-05-08 Thread Roy Smith
In article <518b00a2$0$29997$c3e8da3$54964...@news.astraweb.com>, Steven D'Aprano wrote: > > When somebody types in > > "Blue Oyster Cult", they really mean "Blue Oyster Cult", > > Surely they really mean Blue Öyster Cult. Yes. The oomlaut was there when I typed it. Who knows what happened

Re: Making safe file names

2013-05-08 Thread Steven D'Aprano
On Wed, 08 May 2013 20:16:25 -0400, Roy Smith wrote: > Yup. At Songza, we deal with this crap every day. It usually bites us > the worst when trying to do keyword searches. When somebody types in > "Blue Oyster Cult", they really mean "Blue Oyster Cult", Surely they really mean Blue Öyster Cu

Re: Making safe file names

2013-05-08 Thread Chris Angelico
On Thu, May 9, 2013 at 10:16 AM, Roy Smith wrote: > Pro-tip, guys. If you want to form a band, and expect people to be able > to find your stuff in a search engine some day, don't play cute with > your name. It's the modern equivalent of names like Catherine Withekay. ChrisA

Re: Making safe file names

2013-05-08 Thread Roy Smith
In article , Dennis Lee Bieber wrote: > On Tue, 07 May 2013 18:10:25 -0500, Andrew Berg > declaimed the following in > gmane.comp.python.general: > > > None of these would work because I would have no idea which file stores > > data for which artist without writing code to figure it out. If I

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 22:40, Steven D'Aprano wrote: > There aren't any characters outside of UTF-8 :-) UTF-8 covers the entire > Unicode range, unlike other encodings like Latin-1 or ASCII. You are correct. I'm not sure what I was thinking. >> I don't understand. I have no intention of changing Unicode c

Re: Making safe file names

2013-05-07 Thread Steven D'Aprano
On Wed, 08 May 2013 00:13:20 -0400, Dave Angel wrote: > On 05/07/2013 11:40 PM, Steven D'Aprano wrote: >> >> >> >> These are all Unicode characters too. Unicode is a subset of ASCII, so >> anything which is ASCII is also Unicode. >> >> >> > Typo. You meant Unicode is a superset of ASCII. Da

Re: Making safe file names

2013-05-07 Thread Dave Angel
On 05/07/2013 11:40 PM, Steven D'Aprano wrote: > These are all Unicode characters too. Unicode is a subset of ASCII, so > anything which is ASCII is also Unicode. Typo. You meant Unicode is a superset of ASCII. -- DaveA

Re: Making safe file names

2013-05-07 Thread Dave Angel
On 05/07/2013 10:06 PM, Andrew Berg wrote: On 2013.05.07 20:28, Neil Hodgson wrote: http://support.microsoft.com/kb/74496 http://en.wikipedia.org/wiki/Nul_%28band%29 I can indeed confirm that at least 'nul' cannot be used as a filename. However, I add an extension to the file names to identify

Re: Making safe file names

2013-05-07 Thread Steven D'Aprano
On Tue, 07 May 2013 19:51:24 -0500, Andrew Berg wrote: > On 2013.05.07 19:14, Dave Angel wrote: >> You also need to decide how to handle Unicode characters, since they're >> different for different OS. In Windows on NTFS, filenames are in >> Unicode, while on Unix, filenames are bytes. So on one

Re: Making safe file names

2013-05-07 Thread Roy Smith
In article , Dave Angel wrote: > While we're looking for trouble, there's also case insensitivity. > Unclear if the user cares, but tom and TOM are the same file in most > configurations of NT. OSX, too.
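
A minimal sketch of how one might spot artist names that would land on the same file on a case-insensitive file system. It assumes Python 3.3+ and uses str.casefold() as an approximation; the real folding rules of NTFS or HFS+ differ in detail, and find_case_collisions is a hypothetical helper name, not something from the thread.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group names that compare equal once case is ignored."""
    groups = defaultdict(list)
    for name in names:
        # casefold() is a more aggressive lower() intended for caseless matching
        groups[name.casefold()].append(name)
    # Only groups with more than one member are actual collisions
    return [group for group in groups.values() if len(group) > 1]


print(find_case_collisions(["tom", "TOM", "Tim"]))
```

Running a check like this before writing the cache files would at least flag the problem names, even if it cannot predict every file system's behaviour.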

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 20:13, Dave Angel wrote: > So you're comfortable typing arbitrary characters? what about all the > characters that have identical displays in your font? Identification is more important than typing. I can copy and paste into a terminal if necessary. I don't foresee typing out one o

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 20:45, Dave Angel wrote: > While we're looking for trouble, there's also case insensitivity. > Unclear if the user cares, but tom and TOM are the same file in most > configurations of NT. Artist names on Last.fm cannot differ only in case. This does remind me to make sure to update

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 20:28, Neil Hodgson wrote: > http://support.microsoft.com/kb/74496 > http://en.wikipedia.org/wiki/Nul_%28band%29 I can indeed confirm that at least 'nul' cannot be used as a filename. However, I add an extension to the file names to identify them as caches. -- CPython 3.3.1 | Windo
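
A rough sketch of a check for the Windows device names mentioned in this thread. The set below is an assumption assembled from the names quoted here (CON, NUL, COM4, CLOCK$) plus the usual PRN, AUX, COM1-9 and LPT1-9; it is not claimed to be complete. Microsoft's naming guidance also advises against a reserved name followed by an extension, so the sketch looks only at the part before the first dot. is_reserved_device_name is a hypothetical helper name.

```python
# Assumed list of reserved device names; check the Microsoft docs before relying on it
RESERVED_DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL", "CLOCK$"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def is_reserved_device_name(filename):
    """Return True if the part before the first dot is a reserved name."""
    stem = filename.split(".", 1)[0]
    # Device names are matched without regard to case
    return stem.strip().upper() in RESERVED_DEVICE_NAMES


for candidate in ["nul", "Con.cache", "COM4", "Console"]:
    print(candidate, is_reserved_device_name(candidate))
```

A cache writer could prefix or escape any name for which this returns True rather than relying on the extension alone.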

Re: Making safe file names

2013-05-07 Thread Dave Angel
On 05/07/2013 09:28 PM, Neil Hodgson wrote: Andrew Berg: This is not a Unicode issue since (modern) file systems will happily accept it. The issue is that certain characters (which are ASCII) are not allowed on some file systems: \ / : * ? " < > | @ and the NUL character The first 9 are not

Re: Making safe file names

2013-05-07 Thread Neil Hodgson
Andrew Berg: This is not a Unicode issue since (modern) file systems will happily accept it. The issue is that certain characters (which are ASCII) are not allowed on some file systems: \ / : * ? " < > | @ and the NUL character The first 9 are not allowed on NTFS, the @ is not allowed on ext
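
As an illustration only, here is a small sketch that reports which characters of a name fall in the forbidden set quoted above. The set is taken from the nine NTFS characters listed in the message plus NUL; whether it is right for any particular file system is an assumption, and forbidden_chars is a hypothetical helper name.

```python
# The nine characters the message lists as not allowed on NTFS
NTFS_FORBIDDEN = set('\\/:*?"<>|')


def forbidden_chars(name, forbidden=frozenset(NTFS_FORBIDDEN | {"\0"})):
    """Return the sorted list of characters in name that are in the forbidden set."""
    return sorted({ch for ch in name if ch in forbidden})


print(forbidden_chars("C/A/T"))
print(forbidden_chars("Blue Öyster Cult"))
```

Note that the non-ASCII "Ö" is not flagged, which matches the point that this is not a Unicode issue.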

Re: Making safe file names

2013-05-07 Thread Dave Angel
On 05/07/2013 08:51 PM, Andrew Berg wrote: On 2013.05.07 19:14, Dave Angel wrote: You also need to decide how to handle Unicode characters, since they're different for different OS. In Windows on NTFS, filenames are in Unicode, while on Unix, filenames are bytes. So on one of those, you will b

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 19:14, Dave Angel wrote: > You also need to decide how to handle Unicode characters, since they're > different for different OS. In Windows on NTFS, filenames are in > Unicode, while on Unix, filenames are bytes. So on one of those, you > will be encoding/decoding if your code is

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 17:37, Jens Thoms Toerring wrote: > You > could e.g. replace all characters not allowed by the file > system by their hexadecimal (ASCII) values, preceded by a > '%' (so '/' would be changed to '%2F', and also encode a '%' > itself in a name by '%25'). Then you have a well-defined >
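
A minimal sketch of the percent-escaping scheme described in the quoted suggestion, assuming Python 3. The FORBIDDEN set is an assumption (the NTFS characters listed elsewhere in this thread, plus NUL and '%' itself), and encode_name / decode_name are hypothetical helper names. Only the forbidden characters are escaped, so names stay mostly readable.

```python
import re

# Assumed set of characters to escape; '%' must be included so decoding is unambiguous
FORBIDDEN = '\\/:*?"<>|%\0'


def encode_name(artist):
    """Replace each forbidden character with '%' plus its two-digit hex code."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in FORBIDDEN else ch for ch in artist
    )


def decode_name(filename):
    """Reverse encode_name by turning every %XX sequence back into its character."""
    return re.sub(
        r"%([0-9A-F]{2})", lambda m: chr(int(m.group(1), 16)), filename
    )


print(encode_name("C/A/T"))
print(decode_name(encode_name("C/A/T")))
```

Because every literal '%' is itself escaped as '%25', any '%' in an encoded name is guaranteed to start a two-digit escape, which is what makes the mapping reversible.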

Re: Making safe file names

2013-05-07 Thread Roy Smith
In article , Dave Angel wrote: > On 05/07/2013 03:58 PM, Andrew Berg wrote: > > Currently, I keep Last.fm artist data caches to avoid unnecessary API calls > > and have been naming the files using the artist name. However, > > artist names can have characters that are not allowed in file names

Re: Making safe file names

2013-05-07 Thread Dave Angel
On 05/07/2013 03:58 PM, Andrew Berg wrote: Currently, I keep Last.fm artist data caches to avoid unnecessary API calls and have been naming the files using the artist name. However, artist names can have characters that are not allowed in file names for most file systems (e.g., C/A/T has forwar

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 17:01, Terry Jan Reedy wrote: > Sounds like you want something like the html escape or urlencode > functions, which serve the same purpose of encoding special chars. > Rather than invent a new transformation, you could use the same scheme > used for html entities. (Sorry, I forget t

Re: Making safe file names

2013-05-07 Thread Andrew Berg
On 2013.05.07 17:18, Fábio Santos wrote: > I suggest Base64. b64encode > (http://docs.python.org/2/library/base64.html#base64.b64encode) and > b64decode take an argument which allows you to eliminate the pesky "/" > character. It's reversible and simple. > > More suggestions: how about a hash? Or

Re: Making safe file names

2013-05-07 Thread Chris Angelico
On Wed, May 8, 2013 at 8:18 AM, Fábio Santos wrote: > I suggest Base64. b64encode > (http://docs.python.org/2/library/base64.html#base64.b64encode) and > b64decode take an argument which allows you to eliminate the pesky "/" > character. It's reversible and simple. But it doesn't look anything li

Re: Making safe file names

2013-05-07 Thread Jens Thoms Toerring
Andrew Berg wrote: > Currently, I keep Last.fm artist data caches to avoid unnecessary API calls > and have been naming the files using the artist name. However, artist names > can have characters that are not allowed in file names for most file systems > (e.g., C/A/T has forward slashes). Are the

Re: Making safe file names

2013-05-07 Thread Dan Stromberg
On 5/7/13, Andrew Berg wrote: > Currently, I keep Last.fm artist data caches to avoid unnecessary API calls > and have been naming the files using the artist name. However, > artist names can have characters that are not allowed in file names for most > file systems (e.g., C/A/T has forward slashe

Re: Making safe file names

2013-05-07 Thread MRAB
On 07/05/2013 20:58, Andrew Berg wrote: Currently, I keep Last.fm artist data caches to avoid unnecessary API calls and have been naming the files using the artist name. However, artist names can have characters that are not allowed in file names for most file systems (e.g., C/A/T has forward s

Re: Making safe file names

2013-05-07 Thread Fábio Santos
I suggest Base64. b64encode (http://docs.python.org/2/library/base64.html#base64.b64encode) and b64decode take an argument which allows you to eliminate the pesky "/" character. It's reversible and simple. More suggestions: how about a hash? Or just use IDs from the database? On Tue, May 7, 2013
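
A short sketch of the Base64 suggestion, assuming Python 3 and UTF-8 for the artist name. The altchars argument swaps '+' and '/' for '-' and '_', which is the argument referred to above; to_filename and from_filename are hypothetical helper names.

```python
import base64


def to_filename(artist):
    """Encode an artist name as Base64 using '-' and '_' instead of '+' and '/'."""
    data = artist.encode("utf-8")
    return base64.b64encode(data, altchars=b"-_").decode("ascii")


def from_filename(name):
    """Recover the artist name from a file name produced by to_filename."""
    data = base64.b64decode(name.encode("ascii"), altchars=b"-_")
    return data.decode("utf-8")


encoded = to_filename("C/A/T")
print(encoded, from_filename(encoded))
```

One caveat worth keeping in mind: Base64 output is case-sensitive, so on a case-insensitive file system two different artists could in principle map to names that differ only in case.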

Re: Making safe file names

2013-05-07 Thread Terry Jan Reedy
On 5/7/2013 3:58 PM, Andrew Berg wrote: Currently, I keep Last.fm artist data caches to avoid unnecessary API calls and have been naming the files using the artist name. However, artist names can have characters that are not allowed in file names for most file systems (e.g., C/A/T has forward s