oooooooooooo ooooooooooooo wrote:
>> I don't think you've explained the constraint that would make you use
>> mysql or not.
>
> My original idea was using just the hash as the filename; that way I could
> have direct access. But the customer rejected this and requested that part
> of the long file name (from 11 to 1023 characters) be kept. As Linux only
> allows 255 characters in a single filename and I could get duplicates
> within the first 255 characters, I trim the real filename to around 200
> characters and add the hash at the end (plus a couple of small metadata
> fields).
>
> Yes, these requirements do not make much sense, but I've tried to convince
> the customer to use just the hash, with no luck (he does not seem to
> understand what a hash is, although I've tried to explain it several
> times).
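The naming scheme quoted above (truncated original name plus an appended hash) could be sketched roughly like this; the 200-character cutoff is from the post, while the choice of SHA-1 and the separator are assumptions for illustration:

```python
import hashlib

MAX_NAME = 200  # truncation length from the post; keeps the full
                # name safely under the 255-byte NAME_MAX limit

def stored_name(original_name: str) -> str:
    """Build the on-disk name: truncated original name plus its hash.

    The truncated prefix keeps the name human-readable; the appended
    digest keeps names unique even when two long filenames share the
    same first 200 characters. SHA-1 is used here only because its
    40-character hex digest fits comfortably in the remaining space;
    the original post does not say which hash is used, and the real
    layout also appends a couple of small metadata fields.
    """
    digest = hashlib.sha1(original_name.encode("utf-8")).hexdigest()
    return f"{original_name[:MAX_NAME]}.{digest}"
```

With this, two 300-character names that agree on their first 200 characters still map to distinct on-disk names, because the digests of the full names differ.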
You mentioned that the data can be retrieved from somewhere else. Is some
part of this filename a unique key? Do you have to track this relationship
anyway - or age/expire content? I'd try to arrange things so the most likely
scenario takes the fewest operations. Perhaps a mix of hash+filename would
give direct access 99+% of the time, and you could move all copies of
collisions to a different area. Then you could keep the database mapping the
full name to the hashed path, but you'd only have to consult it when the
open() attempt fails.

> That's why I need to either a) use mysql or b) do a directory listing.
>
>> 00/AA/FF/filename
>
> That would make up to 256^3 directory leaves, which is more than 16
> million. Since I only have around 15M files, I think that is an excessive
> number of directories.

I guess that's why squid only uses 16 x 256...

-- 
Les Mikesell
  lesmikes...@gmail.com
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
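The squid-style 16 x 256 layout and the "open() first, database only on a miss" idea could be sketched as follows. This is a minimal illustration, not anyone's actual implementation: the storage root, the use of SHA-1, and the lookup_in_db callback are all assumptions.

```python
import hashlib
import os

ROOT = "/var/store"  # hypothetical storage root

def shard_path(name: str) -> str:
    """Two-level, squid-style layout: 16 top dirs x 256 subdirs.

    16 * 256 = 4096 leaf directories, so ~15M files works out to
    roughly 3,700 files per leaf, instead of the ~16.7M directories
    a full 256^3 split would create.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    top = digest[0]    # 16 possibilities: '0'-'f'
    sub = digest[1:3]  # 256 possibilities: '00'-'ff'
    return os.path.join(ROOT, top, sub, name)

def fetch(name, lookup_in_db):
    """Try the direct open() first; consult the database only on a miss.

    lookup_in_db is a placeholder for the full-name -> hashed-path
    mapping the post suggests keeping in the database for the rare
    collision/relocation cases.
    """
    try:
        with open(shard_path(name), "rb") as f:
            return f.read()
    except FileNotFoundError:
        # Collision or relocated file: fall back to the DB mapping.
        return lookup_in_db(name)
```

The point of this arrangement is that the common case (99+% of accesses) costs one path computation and one open(), and the database is only touched when the direct attempt fails.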