Min Yuan wrote:
>>>> We have a directory on Redhat 6.2 with 500,000
>>>> files. In our code we open and read the directory,
>>>> and for each entry we use lstat() to check for
>>>> some information. The whole scan takes more than
>>>> eight hours, which is terribly long.
>>>>
>>>> Is there any way we could reduce this length of
>>>> time? If the answer is NO, are there any official
>>>> documents about it, and where can we find them?
>>>
>>> Yes. Stop putting so many files into a single
>>> directory.
>>
>> Besides this, is there no other way? That solution
>> won't work for us: we are developing an application
>> that needs to handle large directories on client
>> sites, and these 500,000 files have to be put in a
>> single directory.
>
>Directory search time becomes linear in the number
>of entries once the size exceeds the directory name
>cache capacity. Repeated directory searches then
>become quadratic in the number of entries, and
>500,000^2 isn't a small number ...
Min,
Julie's right.
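To put a number on Julie's point: each lstat() has to look the
name up in that directory again, and on ext2 each lookup is a
linear scan once the name cache can't hold all the entries, so
a full pass costs on the order of 500,000^2 = 2.5 * 10^11
comparisons. I'm guessing your scan looks roughly like this (a
sketch of the pattern you described, not your actual code; the
field checks are placeholders):

#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    const char *dirpath = argc > 1 ? argv[1] : ".";
    char path[4096];
    struct dirent *ent;
    struct stat st;
    DIR *dir = opendir(dirpath);

    if (!dir) {
        perror("opendir");
        return 1;
    }
    while ((ent = readdir(dir)) != NULL) {
        snprintf(path, sizeof(path), "%s/%s", dirpath, ent->d_name);
        /* Each lstat() forces a name lookup in the huge directory;
           that lookup is what goes linear, and hence the whole
           pass quadratic. */
        if (lstat(path, &st) == -1)
            continue;
        /* ... inspect st.st_size, st.st_mtime, etc. ... */
    }
    closedir(dir);
    return 0;
}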
But it should be easy to change your program to put files
whose names start with 'a' in a directory 'a', files starting
with 'b' in a directory 'b', and so on. That would help a lot,
and it'll help even more if you can do two levels. It's really
the smartest way to deal with the problem.
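If it helps, here is a minimal sketch of the one-level scheme
(bucket_file and the example file name are made up for
illustration; error handling is minimal):

#include <stdio.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Move a file into a subdirectory named after its first
   character, e.g. "foo.dat" -> "f/foo.dat". */
static int bucket_file(const char *name)
{
    char sub[2] = { name[0], '\0' };
    char newpath[4096];

    if (mkdir(sub, 0755) == -1 && errno != EEXIST)
        return -1;                  /* couldn't create the bucket */
    snprintf(newpath, sizeof(newpath), "%s/%s", sub, name);
    return rename(name, newpath);   /* same filesystem, so cheap */
}

int main(void)
{
    return bucket_file("foo.dat") == 0 ? 0 : 1;
}

With 500,000 files spread over, say, 36 one-character buckets
you are down to roughly 14,000 entries per directory; two levels
(36^2 = 1296 buckets) gets you to a few hundred, which the name
cache handles easily.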
If you *really* can't do this, see
http://marc.theaimsgroup.com/?l=linux-kernel&m=98681307416575&w=2
People there are working on a patch that may interest you.
- Dan