On Tue, 9 Nov 2004, Dominique Devienne <[EMAIL PROTECTED]> wrote:
>> From: Stefan Bodewig [mailto:[EMAIL PROTECTED]
>> > I'm sure I'm probably not understanding your change correctly, so
>> > I guess I need an explanation of how your change works.
>> 
>> Basically, if I want to be able to keep some stuff in the target
>> directory, I'll need a way to exclude them from the purge process
>> and thus using the full power of DirectoryScanner seemed to be the
>> natural choice.
> 
> So you plan on adding a patternset or an implicit/explicit fileset
> for things that should not be touched?

Initially I planned a stripped down subclass of AbstractFileSet as a
nested element of <sync> for target files to delete.  Default would be
to delete everything, but you could protect stuff using <exclude> or
selectors.

> I myself thought about this issue, and thought that the <zip>
> mechanism of being able to specify a mapper or destination directory
> (kind of like zipfileset's prefix) for each input (source) fileset
> would be another way to look at the same problem.

No, I don't think it would address all use cases.  Take generated
.class and .java files from JSPs: you want to copy over new JSPs but
don't want to delete the generated files at all.  Another case
reported on [EMAIL PROTECTED] is server-generated logfiles that
shouldn't get removed at all.

I understand what use case you are coming from, but it really is
different from what I'm after.

>> > I believe toArray doesn't care if the array provided is larger
>> > than needed.
> 
> This is just one data point with JDK 1.4.2, but toArray basically
> puts a null element for the one past last element if the array is
> bigger.

It had better, since the javadocs of JDK 1.2 describe exactly that
behavior.  I've therefore changed the code already.
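
For illustration, a minimal sketch of the documented Collection.toArray
behavior with an oversized array.  The class and method names are made
up for this example, and it uses current Java syntax rather than the
JDK 1.2-era style the thread was written against.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToArrayDemo {

    // Hypothetical helper: copies the list into a deliberately oversized
    // array that has been pre-filled with a marker value.
    static String[] copyIntoOversized(List<String> files, int size) {
        String[] target = new String[size];
        Arrays.fill(target, "stale");
        // toArray reuses the given array when it is large enough and
        // sets the element right after the last copied one to null.
        return files.toArray(target);
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        files.add("a.txt");
        files.add("b.txt");
        // prints [a.txt, b.txt, null, stale]
        System.out.println(Arrays.toString(copyIntoOversized(files, 4)));
    }
}
```

Only the single slot after the last element is nulled; anything further
along keeps its old value, so the null is an end marker and not a
guarantee that the rest of the array is clean.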

>> Most time spent in DirectoryScanner is file system scanning AFAIU.
> 
> Which is precisely why I'm worried! We need to scan everything in
> any case, as you rightly point out, so we end up probably doing a
> linear search against all the excludes for every file of the
> dest. dir., instead of a lookup in a set.

Yes, you are correct.

> Like I said, I think the performance will suffer a lot for large
> sync, and I'd rather not add this feature if it can't be implemented
> more efficiently.

Let's discuss the alternatives, then.  You will notice that I didn't
merge any changes to the 1.6 branch at all, and we can always easily
roll back the changes.

We will most likely need pattern matching for my use case.  Things
like "keep the files named *.log" or "keep everything in foo/bar/
recursively".  For this to work we'll need to more or less duplicate
what DirectoryScanner does, although a set lookup could make it
faster for all the constant file names.  I'm absolutely willing to
drop selector support for files to keep.

Hmm, maybe we can speed up DirectoryScanner to make it recognize
"constant patterns", store them in a set and consult this set before
doing a linear search on the real patterns?  This would address your
concerns and provide an easy implementation for what I wanted to do
with <sync> and speed up DirectoryScanner for large-ish sets of
constant filenames at the same time.
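
A rough sketch of that idea, not Ant's actual DirectoryScanner code:
patterns without wildcards go into a HashSet that is consulted first,
and only the remaining patterns are searched linearly.  The class
KeepList, its methods, and the simplified glob translation are all
assumptions made for this example; it expects '/' as the separator and
ignores case-sensitivity settings and the trailing-slash shorthand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class KeepList {
    // Patterns without '*' or '?' are plain names: exact set lookup.
    private final Set<String> constantNames = new HashSet<>();
    // Everything else still needs a linear search.
    private final List<Pattern> wildcardPatterns = new ArrayList<>();

    public KeepList(List<String> patterns) {
        for (String p : patterns) {
            if (p.indexOf('*') < 0 && p.indexOf('?') < 0) {
                constantNames.add(p);
            } else {
                wildcardPatterns.add(Pattern.compile(toRegex(p)));
            }
        }
    }

    // Simplified translation of Ant-style globs into a regex
    // (an approximation for this sketch, not Ant's matching rules).
    static String toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < glob.length()) {
            char c = glob.charAt(i);
            if (glob.startsWith("**/", i)) {
                sb.append("(.*/)?");   // zero or more directories
                i += 3;
            } else if (glob.startsWith("**", i)) {
                sb.append(".*");       // anything, including '/'
                i += 2;
            } else if (c == '*') {
                sb.append("[^/]*");    // anything within one path segment
                i++;
            } else if (c == '?') {
                sb.append("[^/]");     // one character within a segment
                i++;
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
                i++;
            }
        }
        return sb.toString();
    }

    public boolean shouldKeep(String relativePath) {
        // Fast path: constant names are answered by the set.
        if (constantNames.contains(relativePath)) {
            return true;
        }
        // Slow path: linear search over the real patterns only.
        for (Pattern p : wildcardPatterns) {
            if (p.matcher(relativePath).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        KeepList keep = new KeepList(
            Arrays.asList("WEB-INF/web.xml", "**/*.log", "foo/bar/**"));
        System.out.println(keep.shouldKeep("WEB-INF/web.xml"));
        System.out.println(keep.shouldKeep("index.jsp"));
    }
}
```

With thousands of constant file names and a handful of real patterns,
the per-file cost becomes one hash lookup plus a short loop, instead of
a scan over every exclude.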

> Maybe DirectoryScanner is smarter than I think it is!?

No, but maybe DirectoryScanner should be smarter than you think.

Stefan
