Hi.

I can't find any mention of this elsewhere, so I would like to report it as a bug - or maybe it is expected behaviour that just isn't documented anywhere (?).

I have found that a large --exclude-from=FILE list negatively affects the speed of tar.

The dataset consists of about 2.4 million files and is 468 GiB in size.

To test this, I made up an exclude file of entries that would not match anything in the dataset.

With 100 exclude-lines, I got a throughput of 461MiB/s.
With 50 exclude-lines, I got a throughput of 552MiB/s.
With 1 exclude-line, I got a throughput of 685MiB/s.
Without any --exclude-from, I got 691MiB/s.
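
For reference, a test along these lines can be reproduced with something like the following (a sketch assuming GNU tar; exclude.txt and /path/to/dataset are placeholder names, and writing to /dev/null keeps the output medium from affecting the measurement):

  # archive to /dev/null with a non-matching exclude list and print the total/rate
  tar -cf /dev/null --exclude-from=exclude.txt --totals /path/to/dataset

With --totals, GNU tar prints the total bytes written and the average rate at the end of the run, which is where the throughput figures above come from in this kind of setup.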

The results are not linear in the number of exclude lines, but they show that each exclude line has a negative impact on performance.


Does anyone have an explanation for this behaviour? We need to exclude some of our data before writing it out to tape, but we don't want to lose too much speed.
