On Sat, 2007-09-29 at 13:15 -0700, Ross Boylan wrote:
> On Fri, 2007-09-28 at 08:46 +0200, Arno Lehmann wrote:
> > Hello,
> > 
> > 27.09.2007 22:47,, Ross Boylan wrote::
> > > On Thu, 2007-09-27 at 09:19 +0200, Arno Lehmann wrote:
> > >> Hi,
> > >>
> > >> 27.09.2007 01:17,, Ross Boylan wrote::
> > >>> I've been having really slow backups (13 hours) when I backup a large
> > >>> mail spool.  I've attached a run report.  There are about 1.4M files
> > >>> with a compressed size of 4G.  I get much better throughput (e.g.,
> > >>> 2,000KB/s vs 86KB/s for this job!) with other jobs.
> > >> 2MB/s is still not especially fast for a backup to disk, I think. So 
> > >> your storage disk might also be a factor here.
.....
> > vmstat during a backup would be a good next step in this case, I think.
> > 
> 
> Here are the results of a test job.  The first vmstat was shortly after
> I started the job
> # vmstat 15
> procs -----------memory---------- ---swap-- -----io---- -system--
> ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy
> id wa
>  1  2   7460  50760 204964 667288    0    0    43    32  197   15 18  5
> 75  2
>  1  1   6852  51476 195492 675524   28    0  1790   358  549 1876 20  6
> 36 38
>  0  2   6852  51484 189332 682612    0    0  1048   416  470 1321 12  4
> 41 43
>  2  0   6852  52508 187344 685328    0    0   303   353  485 1369 16  4
> 68 12
>  1  0   6852  52108 187352 685464    0    0     1   144  468 1987 12  4
> 84  0
> 
> Sorry for the bad wrapping.  This clearly shows about 40% of the CPU
> time spent in IO wait during the backup.  Another 40% is idle.  I'm not
> sure if the reports are being thrown off by the fact that I have 2
> virtual CPU's (not really: it's P4 with hyperthreading).  If that's the
> case, the 40% might really mean 80%.
> 
> During the run I observed little CPU or memory useage above where I was
> before it.  None of the bacula daemons, postgres or bzip got anywhere
> near the top of my cpu use list (using ksysguard).
> 
> A second run went much faster: 14 seconds (1721.6 KB/s) vs 64 seconds
> (376.6 KB/s) the first time.  Both are much better than I got with my
> original, bigger jobs.  It was so quick I think vmstat missed it
> procs -----------memory---------- ---swap-- -----io---- -system--
> ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy
> id wa
>  1  0   6852  56496 184148 683932    0    0    43    32  197   19 18  5
> 75  2
>  3  0   6852  56016 178604 690024    0  113     0   429  524 3499 35 10
> 55  0
>  2  0   6852  51988 172476 701556    0    0     1  2023  418 3827 33 11
> 55  1
> 
> It looks as if the 2nd run only hit the cache, not the disk, while
> reading the directory (bi is very low)--if I understand the output,
> which is a big if.

Here are some more stats, systematically varying the software.  I'll
just give the total backup time, starting with the 2 reported above:
64
14
Upgrade to Postgresql 8.2 from 8.1
41
13
upgrade to bacula 2.2.4
13
12
switch to a new directory for source of backup
old one has 1,606 files = 24MB copressed
new one has 4,496 files = 27MB
92
22

In the slow cases vmstat shows lots of blocks in and major (40%) CPU
time in iowait.

I suspect the relatively good first try time with Postgresql 8.2 was a
result of having some of the disk still in the cache.

Even the best transfer rates were not particularly impressive (1854
kb/s), but the difference between the first and 2nd runs (and the
sensitivity to the number of files) seems indicate that simply finding
and opening the files causes a huge slowdown.

I guess it's not surprising that hitting the cache is faster than
hitting the disk, but the very slow speed of the disk is striking.

The mount for that partition is
/dev/evms/CyrusSpool /var/spool/cyrus ext3    rw,noatime 0       2
The partition is sitting on top of a lot of layers, since it's from an
LVM container, with evms on top of that.

Probably trying with the catalog on a different physical disk would be a
good idea.

Ross

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Reply via email to