On Tuesday 10 October 2006 09:59, Anders Boström wrote:
> >>>>> "KS" == Kern Sibbald writes:
>
>  KS> From the statistics you show, the backup does not appear slow to
>  KS> me. The reason you might think it is slow is because you are
>  KS> comparing apples and oranges.
>
>  KS> On the one hand, you measure the time to do a non-compressed tar
>  KS> on a local machine sending the output down an extremely
>  KS> high-speed bit bucket.
>
>  KS> On the other hand, you measure the time of Bacula using
>  KS> compression sending real data to another process via TCP/IP (even
>  KS> though it might be on the same machine).
>
>  KS> To do a better comparison, you could run tar including the z
>  KS> option so that it does compression. In addition, you should send
>  KS> the output of tar across the network and write it to either a file
>  KS> or a tape (whatever Bacula is using).
>
> You don't seem to have seen my data, so I state it again:
>
> bacula backup without SW compression:  1 hour 45 mins 2 secs
> bacula backup with SW compression:     2 hours 42 mins 11 secs
> local tar on the fileserver*:          53 mins 3 secs
>
> * time /bin/sh -c "tar cf - directory | cat >/dev/null"
>
> bacula is ~2 times slower than the local tar without SW
> compression. And, as stated already, the network isn't the limitation
> (no TCP retransmission), neither is the backup-server (CPU and disc
> are >98% idle during backup).
I did see and carefully think about your data and would have the same
comments I made before.

> But, as you point out, the tar should be faster. It doesn't need to
> write to net. However, not 2 times faster. The net-load is ~1% (10
> Mbit/s on a GE-network), and *should* not affect the performance in
> this case.

Performance management, which I did professionally for some 20 years,
doesn't work with *should*. It works with *careful* testing and hard
data. More often than not the hard data is very surprising.

>  KS> At that point, providing you are always doing Full backups on
>  KS> Bacula (and not Incremental or Differential), you will probably
>  KS> find that the total Bacula time is not terribly greater than tar.
>  KS> That said, Bacula will almost always be slower than tar because it
>  KS> does a whole lot more -- in addition to checksumming all the data
>  KS> Bacula writes to the Volume, which I am not sure tar does, Bacula
>
> What do you mean bacula is writing to the Volume that tar maybe isn't?

I'm not sure that tar checksums the data blocks it writes. Bacula does.

>  KS> also interfaces to a database and stores a lot of information
>  KS> about the job.
>
>  KS> If you want to do additional performance testing you can look at
>  KS> <bacula-source>/src/version.h. There are various configuration
>  KS> parameters that you can turn on/off and then re-build Bacula and
>  KS> measure the performance of particular parts. Performance testing
>  KS> is a highly evolved science as well as an art, and it is not
>  KS> always easy. For example, if you are going to do any timing
>  KS> experiments as you did, you *must* on Unix systems re-run the test
>  KS> at least 10 times, throw out the first two timings, then take an
>  KS> average of the remaining 8. If you don't do this, your timing
>  KS> tests will have no meaning due to the memory caching that Unix
>  KS> does.
>
> It is a good general rule in benchmarking to re-run every test several
> times.
> And in this case, the fact is that I have, trying to tune the
> performance of the server. The times were fairly consistent, with an
> error margin of less than 5 minutes (<10%) in all cases. As the
> performance difference is in the order of 2 times, I'm quite sure that
> the results are correct, bacula *is* much slower than tar.

You didn't say how you made the measurements. From what you write I can
see that you are aware of some of the problems. However, a 10% variation
in the timing seems to me to be quite large, and I am still not
convinced you followed the methodology I outlined.

To give you an example, one user declared that certain mutex calls in
Bacula slowed it down by 10%. When I did my tests, I could not even
measure the difference in time between the code with and without the
mutexes -- I could say that it was less than 1%, which was roughly the
variation I saw in the timing. My conclusion was that he ran Bacula once
with mutexes. He then turned off mutexes and ran it again, and of
course, it ran 10% faster. The only thing he was measuring was Linux
caching.

> I know that I probably should do a lot more testing, but it is very
> time-consuming... Trying to tune configuration options can be
> important, do you have any suggestion what can affect this? Are there
> any known options resulting in bad performance?

Though there are probably 10-20 performance pitfalls, the two big
performance problems that I have seen are:

- Poorly tuned Catalog database -- insertion of Bacula attributes in
  the database tends to be slow. There are probably 5 or 10 reasons
  leading to poor DB performance. I'll be working on improving this and
  documenting it over the next 6-9 months. A good part of what you can
  do is written in the manual (Catalog Maintenance chapter). The rest
  appeared on this list within the last month.

- A switch (mostly 3Com switches in my experience) that runs in
  half-duplex mode, which slows network traffic down by about a factor
  of 10.
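Coming back to the timing procedure quoted earlier (run the test at
least 10 times, throw out the first two timings, average the remaining
8), here is a rough sketch of how that could be scripted. The helper
names time_runs and mean_after_warmup are made up for illustration and
are not part of Bacula or tar. It assumes `date +%s` is available (it is
on GNU and BSD systems), which only gives whole-second resolution; that
is adequate for runs that take many minutes.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Bacula or tar.

# time_runs N CMD: run CMD (a shell command string) N times and print
# the elapsed wall-clock seconds of each run, one per line.
time_runs() {
    runs=$1
    cmd=$2
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s)
        sh -c "$cmd" >/dev/null 2>&1
        end=$(date +%s)
        echo $((end - start))
        i=$((i + 1))
    done
}

# mean_after_warmup SKIP: read one timing per line on stdin, ignore the
# first SKIP lines (the cache warm-up runs), and print the mean together
# with the min/max spread of the remaining runs.
mean_after_warmup() {
    awk -v skip="$1" '
        NR > skip {
            sum += $1; n++
            if (n == 1 || $1 < min) min = $1
            if (n == 1 || $1 > max) max = $1
        }
        END {
            if (n > 0)
                printf "mean=%.1f min=%d max=%d runs=%d\n", sum / n, min, max, n
        }'
}

# Example (not run automatically):
#   time_runs 10 'tar cf - directory | cat' | mean_after_warmup 2
```

The min/max spread is printed next to the mean so that a difference
between two configurations can be compared against the run-to-run
variation before drawing any conclusion from it.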
> Thanks for your time.
>
> / Anders

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users