On 3/3/2019 12:52 PM, Peter Milesson wrote:
Hi folks,

I did some testing during the weekend.

  * When backing up a huge file (> 10GB), the Windows transfer rate is
    comparable to the Linux transfer rate (32 Mb/s).


Yes. This is why I suspect the problem lies in NTFS, not in network transfer. I also suspect that it lies in directory traversal and/or in opening a file for reading. The actual reading of an already open file and network send seem to be fine, as shown by the normal speeds seen with large files.

My first thought is that Windows' complex ACLs (with inheritance), the maintenance of audit metadata, and the maintenance of "short names" for each file and directory make directory traversal and file opening quite slow compared to, for example, ext4. This is evidenced by the inclusion of robocopy in (most?) Windows versions. Robocopy is a CLI utility for copying large numbers of files much more quickly than Windows Explorer. With the /MT flag, robocopy runs multi-threaded and is dramatically faster at copying small files. This speedup is almost certainly due to concurrent ACL checks and directory traversal, rather than buffering, read-ahead, or anything to do with actual data transfer. For this reason, I believe that the only way to dramatically improve backup speeds in Windows is to multi-thread bacula-fd. Granted, that is not an easy fix.
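
A rough way to test this hypothesis is to time opening and reading many small files sequentially versus with a thread pool. The Python sketch below is only an illustration: the function names, the worker count, and the generated test files are all assumptions, and it says nothing about how bacula-fd is actually implemented.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def list_files(root):
    """Walk the tree and return the path of every file found."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return paths


def read_size(path):
    """Open one file, read it fully, and return the number of bytes read."""
    with open(path, "rb") as fh:
        return len(fh.read())


def read_sequential(paths):
    """Open and read the files one after another."""
    return sum(read_size(p) for p in paths)


def read_threaded(paths, workers=8):
    """Open and read the files concurrently (worker count is arbitrary)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(read_size, paths))


def make_small_files(root, count=200, size=1024):
    """Create a hypothetical test set of small files spread over 10 directories."""
    for i in range(count):
        sub = os.path.join(root, f"dir{i % 10}")
        os.makedirs(sub, exist_ok=True)
        with open(os.path.join(sub, f"f{i}.bin"), "wb") as fh:
            fh.write(b"x" * size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_small_files(root)
        paths = list_files(root)
        for label, fn in (("sequential", read_sequential),
                          ("threaded", read_threaded)):
            start = time.perf_counter()
            total = fn(paths)
            elapsed = time.perf_counter() - start
            print(f"{label}: {total} bytes in {elapsed:.3f}s")
```

Freshly written temporary files will mostly be served from cache, so for a meaningful comparison the file list should come from a real directory tree on the NTFS volume in question, ideally after a reboot.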


  * Setting the file daemon buffer to 32k on the Windows server seemed
    to help, but not very much.


I saw a decent improvement with older Windows clients (Windows 7). It might not apply to server versions or Windows 10.


  * The Windows backup transfer rate is still a lot slower than
    expected (22 Mb/s) for a full backup of 270 GB (298000 files),
    whereas the Linux backup is 35 Mb/s for a full backup of 783 GB
    (461500 files).
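
The figures in this bullet can also be expressed per file rather than per byte. The sketch below is a back-of-the-envelope calculation only; it assumes the rates are megabytes per second (as Bacula reports them) and decimal units, and the function name is made up for illustration.

```python
def backup_stats(total_gb, file_count, rate_mb_s):
    """Return (average file size in MB, files per second, job duration in hours).

    Assumes decimal units: 1 GB = 1e9 bytes, 1 MB = 1e6 bytes.
    """
    total_bytes = total_gb * 1e9
    rate_bytes = rate_mb_s * 1e6
    avg_file_bytes = total_bytes / file_count
    avg_file_mb = avg_file_bytes / 1e6
    files_per_s = rate_bytes / avg_file_bytes
    hours = total_bytes / rate_bytes / 3600
    return avg_file_mb, files_per_s, hours


if __name__ == "__main__":
    # Figures taken from the two full backups quoted above.
    for label, args in (("Windows", (270, 298000, 22)),
                        ("Linux", (783, 461500, 35))):
        avg_mb, fps, hours = backup_stats(*args)
        print(f"{label}: avg file {avg_mb:.2f} MB, "
              f"{fps:.1f} files/s, {hours:.1f} h")
```

Under these assumptions the average file is roughly 0.9 MB on the Windows server and 1.7 MB on the Linux server, so comparing files per second alongside bytes per second may give a fairer picture of per-file overhead.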

The next thing I'm going to try is moving all overhead off of the Windows server. It will take a while to move things around, as I need to get a new server for this.

Best regards,

Peter

On 01.03.2019 17:27, Peter Milesson wrote:
Hi Josh,

With the current settings, last access updates were disabled for Windows, and neither ATIME nor NOATIME was set for the Linux server. So in the current setup, the Linux server was at a disadvantage. I changed the network buffer to 32k on the Windows server, and I'll know tomorrow whether it helped.

Thanks for the advice.

Best regards,

Peter

On 01.03.2019 at 16:40, Josh Fisher wrote:
I also attribute this to Windows inefficiencies, particularly in NTFS handling of small files. However, I am not sure that those inefficiencies explain a greater than 50% performance hit. Two quick changes come to mind that may help.

1. Change MaximumNetworkBufferSize to 32k in bacula-fd.conf. Windows has been known to dislike the default 64k network buffer size.

2. Set the DWORD value NtfsDisableLastAccessUpdate in HKLM\SYSTEM\CurrentControlSet\Control\FileSystem to nonzero to prevent a write to the disk each time a file is accessed. It is the NTFS equivalent of the NOATIME option used in ext4 and other Linux filesystems. For Windows file servers holding lots and lots of small files, those last access updates add up to quite a lot of disk activity. Generally, last access time is not needed or all that useful. In particular, if NOATIME is being used on the Linux client and NtfsDisableLastAccessUpdate = 0 on the Windows client, then you are not comparing apples to apples.
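
To see what the second setting currently is before changing anything, the value can be read with Python's standard winreg module. This is a minimal sketch, not an official procedure: the function name is hypothetical, it only reports the raw number, and newer Windows releases may store additional flag bits in this value, so interpret the result with care.

```python
import sys

# Registry location and value name as given in item 2 above.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\FileSystem"
VALUE_NAME = "NtfsDisableLastAccessUpdate"


def read_last_access_setting():
    """Return the raw registry value, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg  # imported lazily because the module is Windows-only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    except OSError:
        return None  # key or value missing, or access denied
    return value


if __name__ == "__main__":
    setting = read_last_access_setting()
    if setting is None:
        print(f"{VALUE_NAME} could not be read on this system")
    else:
        print(f"{VALUE_NAME} = {setting:#x}")
```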


On 3/1/2019 6:48 AM, Kern Sibbald wrote:
Hello,

I have noticed similar things.  I have always attributed the slower
speed on Windows to the fact that Microsoft hired the best students
from the best schools but most of them knew nothing about programming
and programming history (in particular Unix), thus these geniuses
re-invented the OS wheel in designing and building a monolithic
operating system that took about 10 times as much code as it took Unix
(and subsequently Linux).  To me it is not surprising that Windows had
more bugs than Linux (despite huge advances, it probably still has more
bugs).  In any case, programming Windows for a Linux programmer is a
nightmare -- 10 times harder to do almost anything, because there are
far more OS calls; they all have different arguments; many of which are
not well or not at all documented, ...

So, I have just attributed this to being normal Windows inefficiencies.

Of course, the above is sort of a gut feeling.  Perhaps someone can do
some real performance testing and figure out what is really going on.

Best regards,
Kern

On 2/28/19 8:22 PM, Peter Milesson wrote:
Hi folks,

I'm backing up 2 servers with Bacula, one with Windows 2016, the other
one with CentOS. The hardware is described below. The Windows server
is much more powerful than the Linux server in all respects, and
should theoretically deliver data to the Bacula server at a much
higher rate. But in reality, the Linux server delivers data about 7
times faster over the network than the Windows server.

Is this completely normal, or should I start checking the Windows
server for problems?

Best regards,

Peter


Windows server (file server, RDP-server, Hyper-V host with 2 very
lightly loaded VMs)
=====================================================================
Hardware: HP DL180 Gen9, Intel Xeon E5-2683v4, 48GB RAM, Smart Array
P440 Controller, 6x SAS 1GB (7200 rpm, 12 Gb/s) in RAID5
Network: 2x 10GbE to HPE 1950 switch (LACP)
OS: Windows 2016 (build 1607)
Throughput to Bacula server: 23-Feb 08:52 MySd JobId 991: Elapsed
time=00:26:09, Transfer rate=4.071 M Bytes/second


Linux server (plain file server with Samba)
==================================
Hardware: HP DL120 Gen9, Intel Xeon E5-2603v3, 8GB RAM, HP Dynamic
Smart Array B140i SATA Controller 2x SATA 2GB (7200 rpm) in RAID1
Network: 2x 1Gb to HPE 1950 switch (LACP)
OS: CentOS Linux 7.5 (1804)
Throughput to Bacula server: 23-Feb 08:26 MySd JobId 990: Elapsed
time=00:26:08, Transfer rate=28.29 M Bytes/second


Bacula server
===========
Hardware: standard motherboard with a 6-core AMD FX-6300 CPU, 4xSATA
8GB (7200 rpm) in RAID10
Network: Tehuti 10GbE NIC to ProCurve 2910al switch
OS: CentOS Linux 7.6 (1810)
Bacula server throughput to the RAID array: ca. 60 Mbytes/second

All switches are connected to our 10Gb/s optical network backbone.



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users



