I'm seeing slow write speeds from both Python and C code on some Windows
workstations. In particular, both Python's "write" and numpy's "tofile" method
suffer from this issue. I'm wondering if anyone knows whether this is a known
issue, what the cause is, or how to resolve it. The details are below.
The slow write speed issue seems to occur when writing data in blocks larger
than 32767 512-byte disk sectors (about 16 MB). Write speed is as expected
until one reaches this 32767-sector limit, and then it falls off as if all data
beyond that point were processed byte by byte. I can't prove that is what is
happening, but speed tests generally support the theory. The resulting write
speeds are in the range of 18 to 25 MB/s for spinning disks and about 50 MB/s
for SSDs; they should be more like 120 MB/s for spinning disks and 300 MB/s
for SSDs.
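Roughly, the kind of timing test I mean is sketched below (the path, total
size, and block sizes are just placeholders, not my exact test code); it writes
the same amount of data with block sizes on either side of the 32767-sector
boundary and reports the rate:

import os
import time

def time_write(path, total_bytes, block_bytes):
    """Write total_bytes to path in block_bytes chunks; return MB/s."""
    block = b"\0" * block_bytes
    n_blocks = total_bytes // block_bytes
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
    elapsed = time.time() - start
    os.remove(path)
    return (n_blocks * block_bytes) / (1024.0 * 1024.0) / elapsed

SECTOR = 512
LIMIT = 32767 * SECTOR  # the ~16 MB threshold described above
for block_bytes in (LIMIT // 2, LIMIT - SECTOR, LIMIT + SECTOR, 4 * LIMIT):
    rate = time_write("testfile.bin", 2 * 1024**3, block_bytes)
    print("block %10d bytes: %6.1f MB/s" % (block_bytes, rate))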
This issue seems to be system specific. I originally saw it on my HP Z640
workstation using Python 2.7 under Windows 7. Numpy writes of large arrays in
the 100 GB range first highlighted the issue, but I've since written test code
using plain Python "write" and get similar results with various block sizes.
I've also verified the behavior with Cygwin/MinGW-w64 C and with Visual Studio
C 2013, and I've tested a variety of other systems. My laptop does not show the
slowdown, and not all Z640 systems show it, though I've found several that do.
IT has tested a clean Windows 7 image and a Windows 10 image on yet another
Z640 and gets similar results. I've not seen any Linux system show the issue,
though I don't have any Z640s running Linux. I have, however, run my tests on
Linux Mint 17 under VirtualBox on the same Z640 that showed the slowdown, using
both Wine and native Python, and both showed good performance with no slowdown.
A workaround for this seems to be to enable full write caching for the drive in
Device Manager, with the attendant risk of data corruption. This suggests, for
example, that the issue is byte-by-byte flushing of data beyond the
32767-sector limit and that full caching somehow mitigates it. The other
workaround is to write all data in blocks smaller than the 32767-sector limit
(about 16 MB), as mentioned above. Of course, reducing the block size only
works if you have the source code and the time and inclination to modify it.
There is an indication that some of the commercial code we use for science and
engineering may also suffer from this issue.
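A rough sketch of what I mean by the block-size workaround (the helper name and
chunk size are just illustrative): split any large write into chunks at or
below the 32767-sector size before handing them to the file object.

SECTOR = 512
MAX_SECTORS = 32767
CHUNK = MAX_SECTORS * SECTOR  # stay at or below the problematic block size

def chunked_write(f, data, chunk=CHUNK):
    """Write a bytes-like object to an open binary file in sub-16 MB chunks."""
    view = memoryview(data)
    for start in range(0, len(view), chunk):
        f.write(view[start:start + chunk])

# usage: replaces a single f.write(big_buffer)
# with open("out.bin", "wb") as f:
#     chunked_write(f, big_buffer)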
The impact of this issue also seems application specific. It only becomes
annoying when you're regularly writing files of significant size (above, say,
10 GB). It also depends on how an application writes its data, so not every
application that creates large files will exhibit the problem. As an example,
numpy's tofile method hits it for large enough arrays, which is what started
my investigation.
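For the numpy case, a rough sketch of the same chunking idea (the function name
and chunk size are illustrative, and I haven't settled on this as a final fix):
instead of one big tofile call, issue many smaller writes.

import numpy as np

CHUNK_BYTES = 32767 * 512  # keep each underlying write at the ~16 MB threshold

def tofile_chunked(arr, path, chunk_bytes=CHUNK_BYTES):
    """Like arr.tofile(path), but issues many small writes instead of one big one."""
    flat = np.ascontiguousarray(arr).reshape(-1)
    items_per_chunk = max(1, chunk_bytes // flat.itemsize)
    with open(path, "wb") as f:
        for start in range(0, flat.size, items_per_chunk):
            flat[start:start + items_per_chunk].tofile(f)

# a = np.zeros(10 * 1024**3 // 8)   # ~10 GB of float64, for example
# tofile_chunked(a, "bigarray.dat")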
I don't really know where to go with this. Is this a Windows issue? A runtime
library (RTL) issue? A hardware, device driver, or BIOS issue? Is there a
stated OS or library limit on buffer sizes for things like C fwrite or Python
write that would make this an application issue? Thoughts?
Thanks,
remmm