A very interesting thread
(http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/)
and some thinking about the design of SSDs led me to run an experiment with
the Intel X25-M SSD. The question was:

Is my data safe once it has reached the disk and the write has been committed
to my application?

All transactional safety in ZFS depends on a correct implementation of the
synchronize cache command (see
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg27264.html, where
someone used OpenSolaris within VirtualBox, which by default ignores the
cache flush command). Qualified hardware is therefore essential (also see
http://www.snia.org/events/storage-developer2009/presentations/monday/JeffBonwick_zfs-What_Next-SDC09.pdf).
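
As a minimal sketch of what "committed to the application" means at the
system call level: the record is written, an explicit fsync is requested, and
success is reported only after both have returned. The sub name durable_write
and the command line handling are illustrative assumptions and not part of the
attached script, which uses O_SYNC instead. Whether the drive really honours
the resulting cache flush is exactly what the test below tries to find out.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT);
use IO::Handle;    # provides the sync() method (fsync) on file handles

# Write one record and return only after the OS reports it as flushed.
# durable_write is a hypothetical helper name used for illustration.
sub durable_write {
    my ($path, $record) = @_;

    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_TRUNC)
        or die "ERROR opening $path: $!\n";

    my $rc = syswrite($fh, $record);
    die "ERROR writing $path: $!\n"
        unless defined $rc && $rc == length $record;

    # fsync: ask the OS to push the data to stable storage. On a correct
    # stack this ends in a cache flush command sent to the drive.
    $fh->sync or die "ERROR syncing $path: $!\n";

    close($fh) or die "ERROR closing $path: $!\n";
    return $rc;
}

# Optional command line use: durable_write.pl <file> <record>
if (@ARGV == 2) {
    my ($path, $record) = @ARGV;
    durable_write($path, $record);
    print "committed: $record\n";
}
```

The "committed" line is printed only after sync has returned, so anything the
application has reported should survive a power loss if every layer below it
behaves correctly.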

What I did (for an Intel X25-M G2 (default settings = write cache on) and a
Seagate SATA drive (ST3500418AS)):

a) Create a pool
b) Create a program that opens a file
   synchronously and writes to the file.
   It also prints the latest record written
   successfully.
c) Pull the power of the SATA disk
d) Power cycle everything
e) Open the pool again and verify that the content
   of the file is the last one reported to
   the application
    e1) if it is the same - nice hardware
    e2) if it is NOT the same - BAD hardware
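
Step e) can also be checked mechanically. The following is a minimal sketch,
assuming the record format "This is round number <n>" used by the attached
script; the sub names round_of and lost_rounds and the command line handling
are hypothetical and only serve as illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Extract the round number from a record such as
# "This is round number                67490".
sub round_of {
    my ($line) = @_;
    return $line =~ /round number\s+(\d+)/ ? $1 : undef;
}

# Number of rounds that were reported to the application but are not
# reflected in the file after the power cycle. Returns undef if the
# file content does not contain a round number at all.
sub lost_rounds {
    my ($last_reported, $file_content) = @_;
    my $on_disk = round_of($file_content);
    return unless defined $on_disk;
    my $lost = $last_reported - $on_disk;
    return $lost > 0 ? $lost : 0;
}

# Optional command line use: verify.pl <last reported round> <file>
if (@ARGV == 2) {
    my ($last_reported, $path) = @ARGV;
    open(my $fh, '<', $path) or die "ERROR opening $path: $!\n";
    my $content = <$fh>;
    close($fh);
    $content = '' unless defined $content;

    my $lost = lost_rounds($last_reported, $content);
    if (!defined $lost) {
        print "No round number found in $path\n";
    } elsif ($lost == 0) {
        print "OK: nothing lost - nice hardware\n";
    } else {
        print "LOST $lost round(s) - BAD hardware\n";
    }
}
```

For example, a last reported round of 67490 and a file that contains round
67246 gives 244 lost rounds.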

What I found out was: 

Intel X25-M G2: 
  - If I pull the power cable, a lot of data is lost (some hundred writes),
although it had been committed to the app
  - If I pull the SATA cable, no data is lost
  
ST3500418AS: 
  - If I pull the power cable, almost no data is lost, but the last write
is still lost (strange!)
  - If I pull the SATA cable, no data is lost

Actually, this result was partially expected. However, the one missing
transaction on my SATA HDD (Seagate) is strange.

Unfortunately I do not have "enterprise SAS hardware" handy to verify that my 
test procedure is correct.

Maybe someone can run this test on a SAS test machine? (see script attached)


--- Attachments ---

--- script (call it with script.pl --file /mypool/testfile) ---

#!/usr/bin/env perl

use strict;
use warnings;

# Fcntl provides the O_* flags (including O_SYNC) and the SEEK_* constants
use Fcntl qw(:DEFAULT :flock SEEK_CUR SEEK_SET SEEK_END);
use IO::File;
use Getopt::Long;

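my $pool="disk";
my $mountroot="/volumes";
my $file;
my $abort=0;
my $count=0;

GetOptions(
        "pool=s" => \$pool,
        "testfile|file=s" => \$file,
        "count=i" => \$count,
);

# Derive the default path only after option parsing, so that --pool takes effect
$file = "$mountroot/$pool/testfile" unless defined $file;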

my $dir = $file;
$dir =~ s/[^\/]+$//g;

if (-e $file) {
        print "ERROR: File $file already exists\n";
        exit 1;
}

if (! -d "$dir" ) {
        print "ERROR: Directory $dir does not exist\n";
        exit 1;
}
sysopen (FILE, "$file", O_RDWR | O_CREAT | O_EXCL | O_SYNC)
        or die "ERROR Opening file $file: $!\n";

$SIG{INT}= sub { print " ... signalling Abort ... (file: $file)\n"; $abort=1; };

$|=1;

my $lastok=undef;
my $i=0;
my $msg=sprintf("This is round number %20s", $i);
# O_SYNC, O_CREAT
while (!$abort) {
        $i++;

        if ($count && $i>$count) { last; };

        $msg=sprintf("This is round number %20s", $i);
        sysseek (FILE, 0, SEEK_SET);
        print "$msg";
        my $rc=syswrite FILE,$msg;
        if (!defined($rc)) {
                print "ERROR\n";
                print "ERROR While writing $msg\n";
                print "ERROR: $!\n";
                last;
        } else {
                print " DONE \n";
                $lastok=$msg;
        }
}

close(FILE);

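# $lastok stays undefined if no write succeeded at all
my $lastshown = defined($lastok) ? $lastok : "(nothing written)";
print "\nTHE LAST MESSAGE WRITTEN to file $file was:\n\n\t\"$lastshown\"\n\n";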

---- Here are the logs of my tests ----

1) Test the SATA SSD (Intel X25-M) 
----------------------------------
.. start write.pl

This is round number                67482
This is round number                67483
This is round number                67484
This is round number                67485
This is round number                67486
This is round number                67487
This is round number                67488
This is round number                67489
This is round number                67490

( .. I pull the POWER CABLE of the SATA SSD .. )

.. I/O hangs 

.. zpool status shows 

zpool status -v
  pool: ssd
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-JQ
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ssd         UNAVAIL      0    11     0  insufficient replicas
          c3t5d0    UNAVAIL      3     2     0  cannot open

errors: Permanent errors have been detected in the following files:

        ssd:<0x0>
        /volumes/ssd/
        /volumes/ssd/testfile


... now I power cycled the machine and plugged the power cable back in

... let's see the pool status

  pool: ssd
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ssd         ONLINE       0     0     0
          c3t5d0    ONLINE       0     0     0

errors: No known data errors

.... let's look at the file content ...
.. remember: the last reported successful transaction was "67490"
r...@nexenta:/volumes/ssd# cat testfile
This is round number                67246
r...@nexenta:/volumes/ssd#

... Oops, 244 transactions missing - bummer ...

.. OK, repeat the test, this time pulling the SATA cable only !!!
(thus the device has time to write out the changes)

This is round number                39451
This is round number                39452
This is round number                39453
This is round number                39454
This is round number                39455
This is round number                39456
This is round number                39457
This is round number                39458
This is round number                39459
.. hangs 
.. reboot 

  pool: ssd
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ssd         ONLINE       0     0     0
          c3t5d0    ONLINE       0     0     0

... cat ssd/testfile (last committed = 39459)
This is round number                39459
.. this is OK

2) Test the SATA HDD (Seagate ST3500418AS)
------------------------------------------

..... same test with a HDD ... 

This is round number                 3548
This is round number                 3549
This is round number                 3550
This is round number                 3551
This is round number                 3552
This is round number                 3553
This is round number                 3554
This is round number                 3555
This is round number                 3556
This is round number                 3557
This is round number                 3558
.. hangs 

  pool: disk
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-JQ
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        disk        UNAVAIL      0    10     0  insufficient replicas
          c3t5d0    UNAVAIL      3     2     0  cannot open


                  
.. reboot 

.. check file  (last committed = 3558)

n...@nexenta:/disk$ cat testfile
This is round number                 3557

.. Again one transaction missing, strange, test again ...

.. Again (Disk) ...

This is round number                 1689 DONE
This is round number                 1690 DONE
This is round number                 1691 DONE
This is round number                 1692 DONE
This is round number                 1693 DONE
This is round number                 1694 DONE
This is round number                 1695

.. pull power cable 
.. reboot 
.. check 

n...@nexenta:/$ cat disk/testfile
This is round number                 1693

... again just one missing

.. test the SATA cable pull ....

This is round number                 1269 DONE
This is round number                 1270 DONE
This is round number                 1271 DONE
This is round number                 1272 DONE
This is round number                 1273 DONE
This is round number                 1274 DONE
This is round number                 1275 DONE
This is round number                 1276 DONE
This is round number                 1277
.. pull SATA cable (not power)

n...@nexenta:/$ cat disk/testfile
This is round number                 1276

.. this is OK
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
