> Actually it does if you have compression turned on and the blocks
> compress away to 0 bytes.
>
> See
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/zio.c#zio_write_bp_init
>
> Specifically line 1005:
>
> 1005 if (psize == 0) {
> 1006
Hi,
> I don't know about the rest of your test, but writing zeroes to a ZFS
> filesystem is probably not a very good test, because ZFS recognizes
> these blocks of zeroes and doesn't actually write anything. Unless
> maybe encryption is on, but maybe not even then.
Not true. If I want ZFS
Hello list,
I wanted to test deduplication a little and did an experiment.
My question was: can I dedupe infinitely, or is there an upper limit?
So for that I did a very basic test (roughly sketched below).
- I created a ramdisk pool (1 GB)
- enabled dedup and
- wrote zeros to it (in one single file) until an error is r
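A minimal sketch of such a setup, assuming the Solaris ramdiskadm tool; the device, pool and file names are made up:

  # create a 1 GB ramdisk and build a dedup-enabled pool on top of it
  ramdiskadm -a rdtest 1g
  zpool create ramtank /dev/ramdisk/rdtest
  zfs set dedup=on ramtank
  # write zeroes into a single file until the pool runs out of space
  dd if=/dev/zero of=/ramtank/zeros.bin bs=1024k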
Hello ZFS gurus on the list :)
I started with ZFS approx. 1.something years ago and I have been following the discussions
here for some time now. What confused me all the time is the different
parameters and ZFS tunables and how they affect data integrity and
availability.
Now I took some time and tried
Hello,
probably a lot of people have done this; now it's my turn. I wanted to test the
performance of Comstar over 8 GB FC. My idea was to create a pool from a
ramdisk, a thin-provisioned zvol over it, and do some benchmarks.
However, performance is worse than to the disk backend. So I measured
Hello,
I see strange behaviour when qualifying disk drives for ZFS. The tests I want
to run should make sure that the drives honour the cache flush command. For
this I do the following (roughly sketched below):
1) Create single-disk pools (only one disk in the pool)
2) Perform I/O on the pools
This is done via SQLite an
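A minimal sketch of this kind of qualification run, assuming the sqlite3 CLI is available; the device, pool and table names are made up, and the power-cut step is manual:

  # one single-disk pool per drive under test
  zpool create qual1 c2t0d0
  # generate small transactions; each INSERT commits and forces a synchronous write,
  # which ZFS turns into a cache flush on the drive
  sqlite3 /qual1/test.db 'CREATE TABLE t(id INTEGER, payload TEXT);'
  while true; do
      sqlite3 /qual1/test.db "INSERT INTO t VALUES($RANDOM, hex(randomblob(256)));"
  done
  # cut power to the drive, then re-import the pool and verify that every
  # committed row is still present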
Everything that has reached the storage will be written to disk as sent.
However, watch out for the writeback cache setting of Comstar. If you enable a
writeback cache AND your machine boots very fast (< 2 minutes), you may have
data integrity issues because Windows "thinks" the target was just s
I have to come back to this issue after a while because it just hit me.
I have a VMware vSphere 4 test host. I have various machines in there to do
tests for performance and other stuff. So a lot of I/O benchmarks are done and
a lot of data is created during these benchmarks.
The vSphere test ma
Hello,
thanks for the feedback and sorry for the delay in answering.
I checked the log and fmadm. It seems the log does not show any changes,
however fmadm shows:
Apr 23 2010 18:32:26.363495457 ereport.io.scsi.cmd.disk.dev.rqs.derr
Apr 23 2010 18:32:26.363482031 ereport.io.scsi.cmd.disk.recov
I was going through this posting and it seems that there is some "personal
tension" :).
However, going back to the technical problem of scrubbing a 200 TB pool, I think
this issue needs to be addressed.
One warning up front: this writing is rather long, and if you'd like to jump to
the part dealin
Hello list,
a pool shows some strange status:

  volume: zfs01vol
   state: ONLINE
   scrub: scrub completed after 1h21m with 0 errors on Sat Apr 24 04:22:38 2010
  config:

        NAME        STATE     READ WRITE CKSUM
        zfs01vol    ONLINE       0     0     0
          mirror    ONLINE
Hello list,
I started playing around with Comstar in snv_134. In the snv_116 version of ZFS, a
new hidden property for the Comstar metadata was introduced (stmf_sbd_lu).
This makes it possible to migrate from the legacy iSCSI target daemon to Comstar
without data loss, which is great.
Before th
> I guess it will then
> remain a mystery how did this happen, since I'm very
> careful when engaging the commands and I'm sure that
> I didn't miss the "raidz" parameter.
You can be sure by calling "zpool history".
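For example (the pool name here is just a placeholder):

  # list every zfs/zpool command that was ever run against the pool
  zpool history tank
  # -l adds user/host/zone, -i includes internally logged events
  zpool history -il tank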
Robert
Hello,
I wanted to know if there are any updates on this topic?
Regards,
Robert
Hello list,
when consolidating storage services, it may be required to prioritize I/O.
E.g. the important SAP database gets all the I/O we can deliver and that it
needs. The test systems should use what's left.
While this is a difficult topic in disk-based systems (even little I/O with
long h
Hello list,
it is damn difficult to destroy ZFS labels :)
I am trying to remove the vdev labels of disks that were used in a pool before. According to
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf
I created a script that removes the first 512 KB and the last 512 KB, h
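Roughly what such a script boils down to (the device name is an example and the device size lookup is left as an assumption); per the on-disk format document, two labels sit at the front and two at the end of the device:

  DISK=/dev/rdsk/c2t1d0p0          # example whole-disk device; adjust to your system
  # wipe the first 512 KB (labels L0 and L1)
  dd if=/dev/zero of=$DISK bs=1024 count=512
  # wipe the last 512 KB (labels L2 and L3); DISK_KB is the device size in KB,
  # e.g. derived from prtvtoc or format output
  dd if=/dev/zero of=$DISK bs=1024 seek=$((DISK_KB - 512)) count=512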
> with the Intel product...but save a few more pennies
> up and get the X-25M. The extra boost on read and
> write performance is worth it.
Or use multiple X25-V drives (the L2ARC is not filled fast anyhow, so write speed does not
matter). You can get 4 of them for the price of one 160 GB X25-M. With 4 X25-V you get ~500 MB/s.
I use the Intel X25-V and I like it :)
Actually I have 2 in a striped setup:
40 MB/s write (just enough for ZIL filling),
something like 130 MB/s reads. Just enough.
This would be an idea and I thought about this. However, I see the following
problems:
1) using deduplication
This will reduce the on-disk size, however the DDT will grow forever, and for the
deletion of zvols this will mean a lot of time and work (see other threads
regarding DDT memory issues o
Hello list,
ZFS can be used for both file-level (zfs) and block-level access (zvol). When
using zvols, those are always thin provisioned (space is allocated on first
write). We use zvols with Comstar to do iSCSI and FC access - and excuse me in
advance - but this may also be a more Comstar relat
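For context, a minimal sketch of how such a zvol ends up behind Comstar (pool name, volume name and size are made up; the GUID placeholder comes from the sbdadm output):

  # create a sparse (thin-provisioned) zvol
  zfs create -s -V 100g tank/lun0
  # register it as a COMSTAR logical unit and expose it
  sbdadm create-lu /dev/zvol/rdsk/tank/lun0
  stmfadm add-view <GUID-from-sbdadm-output>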
I fully agree. This needs fixing. I can think of so many situations where
device names change in OpenSolaris (especially with movable pools). This
problem can lead to serious data corruption.
Besides persistent L2ARC (which is much more difficult, I would say) - making the
L2ARC also rely on label
Hello list,
being a Linux guy, I'm actually quite new to OpenSolaris. One thing I miss is
udev. I found that when using SATA disks with ZFS, it always required manual
intervention (cfgadm) to do SATA hot plug.
I would like to automate the disk replacement, so that it is a fully automatic
p
That sounds very interesting and to me it sounds like a bug.
I expect this is what happened:
1) Create a zvol with metadata in the first 64k (so data starts at 64k)
2) Updated to 124
3) Comstar probably still uses the first 64k of the zvol, not migrating the
data off (so data starts at 64k)
4
> Only with the zdb(1M) tool but note that the
> checksums are NOT of files
> but of the ZFS blocks.
Thanks - blocks, right (doh) - that's what I was missing. Damn, it would be so
nice :(
Hello,
an idea popped into my mind while talking about security and intrusion
detection.
Host-based intrusion detection may use checksumming for file change tracking. It works like
this:
once installed and knowing the software is "OK", a baseline is created.
Then in every check, verify the current status
Hello, I have a strange issue.
I have a setup with a 24-disk enclosure connected with an LSI3801-R. I created
two pools. The pools have 24 healthy disks.
I disabled LUN persistency on the LSI adapter.
When I cold boot the server (power off by pulling all power cables), a warning
is shown on the co
When you send data (as the mate did), all data is rewritten and the settings
you made (dedup etc.) are effectively applied.
If you change a parameter (dedup, compression), this only holds true for NEWLY
written data. If you do not change the data, all data is still duped.
Also when you send, all
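A small hedged sketch of the send/receive rewrite (dataset names are examples); the received copy is written fresh, so the current dedup/compression settings apply to every block:

  # settings are inherited from the parent dataset at receive time
  zfs set dedup=on tank
  zfs snapshot tank/data@move
  zfs send tank/data@move | zfs receive tank/data_rewritten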
> > Created a pool on head1 containing just the cache
> > device (c0t0d0).
>
> This is not possible, unless there is a bug. You cannot create a pool
> with only a cache device. I have verified this on b131:
> # zpool create norealpool cache /dev/ramdisk/rc1
> 1
> invalid vd
I tested some more and found that the pool disks are picked up.
Head1: Cachedevice1 (c0t0d0)
Head2: Cachedevice2 (c0t0d0)
Pool: Shared, c1td
I created a pool on shared storage.
Added the cache device on Head1.
Switched the pool to Head2 (export + import).
Created a pool on head1 containing just t
Also, write performance may drop because of the write cache being disabled:
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pools
Just a hint, I have not tested this.
Robert
While thinking about ZFS as the next generation filesystem without limits I am
wondering if the real world is ready for this kind of incredible technology ...
I'm actually speaking of hardware :)
ZFS can handle a lot of devices. Once in the import bug
(http://bugs.opensolaris.org/bugdatabase/v
No picture, but something like this:
http://www.provantage.com/supermicro-aoc-smp-lsiss9252~7SUP91MC.htm ?
Yes, here it is (performance is VMware on a laptop, so sorry for that).
How did I test?
1) My disks:

  LUN     ID   DeviceType  Size      Volume  Mounted  Remov  Attach
  c0t0d0  sd4  cdrom       No Media          no       yes    ata
  c1t0d0  sd0  disk        8G
Actually I tested this.
If I add an L2ARC device to the syspool, it is not used when issuing I/O to the
data pool (note: on the root pool it must not be a whole disk, but only a slice of
it, otherwise ZFS complains that root disks may not contain an EFI label).
So this does not work - unfortunately.
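For reference, a hedged sketch of the two variants just described (device names are examples); the root pool only accepts a slice, while a data pool takes the whole disk:

  # data pool: whole disk as cache (L2ARC) device
  zpool add tank cache c2t0d0
  # root pool: only a slice is accepted
  zpool add rpool cache c2t0d0s0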
Some very interesting insights on the availability calculations:
http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl
For streaming also look at:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6732803
Regards,
Robert
One problem with the write cache is that I do not know if it is needed for
wear leveling of writes?
As mentioned, disabling the write cache might be OK in terms of performance (I
want to use MLC SSDs as data disks, not as ZIL, to have an SSD-only appliance -
I'm looking for read speed for dedup, zfs send
Hello,
I'm testing with snv_131 (NexentaCore 3 alpha 4). I did a bonnie benchmark on
my disks and pulled a disk while benchmarking. Everything went smoothly, however
I found that the now degraded device is excluded from the writes.
So this is my pool after I have pulled the disk
pool: m
Is there a way (besides format and causing heavy I/O on the device in question)
to identify a drive? Is there some kind of SES (enclosure service) for this?
(e.g. "and now let the red LED blink")
Regards,
Robert
I have followed this list/forum for some weeks now. During this time the "I tried
dedup but on delete it hangs" topic popped up several times. I know that other
forums use sticky threads to have a kind of "in-band FAQ". The dedup issue
would be a candidate, I guess.
I think this kind of feature would be
Thanks for the feedback, Richard.
Does that mean that the L2ARC can be part of ANY pool and that there is only
ONE L2ARC for all pools active on the machine?
Thesis:
- There is one L2ARC on the machine for all pools
- all active pools share the same L2ARC
- the L2ARC can be part of any
Hi,
I found some time and was able to test again:
- verify with the unique UID of the device
- verify with autoreplace = off
Indeed, autoreplace was set to on for the pools. So I disabled the autoreplace.

  VOL     PROPERTY     VALUE  SOURCE
  nxvol2  autoreplace  off    default

Era
Actually I found some time (and a reason) to test this.
Environment:
- 1 OSol server
- one SLES10 iSCSI target
- two LUNs exported via iSCSI to the OSol server
I did some resilver tests to see how ZFS resilvers devices.
Prep:
osol: create a pool (myiscsi) with one mirror pair made from the
Hello,
we tested clustering with ZFS and the setup looks like this:
- 2 head nodes (nodea, nodeb)
- head nodes contain L2ARC devices (nodea_l2arc, nodeb_l2arc)
- two external JBODs
- two mirrored zpools (pool1, pool2)
- each mirror is a mirror of one disk from each JBOD
- no ZIL (does anyone know a
The on-disk layout is shown here:
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf
You can use the name/value pairs in the vdev label (I guess). Unfortunately, I
do not know of any scripts.
Actually, for the ZIL you may use the ACard device (memory SATA disk + BBU + compact
flash write-out).
For the data disks there is no solution yet - it would be nice. However, I prefer
the "supercapacitor on disk" method.
Why? Because the recharge logic is challenging. There needs to be
communication
> > .. however ... a lot of snaps still have an impact on system performance.
> > After the import of the 1 snaps volume, I saw "devfsadm" eating up all CPU:
>
> If you are snapshotting ZFS volumes, then each will create an entry in the
> device tree. In other words, if these were file
> sys
Since you mention the fixed bugs, I have a more general question.
Is there a way to see all commits to OSOL that are related to a bug report?
Background: I'm interested in how e.g. the zfs import bug was fixed.
Ok, I tested this myself ...
(same hardware used for both tests)
OpenSolaris snv_104 (actually Nexenta Core 2):
100 snaps:
r...@nexenta:/volumes# time for i in $(seq 1 100); do zfs snapshot
ssd/v...@test1_$i; done

real    0m24.991s
user    0m0.297s
sys     0m0.679s

Import:
r...@
Maybe it got lost in this much text :) .. thus this re-post.
Does anyone know the impact of disabling the write cache on the write
amplification factor of the Intel SSDs?
How can I permanently disable the write cache on the Intel X25-M SSDs?
Thanks, Robert
Actually, the performance decrease when disabling the write cache on the SSD is
approx. 3x (aka 66%).
Setup:
node1 = Linux client with open-iscsi
server = comstar (cache=write through) + zvol (recordsize=8k, compression=off)
--- with SSD disk write cache disabled:
node1:/mnt/ssd# iozone -
I managed to disable the write cache (I did not know a tool on Solaris, however
hdadm from the EON NAS binary_kit does the job):
Same power disruption test with a Seagate HDD and write cache disabled ...
---
r...@nex
A very interesting thread
(http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/)
and some thinking about the design of SSDs led to an experiment I did with
the Intel X25-M SSD. The question was:
Is my data safe once it has reached the di
I finally managed to resolve this. I received some useful info from Richard
Elling (without list CC):
>> (ME) However, I still think the plain IDE driver also needs a timeout to
>> handle disk failures, because cables etc. can fail.
> (Richard) Yes, this is a little bit odd. The sd driver should be
Depends.
a) Pool design
5 x SSD as RAID-Z = the space of 4 SSDs, with the read I/O performance of one drive.
Adding 5 cheap 40 GB L2ARC devices (which are pooled) increases the read
performance for your working window of 200 GB.
If you have a pool of mirrors, adding L2ARC does not make sense.
b) SSD type
Is yo
See the reads on the pool with the low I/O? I suspect reading the DDT causes
the writes to slow down.
See this bug:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913566. It seems to
give some background.
Can you test setting "primarycache=metadata" on the volume you test?
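For example (the volume name is made up), a quick way to try this and check the result:

  # cache only this volume's metadata in the ARC, so streaming data does not
  # evict cached metadata such as the DDT
  zfs set primarycache=metadata tank/testvol
  zfs get primarycache,secondarycache tank/testvol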
Ok, after browsing I found that the SATA disks are not shown via cfgadm.
I found http://opensolaris.org/jive/message.jspa?messageID=287791&tstart=0
which states that you have to set the mode to "AHCI" to enable hot-plug etc.
However, I still think the plain IDE driver also needs a timeout to han
Ok,
I now waited 30 minutes - still hung. After that I pulled the SATA cable to the
L2ARC device as well - still no success (I waited 10 minutes).
After 10 minutes I put the L2ARC device back (SATA + power).
20 seconds after that, the system continued to run.
dmesg shows:
Jan 8 15:41:57 nexe
Hello,
today I wanted to test that the failure of the L2ARC device is not crucial to
the pool. I added an Intel X25-M Postville (160GB) as cache device to a 54-disk
mirror pool. Then I started a sync iozone on the pool:
iozone -ec -r 32k -s 2048m -l 2 -i 0 -i 2 -o
Pool:
pool
mirror-0
You could use NexentaStor (www.nexenta.com), which is a commercial storage
appliance; it is based on OpenSolaris, but it is not just a package to
install.
Also there is EON NAS (www.genunix.org).
Snapshots do not impact write performance. Deletion of snapshots also seems to
be a linear operation (time taken = number of snapshots x some constant time).
However, see http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6761786.
When importing a pool with many snapshots (which happens dur
Hello,
I'm thinking about a setup that looks like this:
- 2 head nodes with FC connectivity (OpenSolaris)
- 2 backend FC storages (disk shelves with RAID controllers presenting a huge
15 TB RAID5)
- 2 datacenters (distance 1 km with dark fibre)
- one head node and one storage in each data cente