On 6/15/2010 9:03 AM, Fco Javier Garcia wrote:
Data point: 90% of current computers have less than 9 GB of RAM, and fewer than 5% have SSDs.
Take a "standard" storage setup: a 4 TB pool with dedup on and a dataset using
32 KB blocks, with 2 TB of data in use. That needs on the order of 16 GB of
memory just for the DDT. But you won't see the problem until it's too late:
at first you work with the system and performance is good; little by little
write performance drops; then the system starts crashing randomly (when
automatic snapshots are deleted); and finally you discover that disabling
dedup doesn't fix it.
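The 16 GB figure above works out to about 256 bytes per DDT entry; estimates of the in-core entry size vary, with ~320 bytes often quoted, which pushes it closer to 20 GB. A quick sketch of the arithmetic (the per-entry constant is the assumption here):

```python
# Back-of-the-envelope DDT sizing for the scenario above.
BYTES_PER_DDT_ENTRY = 320  # commonly quoted in-core estimate; an assumption

def ddt_ram_bytes(unique_data_bytes, block_size):
    """RAM needed to keep the whole dedup table (DDT) in core."""
    entries = unique_data_bytes // block_size  # one DDT entry per unique block
    return entries * BYTES_PER_DDT_ENTRY

TiB = 1 << 40
GiB = 1 << 30
# 2 TB of data in 32 KB blocks -> 64M DDT entries -> ~20 GiB of RAM
print(ddt_ram_bytes(2 * TiB, 32 * 1024) / GiB)  # -> 20.0
```

Either way, the point stands: the DDT for a few TB of small-block data dwarfs the RAM of a typical desktop.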
You can argue that dedup simply has requirements, and that is true. But it is
also true that even on systems with large amounts of RAM (by the usual
standards), routine operations such as deleting files or destroying
datasets/snapshots cause a serious drop in performance, even a total system
hang, and that is not acceptable. So perhaps dedup should be frozen (kept in
beta/development status) until there is a stable version, so that any
necessary changes can be made in the core of ZFS to allow its use without
compromising the integrity of the entire system (e.g. making block freeing
multithreaded).
And what can we do with a system already "contaminated" by dedup?
1. Disable automatic snapshots.
2. Create a new dataset without dedup and copy the data into it.
3. After copying the data, delete the snapshots, smallest first. If some
snapshot is large (more than 10 GB), do a progressive rollback to it first
(so the snapshot ends up using 0 bytes) and then delete it.
4. When the dataset has no snapshots left, slowly remove all the files, in
batches.
5. Finally, when no files remain, destroy the dataset.
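The steps above can be sketched as a script. This is a dry run that only prints each command (drop the `step` wrapper to execute for real); the pool/dataset/snapshot names ("tank/deduped", "tank/clean", "@biggest") are hypothetical:

```shell
# Print each step instead of executing it.
step() { echo "$@"; }

OLD=tank/deduped   # hypothetical dataset still using dedup
NEW=tank/clean     # hypothetical replacement dataset

step zfs set com.sun:auto-snapshot=false "$OLD"          # 1. stop auto snapshots
step zfs create -o dedup=off "$NEW"                      # 2. new dataset, dedup off
step rsync -a "/$OLD/" "/$NEW/"                          #    copy the data across
step zfs list -H -t snapshot -o name -s used -r "$OLD"   # 3. find snapshots, smallest first
step zfs rollback -r "$OLD@biggest"                      #    big snapshot -> 0 bytes used
step zfs destroy "$OLD@biggest"                          #    now safe to destroy it
# 4. delete the remaining files in small batches, and only then:
step zfs destroy "$OLD"                                  # 5. destroy the emptied dataset
```

The `zfs`/`rsync` invocations follow the poster's procedure; destroying snapshots one at a time (never a recursive destroy of the whole chain) is the point of the exercise.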
If we skip any of these steps (and, say, directly try to delete a snapshot of
95 GB), the system will crash. And if we try to destroy the dataset and the
system crashes, restarting the machine will crash it again, since the pending
destroy keeps trying to resume.
My test system: AMD Athlon X2 5400, 8 GB RAM, 3 TB RAIDZ, 1.7 TB dataset,
87 GB snapshot. Tested with OpenSolaris build 134, EON 0.6, Nexenta Core 3.02
and NexentaStor Enterprise 3.02: all of them froze when trying to delete
snapshots. Using rollback I was finally able to delete all the snapshots, but
when trying to destroy the dataset, the system is still processing the
operation (after 20 hours...).
Frankly, dedup isn't practical for anything but enterprise-class
machines. It's certainly not practical for desktops or anything remotely
low-end.
This isn't just a ZFS issue - all implementations I've seen so far
require enterprise-class solutions.
Realistically, I think people are overly enamored with dedup as a
feature. I would generally only consider it worthwhile in cases where
you get significant savings, and by significant I mean an order of
magnitude in space. A 2x saving isn't really enough to counteract
the downsides, especially when even enterprise disk space is
(relatively) cheap.
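A rough break-even check of that claim: the RAM the DDT needs works against the disk dedup saves. All the prices below are illustrative placeholders, and the DDT size reuses the ~320 bytes/entry, 32 KB-block assumption from earlier in the thread:

```python
# Does dedup's RAM cost exceed the disk it saves? Prices are assumptions.
DISK_PER_TB = 100.0    # USD per TB of disk (illustrative)
RAM_PER_GB = 30.0      # USD per GB of server RAM (illustrative)
DDT_GB_PER_TB = 10.0   # ~320 B/entry at 32 KB blocks -> 10 GiB DDT per TiB unique

def net_saving_per_tb(ratio):
    """Net saving per TB of logical data stored, at a given dedup ratio."""
    disk_saved = DISK_PER_TB * (1 - 1 / ratio)
    ram_gb = DDT_GB_PER_TB / ratio  # DDT holds one entry per *unique* block
    return disk_saved - ram_gb * RAM_PER_GB

print(net_saving_per_tb(2))    # 2x: negative at these prices
print(net_saving_per_tb(10))   # 10x (order of magnitude): positive
```

At these (made-up but plausible-for-2010) prices, 2x dedup loses money outright, while only something near an order of magnitude comes out ahead, which matches the argument above.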
That all said, ZFS dedup is still definitely beta. There are known
severe bugs and performance issues which will take time to fix, as not
all of them have obvious solutions. Given current schedules, I predict
that it should be production-ready some time in 2011. *When* in 2011, I
couldn't hazard...
Maybe time to make Solaris 10 Update 12 or so? <grin>
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss