Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Nico Williams
On Thu, Dec 29, 2011 at 6:44 PM, Matthew Ahrens wrote: > On Mon, Dec 12, 2011 at 11:04 PM, Erik Trimble wrote: >> (1) when constructing the stream, every time a block is read from a fileset >> (or volume), its checksum is sent to the receiving machine. The receiving >> machine then looks up that

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Matthew Ahrens
On Mon, Dec 12, 2011 at 11:04 PM, Erik Trimble wrote: > On 12/12/2011 12:23 PM, Richard Elling wrote: >> >> On Dec 11, 2011, at 2:59 PM, Mertol Ozyoney wrote: >> >>> Not exactly. What is dedup'ed is the stream only, which is in fact not >>> very >>> efficient. Real dedup aware replication is taking

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Nico Williams
On Thu, Dec 29, 2011 at 9:53 AM, Brad Diggs wrote: > Jim, > > You are spot on.  I was hoping that the writes would be close enough to > identical that > there would be a high ratio of duplicate data since I use the same record > size, page size, > compression algorithm, … etc.  However, that was

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Robert Milkowski
Reducing the record size would negatively impact performance. For the rationale, see the section titled "Match Average I/O Block Sizes" in my

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Brad Diggs
nd keep it higher after you start modifying data. From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Brad Diggs Sent: 28 December 2011 21:15 To: zfs-discuss discussion list Subject: Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup As promised

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Brad Diggs
Illumos based distros I would expect L1 ARC to grow much bigger. From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Brad Diggs Sent: 28 December 2011 21:15 To: zfs-discuss discussion list Subject: Re: [zfs-discuss] Improving L1ARC cache efficiency with

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Brad Diggs
Jim, You are spot on. I was hoping that the writes would be close enough to identical that there would be a high ratio of duplicate data since I use the same record size, page size, compression algorithm, … etc. However, that was not the case. The main thing that I wanted to prove though was that if

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Jim Klimov
Thanks for running and publishing the tests :) A comment on your testing technique follows, though. 2011-12-29 1:14, Brad Diggs wrote: As promised, here are the findings from my testing. I created 6 directory server instances ... However, once I started modifying the data of the replicated dir

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-28 Thread Nico Williams
On Wed, Dec 28, 2011 at 3:14 PM, Brad Diggs wrote: > > The two key takeaways from this exercise were as follows.  There is > tremendous caching potential > through the use of ZFS deduplication.  However, the current block level > deduplication does not > benefit directory as much as it perhaps c

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-16 Thread Robert Milkowski
> -Original Message- > From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Pawel Jakub Dawidek > Sent: 10 December 2011 14:05 > To: Mertol Ozyoney > Cc: zfs-discuss@opensolaris.org > Subject: Re: [zfs-discuss]

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-13 Thread Nico Williams
On Dec 11, 2011 5:12 AM, "Nathan Kroenert" wrote: > > On 12/11/11 01:05 AM, Pawel Jakub Dawidek wrote: >> >> On Wed, Dec 07, 2011 at 10:48:43PM +0200, Mertol Ozyoney wrote: >>> >>> Unfortunately the answer is no. Neither L1 nor L2 cache is dedup aware. >>> >>> The only vendor I know that can do th

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-13 Thread Pawel Jakub Dawidek
On Mon, Dec 12, 2011 at 08:30:56PM +0400, Jim Klimov wrote: > 2011-12-12 19:03, Pawel Jakub Dawidek wrote: > > As I said, ZFS reading path involves no dedup code. None at all. > > I am not sure if we contradicted each other ;) > > What I meant was that the ZFS reading path involves reading > logica

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-12 Thread Erik Trimble
On 12/12/2011 12:23 PM, Richard Elling wrote: On Dec 11, 2011, at 2:59 PM, Mertol Ozyoney wrote: Not exactly. What is dedup'ed is the stream only, which is in fact not very efficient. Real dedup aware replication is taking the necessary steps to avoid sending a block that exists on the other sto

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-12 Thread Richard Elling
On Dec 11, 2011, at 2:59 PM, Mertol Ozyoney wrote: > Not exactly. What is dedup'ed is the stream only, which is in fact not very > efficient. Real dedup aware replication is taking the necessary steps to > avoid sending a block that exists on the other storage system. These exist outside of ZFS (e

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-12 Thread Jim Klimov
2011-12-12 19:03, Pawel Jakub Dawidek wrote: On Sun, Dec 11, 2011 at 04:04:37PM +0400, Jim Klimov wrote: I would not be surprised to see that there is some disk IO adding delays for the second case (read of a deduped file "clone"), because you still have to determine references to this second fi

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-12 Thread Brad Diggs
Thanks everyone for your input on this thread. It sounds like there is sufficient weight behind the affirmative that I will include this methodology into my performance analysis test plan. If the performance goes well, I will share some of the results when we conclude in January/February timeframe. R

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-12 Thread Mertol Ozyoney
I am almost sure that in cache things are still hydrated. There is an outstanding RFE for this; while I am not sure, I think this feature will be implemented sooner or later. And in theory there will be little benefit, as most dedup'ed shares are used for archive purposes... PS: NetApp's do have s

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-12 Thread Mertol Ozyoney
Not exactly. What is dedup'ed is the stream only, which is in fact not very efficient. Real dedup aware replication is taking the necessary steps to avoid sending a block that exists on the other storage system.

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-12 Thread Pawel Jakub Dawidek
On Sun, Dec 11, 2011 at 04:04:37PM +0400, Jim Klimov wrote: > I would not be surprised to see that there is some disk IO > adding delays for the second case (read of a deduped file > "clone"), because you still have to determine references > to this second file's blocks, and another path of on-disk

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-11 Thread Gary Driggs
What kind of drives are we talking about? Even SATA drives are available according to application type (desktop, enterprise server, home PVR, surveillance PVR, etc). Then there are drives with SAS & Fibre Channel interfaces. Then you've got Winchester platters vs SSD vs hybrids. But even before con

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-11 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Nathan Kroenert > > That reminds me of something I have been wondering about... Why only 12x > faster? If we are effectively reading from memory - as compared to a > disk reading at approximate

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-11 Thread Jim Klimov
2011-12-11 15:10, Nathan Kroenert wrote: Hey all, That reminds me of something I have been wondering about... Why only 12x faster? If we are effectively reading from memory - as compared to a disk reading at approximately 100MB/s (which is about an average PC HDD reading sequentially), I'd have

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-11 Thread Nathan Kroenert
On 12/11/11 01:05 AM, Pawel Jakub Dawidek wrote: On Wed, Dec 07, 2011 at 10:48:43PM +0200, Mertol Ozyoney wrote: Unfortunately the answer is no. Neither L1 nor L2 cache is dedup aware. The only vendor I know that can do this is NetApp And you really work at Oracle?:) The answer is definitely

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-10 Thread Pawel Jakub Dawidek
On Wed, Dec 07, 2011 at 10:48:43PM +0200, Mertol Ozyoney wrote: > Unfortunately the answer is no. Neither L1 nor L2 cache is dedup aware. > > The only vendor I know that can do this is NetApp And you really work at Oracle?:) The answer is definitely yes. ARC caches on-disk blocks and dedup jus

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-08 Thread Mark Musante
You can see the original ARC case here: http://arc.opensolaris.org/caselog/PSARC/2009/557/20091013_lori.alt On 8 Dec 2011, at 16:41, Ian Collins wrote: > On 12/ 9/11 12:39 AM, Darren J Moffat wrote: >> On 12/07/11 20:48, Mertol Ozyoney wrote: >>> Unfortunately the answer is no. Neither L1 nor L2

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-08 Thread Ian Collins
On 12/ 9/11 12:39 AM, Darren J Moffat wrote: On 12/07/11 20:48, Mertol Ozyoney wrote: Unfortunately the answer is no. Neither L1 nor L2 cache is dedup aware. The only vendor I know that can do this is NetApp In fact, most of our functions, like replication, are not dedup aware. For example, thec

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-08 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Mertol Ozyoney > Sent: Wednesday, December 07, 2011 3:49 PM > To: Brad Diggs > Cc: zfs-discuss@opensolaris.org > Subject: Re: [zfs-discuss] Improving L1ARC cache eff

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-08 Thread Darren J Moffat
On 12/07/11 20:48, Mertol Ozyoney wrote: Unfortunately the answer is no. Neither L1 nor L2 cache is dedup aware. The only vendor I know that can do this is NetApp In fact, most of our functions, like replication, are not dedup aware. For example, technically it's possible to optimize our replic

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-07 Thread Jim Klimov
It was my understanding that both dedup and caching work at the block level. So if you have identical on-disk blocks (same original data after the same compression and encryption), they turn into one(*) on-disk block with several references from DDT. And that one block is only cached once, saving ARC space
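
The block-level behaviour described in this message can be sketched as a toy model. This is only an illustration under stated assumptions, not the ZFS implementation: `ToyPool`, its `ddt`, `disk`, and `cache` dictionaries, and the integer block addresses are all hypothetical names invented here. The write path consults a dedup table keyed by checksum, while the read path looks only at the block address, so identical blocks referenced from two files occupy a single cache entry.

```python
import hashlib


class ToyPool:
    """Toy model only (hypothetical names, not ZFS code)."""

    def __init__(self):
        self.ddt = {}        # checksum -> (block address, reference count)
        self.disk = {}       # block address -> block data
        self.cache = {}      # block address -> block data (stands in for the ARC)
        self.disk_reads = 0  # number of reads that missed the cache

    def write(self, data: bytes) -> int:
        """Store a block, reusing an existing address if the data is a duplicate."""
        checksum = hashlib.sha256(data).hexdigest()
        if checksum in self.ddt:
            addr, refs = self.ddt[checksum]
            self.ddt[checksum] = (addr, refs + 1)
            return addr
        addr = len(self.disk)
        self.disk[addr] = data
        self.ddt[checksum] = (addr, 1)
        return addr

    def read(self, addr: int) -> bytes:
        """Read by address only; the dedup table is never consulted here."""
        if addr not in self.cache:
            self.disk_reads += 1
            self.cache[addr] = self.disk[addr]
        return self.cache[addr]


pool = ToyPool()
# Two "files" with identical contents, written block by block.
file_a = [pool.write(b"x" * 4096), pool.write(b"y" * 4096)]
file_b = [pool.write(b"x" * 4096), pool.write(b"y" * 4096)]

# Read both files in full.
for addr in file_a + file_b:
    pool.read(addr)

print(len(pool.cache), pool.disk_reads)
```

In this toy, both files end up pointing at the same two block addresses, so reading the second file is served entirely from the cache populated by the first.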

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-07 Thread Mertol Ozyoney
Unfortunately the answer is no. Neither L1 nor L2 cache is dedup aware. The only vendor I know that can do this is NetApp In fact, most of our functions, like replication, are not dedup aware. However we have significant advantage that zfs keeps checksums regardless of the dedup being on and o

[zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-07 Thread Brad Diggs
Hello, I have a hypothetical question regarding ZFS deduplication. Does the L1ARC cache benefit from deduplication in the sense that the L1ARC will only need to cache one copy of the deduplicated data versus many copies? Here is an example: Imagine that I have a server with 2TB of RAM and a PB of di