You can treat a hash function as an idealized one, but actual
hash functions aren't ideal. There may well be as-yet-undiscovered ranges
of input bit patterns where there's a large density of collisions in some
hash function, and indeed, since our hash functions aren't ideal,
there must be. We just d
On Jul 11, 2012, at 12:06 PM, Sašo Kiselkov wrote:
>> I say, in fact, that the total number of unique patterns that can exist on
>> any pool is small compared to the total, illustrating that I understand how
>> the key space for the algorithm is small when looking at a ZFS pool, and
>> thus co
On Wed, Jul 11, 2012 at 7:48 AM, Casper Dik wrote:
> Dan Brown seems to think so in "Digital Fortress" but it just means he
> has no grasp on "big numbers".
>
Or little else, for that matter. I seem to recall one character in the book
who would routinely slide under a "mainframe" on his back as
On Jul 11, 2012, at 1:06 PM, Bill Sommerfeld wrote:
> on a somewhat less serious note, perhaps zfs dedup should contain "Chinese
> lottery" code (see http://tools.ietf.org/html/rfc3607 for one explanation)
> which asks the sysadmin to report a detected sha-256 collision to
> eprint.iacr.org or the
On 07/11/2012 10:06 PM, Bill Sommerfeld wrote:
> On 07/11/12 02:10, Sašo Kiselkov wrote:
>> Oh jeez, I can't remember how many times this flame war has been going
>> on on this list. Here's the gist: SHA-256 (or any good hash) produces a
>> near uniform random distribution of output. Thus, the chan
Bob Friesenhahn wrote:
> On Wed, 11 Jul 2012, Eugen Leitl wrote:
> >
> > It would be interesting to see when zpool versions >28 will
> > be available in the open forks. Particularly encryption is
> > a very useful functionality.
>
> Illumos advanced to zpool version 5000 and this is available in
On 07/11/12 02:10, Sašo Kiselkov wrote:
> Oh jeez, I can't remember how many times this flame war has been going
> on on this list. Here's the gist: SHA-256 (or any good hash) produces a
> near uniform random distribution of output. Thus, the chances of getting
> a random hash collision are around
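For the arithmetic behind that claim, here is a minimal Python sketch of the
standard birthday approximation; the pool size below is an assumed example,
not a figure from the thread:

    # Probability of any collision among n uniformly random b-bit hashes,
    # by the birthday approximation p ~ 1 - exp(-n^2 / 2^(b+1)).
    import math

    def collision_probability(n_blocks, hash_bits=256):
        return -math.expm1(-n_blocks**2 / 2.0**(hash_bits + 1))

    # ~2.5e13 unique 4K blocks would fill a 100 PB pool:
    print(collision_probability(2.5e13))   # on the order of 1e-51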
On 7/11/2012 3:16 PM, Bob Friesenhahn wrote:
On Wed, 11 Jul 2012, Hung-Sheng Tsao (LaoTsao) Ph.D wrote:
Not correct as far as I can tell. You should re-read the page you
referenced. Oracle rescinded (or lost) the special Studio releases
needed to build the OpenSolaris kernel.
you can stil
On 07/ 9/12 04:36 PM, Ian Collins wrote:
On 07/10/12 05:26 AM, Brian Wilson wrote:
Yep, thanks, and to answer Ian with more detail on what TruCopy does.
TruCopy mirrors between the two storage arrays, with software running on
the arrays, and keeps a list of dirty/changed 'tracks' while the mirro
On Wed, 11 Jul 2012, Hung-Sheng Tsao (LaoTsao) Ph.D wrote:
Not correct as far as I can tell. You should re-read the page you referenced.
Oracle rescinded (or lost) the special Studio releases needed to build the
OpenSolaris kernel.
you can still download 12, 12.1, 12.2, AFAIK through OTN
Th
On Wed, Jul 11, 2012 at 3:45 AM, Sašo Kiselkov wrote:
> It's also possible to set "dedup=verify" with "checksum=sha256",
> however, that makes little sense (as the chances of getting a random
> hash collision are essentially nil).
IMO dedup should always verify.
Nico
--
Sent from my iPad
On Jul 11, 2012, at 13:11, Bob Friesenhahn wrote:
> On Wed, 11 Jul 2012, Richard Elling wrote:
>> The last studio release suitable for building OpenSolaris is available in
>> the repo.
>> See the instructions at
>> http://wiki.illumos.org/display/illumos/How+To+Build+illumo
On Jul 11, 2012, at 10:23 AM, Sašo Kiselkov wrote:
> Hi Richard,
>
> On 07/11/2012 06:58 PM, Richard Elling wrote:
>> Thanks Sašo!
>> Comments below...
>>
>> On Jul 10, 2012, at 4:56 PM, Sašo Kiselkov wrote:
>>
>>> Hi guys,
>>>
>>> I'm contemplating implementing a new fast hash algorithm in Il
On Jul 11, 2012, at 10:11 AM, Bob Friesenhahn wrote:
> On Wed, 11 Jul 2012, Richard Elling wrote:
>> The last studio release suitable for building OpenSolaris is available in
>> the repo.
>> See the instructions at
>> http://wiki.illumos.org/display/illumos/How+To+Build+illumos
>
> Not correct a
On Wed, 11 Jul 2012, Richard Elling wrote:
The last studio release suitable for building OpenSolaris is available in the
repo.
See the instructions
at http://wiki.illumos.org/display/illumos/How+To+Build+illumos
Not correct as far as I can tell. You should re-read the page you
referenced.
On 07/11/2012 06:23 PM, Gregg Wonderly wrote:
> What I'm saying is that I am getting conflicting information from your
> rebuttals here.
Well, let's address that then:
> I (and others) say there will be collisions that will cause data loss if
> verify is off.
Saying that "there will be" withou
Thanks Sašo!
Comments below...
On Jul 10, 2012, at 4:56 PM, Sašo Kiselkov wrote:
> Hi guys,
>
> I'm contemplating implementing a new fast hash algorithm in Illumos' ZFS
> implementation to supplant the currently utilized sha256.
No need to supplant, there are 8 bits for enumerating hash algorit
On Wed, 11 Jul 2012, Sašo Kiselkov wrote:
For example, the well-known block might be part of a Windows anti-virus
package, or a Windows firewall configuration, and corrupting it might
leave a Windows VM open to malware attack.
True, but that may not be enough to produce a practical collision f
On Wed, July 11, 2012 11:58, Gregg Wonderly wrote:
> You're entirely sure that there could never be two different blocks that
> can hash to the same value and have different content?
[...]
The odds of being hit by lightning (at least in the US) are about 1 in
700,000. I don't worry about that happe
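To put the two odds side by side, a hedged back-of-the-envelope in Python
(the 2^35-block pool, roughly 4 PiB of 128K records, is an assumed example):

    # Annual odds of a lightning strike vs. the birthday bound for a
    # random sha256 collision across 2^35 unique blocks.
    lightning = 1 / 700_000
    collision = (2**35) ** 2 / 2.0**257    # n^2 / 2^(bits+1)
    print(collision)                       # ~5e-57
    print(lightning / collision)           # lightning ahead by ~50 orders of magnitude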
What I'm saying is that I am getting conflicting information from your
rebuttals here.
I (and others) say there will be collisions that will cause data loss if verify
is off.
You say it would be so rare as to be impossible from your perspective.
Tomas says, well then let's just use the hash value
On 07/11/2012 05:58 PM, Gregg Wonderly wrote:
> You're entirely sure that there could never be two different blocks that can
> hash to the same value and have different content?
>
> Wow, can you just send me the cash now and we'll call it even?
You're the one making the positive claim and I'm ca
You're entirely sure that there could never be two different blocks that can
hash to the same value and have different content?
Wow, can you just send me the cash now and we'll call it even?
Gregg
On Jul 11, 2012, at 9:59 AM, Sašo Kiselkov wrote:
> On 07/11/2012 04:56 PM, Gregg Wonderly wrote:
On Wed, 11 Jul 2012, Eugen Leitl wrote:
It would be interesting to see when zpool versions >28 will
be available in the open forks. Particularly encryption is
a very useful functionality.
Illumos advanced to zpool version 5000 and this is available in the
latest OpenIndiana development releas
On 07/11/2012 05:33 PM, Bob Friesenhahn wrote:
> On Wed, 11 Jul 2012, Sašo Kiselkov wrote:
>>
>> The reason why I don't think this can be used to implement a practical
>> attack is that in order to generate a collision, you first have to know
>> the disk block that you want to create a collision on
>On Wed, Jul 11, 2012 at 9:48 AM, wrote:
>>>Huge space, but still finite…
>>
>> Dan Brown seems to think so in "Digital Fortress" but it just means he
>> has no grasp on "big numbers".
>
>I couldn't get past that. I had to put the book down. I'm guessing
>it was as awful as it threatened to
On Wed, Jul 11, 2012 at 9:48 AM, wrote:
>>Huge space, but still finite…
>
> Dan Brown seems to think so in "Digital Fortress" but it just means he
> has no grasp on "big numbers".
I couldn't get past that. I had to put the book down. I'm guessing
it was as awful as it threatened to be.
IMO,
On Wed, 11 Jul 2012, Sašo Kiselkov wrote:
The reason why I don't think this can be used to implement a practical
attack is that in order to generate a collision, you first have to know
the disk block that you want to create a collision on (or at least the
checksum), i.e. the original block is al
On Wed, July 11, 2012 10:23, casper@oracle.com wrote:
> I think that I/O isn't getting as fast as CPU is; memory capacity and
> bandwidth and CPUs are getting faster. I/O, not so much.
> (Apart from the one single step from harddisk to SSD; but note that
> I/O is limited to standard interfaces
On 07/11/2012 05:10 PM, David Magda wrote:
> On Wed, July 11, 2012 09:45, Sašo Kiselkov wrote:
>
>> I'm not convinced waiting makes much sense. The SHA-3 standardization
>> process' goals are different from "ours". SHA-3 can choose to go with
>> something that's slower, but has a higher security m
On Wed, July 11, 2012 09:45, Sašo Kiselkov wrote:
> I'm not convinced waiting makes much sense. The SHA-3 standardization
> process' goals are different from "ours". SHA-3 can choose to go with
> something that's slower, but has a higher security margin. I think that
> absolute super-tight securit
I'm just suggesting that the "time frame" of when 256-bits or 512-bits is less
safe is closing faster than one might actually think, because social elements
of the internet allow a lot more effort to be focused on a single "problem"
than one might consider.
Gregg Wonderly
On Jul 11, 2012, a
On 07/11/2012 04:56 PM, Gregg Wonderly wrote:
> So, if I had a block collision on my ZFS pool that used dedup, and it had my
> bank balance of $3,212.20 on it, and you tried to write your bank balance of
> $3,292,218.84 and got the same hash, no verify, and thus you got my
> block/balance and no
On 07/11/2012 04:54 PM, Ferenc-Levente Juhos wrote:
> You don't have to store all hash values:
> a. Just memorize the first one SHA256(0)
> b. start counting
> c. bang: by the time you get to 2^256 you get at least a collision.
Just one question: how long do you expect this is going to take on averag
So, if I had a block collision on my ZFS pool that used dedup, and it had my
bank balance of $3,212.20 on it, and you tried to write your bank balance of
$3,292,218.84 and got the same hash, no verify, and thus you got my
block/balance and now your bank balance was reduced by 3 orders of magnitu
You don't have to store all hash values:
a. Just memorize the first one SHA256(0)
b. start counting
c. bang: by the time you get to 2^256 you get at least a collision.
(do this using BOINC, you don't have to wait for the last hash to be
calculated, I'm pretty sure a collision will occur sooner)
1.
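A literal Python sketch of this counting argument follows; it is
pigeonhole-correct, but the expected ~2^128 iterations (and the memory for
the table) make it purely illustrative:

    # Hash successive integers until a digest repeats. Guaranteed to stop
    # before 2^256 + 1 inputs by the pigeonhole principle -- and guaranteed
    # never to finish on real hardware.
    import hashlib

    seen = {}
    i = 0
    while True:
        d = hashlib.sha256(i.to_bytes(32, 'big')).digest()
        if d in seen:
            print('collision between inputs', seen[d], 'and', i)
            break
        seen[d] = i
        i += 1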
>Do you need assurances that in the next 5 seconds a meteorite won't fall
>to Earth and crush you? No. And yet, the Earth puts on thousands of tons
>of weight each year from meteoric bombardment and people have been hit
>and killed by them (not to speak of mass extinction events). Nobody has
>eve
> From: Gregg Wonderly [mailto:gr...@wonderly.org]
> Sent: Wednesday, July 11, 2012 10:28 AM
>
> Unfortunately, the government imagines that people are using their home
> computers to compute hashes and try and decrypt stuff. Look at what is
> happening with GPUs these days.
heheheh. I guess
Yes, but from the other angle, the number of unique 128K blocks that you can
store on your ZFS pool is actually finitely small, compared to the total
space. So the patterns you need to actually consider are not more than the
physical limits of the universe.
Gregg Wonderly
On Jul 11, 2012, at
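A rough Python rendering of the two quantities being compared here (the
100 PiB pool is an assumed example):

    # All possible 128 KiB blocks vs. blocks an actual pool can hold.
    import math

    patterns = 2 ** (128 * 1024 * 8)              # every possible 128 KiB block
    pool_blocks = (100 * 2**50) // (128 * 1024)   # 100 PiB pool, 128 KiB records
    print(math.log2(patterns))      # 1048576.0 -- a 2^1048576 pattern space
    print(math.log2(pool_blocks))   # ~39.6 -- vanishingly small next to it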
>Unfortunately, the government imagines that people are using their home
>computers to compute hashes and try and decrypt stuff. Look at what is
>happening with GPUs these days. People are hooking up 4 GPUs in their
>computers and getting huge performance gains. 5-6 char password space co
On 07/11/2012 04:39 PM, Ferenc-Levente Juhos wrote:
> As I said several times before, to produce hash collisions, or to calculate
> rainbow tables (as a previous user theorized), you only need the
> following.
>
> You don't need to reproduce all possible blocks.
> 1. SHA256 produces a 256 bit ha
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Gregg Wonderly
>
> But this is precisely the kind of "observation" that some people seem to miss
> out on the importance of. As Tomas suggested in his post, if this was true,
> then we could h
On 07/11/2012 04:36 PM, Justin Stringfellow wrote:
>
>
>> Since there is a finite number of bit patterns per block, have you tried to
>> just calculate the SHA-256 or SHA-512 for every possible bit pattern to see
>> if there is ever a collision? If you found an algorithm that produced no
>> c
On 07/11/2012 04:30 PM, Gregg Wonderly wrote:
> This is exactly the issue for me. It's vital to always have verify on. If
> you don't have the data to prove that every possible block combination
> hashes uniquely for the "small" bit space we are talking about,
> then how in the world
On 07/11/2012 04:27 PM, Gregg Wonderly wrote:
> Unfortunately, the government imagines that people are using their home
> computers to compute hashes and try and decrypt stuff. Look at what is
> happening with GPUs these days. People are hooking up 4 GPUs in their
> computers and getting huge
As I said several times before, to produce hash collisions, or to calculate
rainbow tables (as a previous user theorized), you only need the
following.
You don't need to reproduce all possible blocks.
1. SHA256 produces a 256 bit hash
2. That means it produces a value on 256 bits, in other words
> Since there is a finite number of bit patterns per block, have you tried to
> just calculate the SHA-256 or SHA-512 for every possible bit pattern to see
> if there is ever a collision? If you found an algorithm that produced no
> collisions for any possible block bit pattern, wouldn't that
On 07/11/2012 04:23 PM, casper@oracle.com wrote:
>
>> On Tue, 10 Jul 2012, Edward Ned Harvey wrote:
>>>
>>> CPU's are not getting much faster. But IO is definitely getting faster.
>>> It's best to keep ahead of that curve.
>>
>> It seems that per-socket CPU performance is doubling every
On Wed, 11 Jul 2012, Joerg Schilling wrote:
Bob Friesenhahn wrote:
On Tue, 10 Jul 2012, Edward Ned Harvey wrote:
CPU's are not getting much faster. But IO is definitely getting faster. It's
best to keep ahead of that curve.
It seems that per-socket CPU performance is doubling every yea
This is exactly the issue for me. It's vital to always have verify on. If you
don't have the data to prove that every possible block combination
hashes uniquely for the "small" bit space we are talking about, then how in the
world can you say that "verify" is not necessary? That jus
On 07/11/2012 04:22 PM, Bob Friesenhahn wrote:
> On Wed, 11 Jul 2012, Sašo Kiselkov wrote:
>> the hash isn't used for security purposes. We only need something that's
>> fast and has a good pseudo-random output distribution. That's why I
>> looked toward Edon-R. Even though it might have security p
On 07/11/2012 04:19 PM, Gregg Wonderly wrote:
> But this is precisely the kind of "observation" that some people seem to miss
> out on the importance of. As Tomas suggested in his post, if this was true,
> then we could have a huge compression ratio as well. And even if there was
> 10% of the
Unfortunately, the government imagines that people are using their home
computers to compute hashes and try and decrypt stuff. Look at what is
happening with GPUs these days. People are hooking up 4 GPUs in their
computers and getting huge performance gains. 5-6 char password space covered
i
> From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
> Sent: Wednesday, July 11, 2012 10:06 AM
>
> On Tue, 10 Jul 2012, Edward Ned Harvey wrote:
> >
> > CPU's are not getting much faster. But IO is definitely getting faster. It's
> > best to keep ahead of that curve.
>
> It seems that per
>On Tue, 10 Jul 2012, Edward Ned Harvey wrote:
>>
>> CPU's are not getting much faster. But IO is definitely getting faster.
>> It's best to keep ahead of that curve.
>
>It seems that per-socket CPU performance is doubling every year.
>That seems like faster to me.
I think that I/O isn't get
You don't need to reproduce all possible blocks.
1. SHA256 produces a 256 bit hash
2. That means it produces a value on 256 bits, in other words a value
between 0..2^256 - 1
3. If you start counting from 0 to 2^256 and for each number calculate the
SHA256 you will get at least one hash collision (i
On Wed, 11 Jul 2012, Sašo Kiselkov wrote:
the hash isn't used for security purposes. We only need something that's
fast and has a good pseudo-random output distribution. That's why I
looked toward Edon-R. Even though it might have security problems in
itself, it's by far the fastest algorithm in
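Edon-R itself is not in Python's hashlib, but a hedged sketch of the kind of
userland throughput comparison involved (sha256 vs. sha512 over maximal
128 KiB records) could look like this; absolute numbers will vary by CPU:

    # Time 1000 hashes of a 128 KiB record and report MiB/s per algorithm.
    import hashlib, time

    block = b'\0' * (128 * 1024)
    for name in ('sha256', 'sha512'):
        start = time.perf_counter()
        for _ in range(1000):
            hashlib.new(name, block).digest()
        elapsed = time.perf_counter() - start
        print(name, round(1000 * len(block) / 2**20 / elapsed), 'MiB/s')

On 64-bit CPUs sha512 typically comes out ahead, consistent with the ~60%
figure quoted elsewhere in this thread.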
But this is precisely the kind of "observation" that some people seem to miss
out on the importance of. As Tomas suggested in his post, if this was true,
then we could have a huge compression ratio as well. And even if there was 10%
of the bit patterns that created non-unique hashes, you could
Bob Friesenhahn wrote:
> On Tue, 10 Jul 2012, Edward Ned Harvey wrote:
> >
> > CPU's are not getting much faster. But IO is definitely getting faster.
> > It's best to keep ahead of that curve.
>
> It seems that per-socket CPU performance is doubling every year.
> That seems like faster to me
On 07/11/2012 03:58 PM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Sašo Kiselkov
>>
>> I really mean no disrespect, but this comment is so dumb I could swear
>> my IQ dropped by a few tenths of a point just by
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Gregg Wonderly
>
> Since there is a finite number of bit patterns per block, have you tried to just
> calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there
> is ever a
On Tue, 10 Jul 2012, Edward Ned Harvey wrote:
CPU's are not getting much faster. But IO is definitely getting faster. It's
best to keep ahead of that curve.
It seems that per-socket CPU performance is doubling every year.
That seems like faster to me.
If server CPU chipsets offer accelle
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Sašo Kiselkov
>
> As your dedup
> ratio grows, so does the performance hit from dedup=verify. At, say,
> dedupratio=10.0x, on average, every write results in 10 reads.
Why?
If you intend to
On 07/11/2012 03:57 PM, Gregg Wonderly wrote:
> Since there is a finite number of bit patterns per block, have you tried to
> just calculate the SHA-256 or SHA-512 for every possible bit pattern to see
> if there is ever a collision? If you found an algorithm that produced no
> collisions for a
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Sašo Kiselkov
>
> On 07/11/2012 11:53 AM, Tomas Forsman wrote:
> > On 11 July, 2012 - Sašo Kiselkov sent me these 1,4K bytes:
> >> Oh jeez, I can't remember how many times this flame war has b
Since there is a finite number of bit patterns per block, have you tried to
just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if
there is ever a collision? If you found an algorithm that produced no
collisions for any possible block bit pattern, wouldn't that be the wi
On 07/11/2012 03:39 PM, David Magda wrote:
> On Tue, July 10, 2012 19:56, Sašo Kiselkov wrote:
>> However, before I start out on a pointless endeavor, I wanted to probe
>> the field of ZFS users, especially those using dedup, on whether their
>> workloads would benefit from a faster hash algorithm
On Tue, July 10, 2012 19:56, Sašo Kiselkov wrote:
> However, before I start out on a pointless endeavor, I wanted to probe
> the field of ZFS users, especially those using dedup, on whether their
> workloads would benefit from a faster hash algorithm (and hence, lower
> CPU utilization). Developme
On Wed, July 11, 2012 04:50, Ferenc-Levente Juhos wrote:
> Actually, although as you pointed out the chances of a sha256 collision
> are minimal, it can still happen; that would mean
> that the dedup algorithm discards a block that it thinks is a duplicate.
> Probably it's anyway bette
On Wed, Jul 11, 2012 at 08:48:54AM -0400, Hung-Sheng Tsao Ph.D. wrote:
> hi
> if you have not checked this page, please do
> http://en.wikipedia.org/wiki/ZFS
> interesting info about the status of ZFS in various OS
> regards
Thanks for the pointer. It doesn't answer my question though --
where the mo
On 07/11/2012 01:51 PM, Eugen Leitl wrote:
>
> As a napp-it user who needs to upgrade from NexentaCore, I recently
> saw
> "preferred for OpenIndiana live but running under Illumian, NexentaCore and
> Solaris 11 (Express)"
> as a system recommendation for napp-it.
>
> I wonder about th
hi
if you have not checked this page, please do
http://en.wikipedia.org/wiki/ZFS
interesting info about the status of ZFS in various OS
regards
my 2c
1) if you have the money, buy a ZFS appliance
2) if you want to build napp-it yourself, get Solaris 11 support; it only
charges per SW/socket and not change
On 07/11/2012 01:42 PM, Justin Stringfellow wrote:
>> This assumes you have low volumes of deduplicated data. As your dedup
>> ratio grows, so does the performance hit from dedup=verify. At, say,
>> dedupratio=10.0x, on average, every write results in 10 reads.
>
> Well you can't make an omelette
As a napp-it user who needs to upgrade from NexentaCore, I recently saw
"preferred for OpenIndiana live but running under Illumian, NexentaCore and
Solaris 11 (Express)"
as a system recommendation for napp-it.
I wonder about the future of OpenIndiana and Illumian, which
fork is likely t
On 07/11/2012 01:36 PM, casper@oracle.com wrote:
>
>
>> This assumes you have low volumes of deduplicated data. As your dedup
>> ratio grows, so does the performance hit from dedup=verify. At, say,
>> dedupratio=10.0x, on average, every write results in 10 reads.
>
> I don't follow.
>
> If
> This assumes you have low volumes of deduplicated data. As your dedup
> ratio grows, so does the performance hit from dedup=verify. At, say,
> dedupratio=10.0x, on average, every write results in 10 reads.
Well you can't make an omelette without breaking eggs! Not a very nice one,
anyway.
Yes
>This assumes you have low volumes of deduplicated data. As your dedup
>ratio grows, so does the performance hit from dedup=verify. At, say,
>dedupratio=10.0x, on average, every write results in 10 reads.
I don't follow.
If dedupratio == 10, it means that each item is *referenced* 10 times
but
On 07/11/2012 01:09 PM, Justin Stringfellow wrote:
>> The point is that hash functions are many to one and I think the point
>> was about that verify wasn't really needed if the hash function is good
>> enough.
>
> This is a circular argument really, isn't it? Hash algorithms are never
> perfect,
On 07/11/2012 12:37 PM, Ferenc-Levente Juhos wrote:
> Precisely, I said the same thing a few posts before:
> dedup=verify solves that. And as I said, one could use dedup=<hash algorithm>,verify with
> an inferior hash algorithm (that is much faster) with the purpose of
> reducing the number of dedup can
On 07/11/2012 12:32 PM, Ferenc-Levente Juhos wrote:
> Sašo, I'm not flaming at all, I happen to disagree, but still I understand that
> chances are very very very slim, but as one poster already said, this is how
> the lottery works. I'm not saying one should make an exhaustive search with
> tr
> The point is that hash functions are many to one and I think the point
> was about that verify wasn't really needed if the hash function is good
> enough.
This is a circular argument really, isn't it? Hash algorithms are never
perfect, but we're trying to build a perfect one?
It seems to me
>On 07/11/2012 12:24 PM, Justin Stringfellow wrote:
>>> Suppose you find a weakness in a specific hash algorithm; you use this
>>> to create hash collisions and now imagined you store the hash collisions
>>> in a zfs dataset with dedup enabled using the same hash algorithm.
>>
>> Sorry, but
On 07/11/2012 12:24 PM, Justin Stringfellow wrote:
>> Suppose you find a weakness in a specific hash algorithm; you use this
>> to create hash collisions and now imagined you store the hash collisions
>> in a zfs dataset with dedup enabled using the same hash algorithm.
>
> Sorry, but isn't t
On 07/11/2012 12:00 PM, casper@oracle.com wrote:
>
>
>> You do realize that the age of the universe is only on the order of
>> around 10^18 seconds, do you? Even if you had a trillion CPUs each
>> chugging along at 3.0 GHz for all this time, the number of processor
>> cycles you will have exe
On 07/11/2012 11:53 AM, Tomas Forsman wrote:
> On 11 July, 2012 - Sašo Kiselkov sent me these 1,4K bytes:
>> Oh jeez, I can't remember how many times this flame war has been going
>> on on this list. Here's the gist: SHA-256 (or any good hash) produces a
>> near uniform random distribution of outp
Precisely, I said the same thing a few posts before:
dedup=verify solves that. And as I said, one could use dedup=<hash algorithm>,verify with
an inferior hash algorithm (that is much faster) with the purpose of
reducing the number of dedup candidates.
For that matter one could use a trivial CRC32, if the two bloc
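A toy Python sketch of that scheme, with a hypothetical in-memory table
standing in for the DDT: the cheap checksum only nominates candidates, and
the byte-for-byte comparison is the sole arbiter, so checksum collisions cost
time but never correctness:

    import zlib

    table = {}   # crc32 -> list of stored blocks (illustrative stand-in for the DDT)

    def dedup_write(block):
        key = zlib.crc32(block)
        for stored in table.get(key, []):
            if stored == block:          # verify: compare bytes, not hashes
                return 'dedup hit'
        table.setdefault(key, []).append(block)
        return 'new block written'

    print(dedup_write(b'A' * 4096))      # new block written
    print(dedup_write(b'A' * 4096))      # dedup hit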
Sašo, I'm not flaming at all, I happen to disagree, but still I understand that
chances are very very very slim, but as one poster already said, this is how
the lottery works. I'm not saying one should make an exhaustive search with
trillions of computers just to produce a sha256 collision.
If I wa
>Sorry, but isn't this what dedup=verify solves? I don't see the problem here.
>Maybe all that's needed is a comment in the manpage saying hash algorithms
>aren't perfect.
The point is that hash functions are many to one and I think the point
was about that verify wasn't really needed if the hash
>>You do realize that the age of the universe is only on the order of
>>around 10^18 seconds, do you? Even if you had a trillion CPUs each
>>chugging along at 3.0 GHz for all this time, the number of processor
>>cycles you will have executed cumulatively is only on the order 10^40,
>>still 37 order
>You do realize that the age of the universe is only on the order of
>around 10^18 seconds, do you? Even if you had a trillion CPUs each
>chugging along at 3.0 GHz for all this time, the number of processor
>cycles you will have executed cumulatively is only on the order 10^40,
>still 37 orders o
On 11 July, 2012 - Sašo Kiselkov sent me these 1,4K bytes:
> On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote:
> > Actually, although as you pointed out the chances of a sha256 collision
> > are minimal, it can still happen; that would mean
> > that the dedup algorithm discards a b
Sašo Kiselkov wrote:
> On 07/11/2012 10:47 AM, Joerg Schilling wrote:
> > Sašo Kiselkov wrote:
> >
> >> write in case verify finds the blocks are different). With hashes, you
> >> can leave verify off, since hashes are extremely unlikely (~10^-77) to
> >> produce collisions.
> >
> > This is h
On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote:
> Actually, although as you pointed out the chances of a sha256 collision
> are minimal, it can still happen; that would mean
> that the dedup algorithm discards a block that it thinks is a duplicate.
> Probably it's anyway better to
On 07/11/2012 11:02 AM, Darren J Moffat wrote:
> On 07/11/12 00:56, Sašo Kiselkov wrote:
>> * SHA-512: simplest to implement (since the code is already in the
>> kernel) and provides a modest performance boost of around 60%.
>
> FIPS 180-4 introduces SHA-512/t support and explicitly SHA-512/
On 07/11/2012 10:47 AM, Joerg Schilling wrote:
> Sašo Kiselkov wrote:
>
>> write in case verify finds the blocks are different). With hashes, you
>> can leave verify off, since hashes are extremely unlikely (~10^-77) to
>> produce collisions.
>
> This is how a lottery works. The chance is low b
On 07/11/12 00:56, Sašo Kiselkov wrote:
* SHA-512: simplest to implement (since the code is already in the
kernel) and provides a modest performance boost of around 60%.
FIPS 180-4 introduces SHA-512/t support and explicitly SHA-512/256.
http://csrc.nist.gov/publications/fips/fips180-4/f
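As a hedged illustration, Python's hashlib exposes SHA-512/256 when the
underlying OpenSSL provides it; note that simply truncating a SHA-512 digest
is not the same function, since FIPS 180-4 specifies a distinct initial hash
value for SHA-512/256:

    import hashlib

    if 'sha512_256' in hashlib.algorithms_available:
        print(hashlib.new('sha512_256', b'abc').hexdigest())
    else:
        # Truncated SHA-512 shown only for shape -- it is NOT SHA-512/256.
        print(hashlib.sha512(b'abc').hexdigest()[:64])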
I'm pushing the send button too often, but yes, considering what was said
before, byte-to-byte comparison should be mandatory when deduplicating, and
therefore a "lighter" hash or checksum algorithm
would suffice to reduce the number of dedup candidates. And overall
deduping would be "bulletproof" and
Actually, although as you pointed out the chances of a sha256 collision
are minimal, it can still happen; that would mean
that the dedup algorithm discards a block that it thinks is a duplicate.
Probably it's anyway better to do a byte-to-byte comparison
if the hashes match, to be sure
Sašo Kiselkov wrote:
> write in case verify finds the blocks are different). With hashes, you
> can leave verify off, since hashes are extremely unlikely (~10^-77) to
> produce collisions.
This is how a lottery works. The chance is low but some people still win.
On 07/11/2012 10:41 AM, Ferenc-Levente Juhos wrote:
> I was under the impression that the hash (or checksum) used for data
> integrity is the same as the one used for deduplication,
> but now I see that they are different.
They are the same "in use", i.e. once you switch dedup on, that implies
che
I was under the impression that the hash (or checksum) used for data
integrity is the same as the one used for deduplication,
but now I see that they are different.
On Wed, Jul 11, 2012 at 10:23 AM, Sašo Kiselkov wrote:
> On 07/11/2012 09:58 AM, Ferenc-Levente Juhos wrote:
> > Hello all,
> >
> >