Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> Ed Tomlinson  <[EMAIL PROTECTED]> wrote:
> >The default in reiserfs is now the R5 hash, but you are right that lots of
> > efforts went into finding this hash.  This includes testing various
> > hashes on real directory structures to see which one worked best.  R5
> > won.
>
> That's interesting.  The R5 hash is easily also the only one of the
> reiser hashes that might be useable for the generic VFS hashing.  It's
> not so different in spirit from the current one, and if you've done the
> work to test it, it's bound to be a lot better.

It was not me personally.   I just remembered the thread (from june 2000) on 
the reiserfs list...  I have summerized the results for you below.

For the program see: http://www.jedi.claranet.fr/hash_torture.tar.gz

Ed 

PS.  I am still seeing hangs with (2.4.2pre2 then I switched to ac7 or so and 
have had hangs with all pre and ac(s) tried and that is most of them)  ac20 
plus the latest reiserfs fixes has stayed up 8 hours so far - it can take two 
or three days  to trigger the hang though.  When it hangs it really dead,  a 
UPS connected via a serial port cannot shut it down.   pings to the box fail. 
A+SysRQ is dead, and the software watchdog does not trigger a reboot.  
ideas?

> (The current VFS name hash is probably _really_ stupid - I think it's
> still my original one, and nobody probably ever even tried to run it
> through any testing.  For example, I bet that using a shift factor of 4
> is really bad, because it evenly divides a byte, which together with the
> xor means that you can really easily generate trivial bad cases).
>
> What did you use for a test-case? Real-life directory contents? Did you
> do any worst-case analysis too?
>
>                Linus


some test results from june 2000 with Hans's summary first.
---------------------------------------------------------------
(reiserfs) Re: r5 hash
From: Hans Reiser <[EMAIL PROTECTED]>
To: "Yury Yu. Rupasov" <[EMAIL PROTECTED]>
Cc: Jedi/Sector One <[EMAIL PROTECTED]>, Petru Paler <[EMAIL PROTECTED]>, 
"[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, Yury Shevchuk 
<[EMAIL PROTECTED]>


Ok, based on this benchmark let's put rupasov5 in, and warn users who choose 
the
currently used rupasov1 hash that rupasov5 has obsoleted it.  Do this in both
3.6 and 3.5, and fix the the delimiting key check in 3.5 REISERFS_CHECK bug at
the same time.  Cut the patch, start testing, and see if you can release by
Monday.  Make rupasov5 the default.  sizif, review the documentation he 
creates
for users.

Jedi, if you disagree with the benchmarks let me know.  You might try
concatenating two filenames together instead of adding a digit to them, or
running find on a really large FS, to improve these tests.  Thanks for helping
us with analyzing the different hash methods available Jedi.

Hans

---------------------------------------------------------------
(reiserfs) Re: r5 hash
From: "Yury Yu. Rupasov" <[EMAIL PROTECTED]>
To: Hans Reiser <[EMAIL PROTECTED]>
Cc: Jedi/Sector One <[EMAIL PROTECTED]>, Petru Paler <[EMAIL PROTECTED]>, 
"[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, Yury Shevchuk 
<[EMAIL PROTECTED]>


Hans Reiser wrote:
> 
> What is the speed of the real filenames, not just the number of collisions.
> 



Ok, here is the results for real names :
# find / -type d -exec ls {} \; | sort | uniq > allfiles.txt

# wc -l allfiles.txt
161101 allfiles.txt

Collisions for 161 101 names:

tea_hash  : 784 total,  2 dangerous
jedi_hash2: 957 total,  2 dangerous 
r5_hash   :1191 total,  2 dangerous 
r7_hash   :8439 total, 18 dangerous


The speed for 161 101 real names :

create 161101 files of 10 bytes with names from allfiles.txt

# time create d1 allfiles.txt
# time cp d1 d2 -r
# time rm d1 -r

              create      copy        remove 
             --------------------------------
tea_hash   : 1m27.223s   5m43.069s  2m33.449s
jedi_hash2 : 1m26.062s   5m40.872s  2m32.795s
r5_hash    : 1m16.729s   4m14.967s  1m53.037s
r7_hash    : 1m10.665s   3m34.950s  1m39.756s


As you can see the results are differ, but not too much. :)
The situation changes dramatically if we will test 1 million files.

The same test, but at the end of each name from allfiles.txt 
added numbers from 0 to 6 (1 127 707 files):
 
              create      copy        remove 
             --------------------------------
tea_hash   : 81m44.449s  
jedi_hash2 : 79m46.419s
r5_hash    : 15m56.037s
r7_hash    : 15m30.680s

Dual Celeron 500, 128 MB RAM, 8 GB scsi HDD
Reiserfs-3.5.21, Linux-2.2.15

Thanks,
Yura.
---------------------------------------------------------------
body { font-family: "helvetica" } p { font-size: 12pt } a { color: #0000ff; 
text-decoration: none; }(reiserfs) Torture results
From: Jedi/Sector One <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]


  Here are the results of the hash torture on a Celeron 300.
  Once again, you can substract 1 from the dangerous collisions numbers.
  Xuan, can you provide a test for the case Rupasov hash was designed
for ?
  Anyway, I don't really see why large directories should have similar
file names, rather that keywords.

  Best regards,
-- 
         Frank DENIS aka Jedi/Sector One aka DJ Chrysalis <[EMAIL PROTECTED]>
                 -> Software : http://www.jedi.claranet.fr <-
      If Bill Gates had a dime for every time a Windows box crashed...
                  ...oh, wait a minute -- he already does.


********************** /usr/dict/words test **********************

Trying with   45402 words


-------------[Benchmarking tea hash]-------------

Collisions : 45
Dangerous :       1      ffff980
Timing :

real     0m0.145s
user     0m0.120s
sys      0m0.010s

-------------[Benchmarking rupasov hash]-------------

Collisions : 553
Dangerous :       1      ffffe00
Timing :

real     0m0.297s
user     0m0.260s
sys      0m0.020s

-------------[Benchmarking r5 hash]-------------

Collisions : 185
Dangerous :       1      ffae000
Timing :

real     0m0.124s
user     0m0.080s
sys      0m0.030s

-------------[Benchmarking r7 hash]-------------

Collisions : 2528
Dangerous :       1      fffd400
Timing :

real     0m0.121s
user     0m0.100s
sys      0m0.000s

-------------[Benchmarking jedi hash]-------------

Collisions : 54
Dangerous :       1      fff9780
Timing :

real     0m0.122s
user     0m0.100s
sys      0m0.010s

-------------[Benchmarking jedi2 hash]-------------

Collisions : 93
Dangerous :       1      fff9780
Timing :

real     0m0.122s
user     0m0.090s
sys      0m0.020s

-------------[Benchmarking lookup2 hash]-------------

Collisions : 63
Dangerous :       1      ffff480
Timing :

real     0m0.123s
user     0m0.100s
sys      0m0.000s

********************** Squid names test **********************

Trying with  458752 squid cache entries

-------------[Benchmarking tea hash]-------------

Collisions : 6237
Dangerous :       1      fffff80
Timing :

real     0m1.138s
user     0m1.090s
sys      0m0.030s

-------------[Benchmarking rupasov hash]-------------

Collisions : 377520
Dangerous :       1      e32700
Timing :

real     0m2.588s
user     0m2.550s
sys      0m0.020s

-------------[Benchmarking r5 hash]-------------

Collisions : 309991
Dangerous :       1      55406b80
Timing :

real     0m0.940s
user     0m0.880s
sys      0m0.040s

-------------[Benchmarking r7 hash]-------------

Collisions : 449006
Dangerous :       2      22b16580
Timing :

real     0m0.928s
user     0m0.840s
sys      0m0.070s

-------------[Benchmarking jedi hash]-------------

Collisions : 2771
Dangerous :       1      fffef80
Timing :

real     0m0.928s
user     0m0.860s
sys      0m0.050s

-------------[Benchmarking jedi2 hash]-------------

Collisions : 0
Dangerous :       1      ffff80
Timing :

real     0m0.879s
user     0m0.810s
sys      0m0.050s

-------------[Benchmarking lookup2 hash]-------------

Collisions : 6203
Dangerous :       1      fffdc00
Timing :

real     0m0.930s
user     0m0.840s
sys      0m0.080s

********************** Real names test **********************

Trying with   89830 files

-------------[Benchmarking tea hash]-------------

Collisions : 237
Dangerous :       1      fff5580
Timing :

real     0m0.276s
user     0m0.250s
sys      0m0.000s

-------------[Benchmarking rupasov hash]-------------

Collisions : 6288
Dangerous :       1      ffee080
Timing :

real     0m0.582s
user     0m0.560s
sys      0m0.010s

-------------[Benchmarking r5 hash]-------------

Collisions : 3920
Dangerous :       1      fff4600
Timing :

real     0m0.230s
user     0m0.190s
sys      0m0.020s

-------------[Benchmarking r7 hash]-------------

Collisions : 11801
Dangerous :       1      fff580
Timing :

real     0m0.225s
user     0m0.180s
sys      0m0.030s

-------------[Benchmarking jedi hash]-------------

Collisions : 269
Dangerous :       1      fff9f80
Timing :

real     0m0.226s
user     0m0.200s
sys      0m0.010s

-------------[Benchmarking jedi2 hash]-------------

Collisions : 415
Dangerous :       1      fff9f80
Timing :

real     0m0.225s
user     0m0.200s
sys      0m0.010s

-------------[Benchmarking lookup2 hash]-------------

Collisions : 223
Dangerous :       1      ffff480
Timing :

real     0m0.230s
user     0m0.210s
sys      0m0.000s

----------------------------------------------------------------------------------------

body { font-family: "helvetica" } p { font-size: 12pt } a { color: #0000ff; 
text-decoration: none; }(reiserfs) hash torture results
From: Petru Paler <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]


Machine: AMD Athlon/650MHz, 128Mb RAM, Quantum Fireball lct15 IDE hdd
(UDMA/66 but that doesn't matter). Kernel 2.4.0-test1-ac10.

The results are interesting, but more interesting would be to see how fast
reiserfs actually is with each of these hashes.

Script output:

********************** /usr/dict/words test **********************

Trying with   45402 words


-------------[Benchmarking tea hash]-------------

Collisions : 45
Dangerous :       1      ffff980
Timing :
0.00user 0.01system 0:00.08elapsed 11%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking rupasov hash]-------------

Collisions : 553
Dangerous :       1      ffffe00
Timing :
0.00user 0.00system 0:00.18elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking r5 hash]-------------

Collisions : 185
Dangerous :       1      ffae000
Timing :
0.00user 0.00system 0:00.08elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking r7 hash]-------------

Collisions : 2528
Dangerous :       1      fffd400
Timing :
0.00user 0.01system 0:00.07elapsed 12%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking jedi hash]-------------

Collisions : 54
Dangerous :       1      fff9780
Timing :
0.00user 0.00system 0:00.08elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking jedi2 hash]-------------

Collisions : 93
Dangerous :       1      fff9780
Timing :
0.00user 0.00system 0:00.07elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking lookup2 hash]-------------

Collisions : 63
Dangerous :       1      ffff480
Timing :
0.00user 0.00system 0:00.07elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

********************** Squid names test **********************

Trying with  262144 squid cache entries

-------------[Benchmarking tea hash]-------------

Collisions : 2019
Dangerous :       1      ffff880
Timing :
0.00user 0.01system 0:00.47elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking rupasov hash]-------------

Collisions : 210912
Dangerous :       1      a88f00
Timing :
0.00user 0.02system 0:01.03elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking r5 hash]-------------

Collisions : 171912
Dangerous :       1      54ca7680
Timing :
0.00user 0.03system 0:00.41elapsed 7%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking r7 hash]-------------

Collisions : 256171
Dangerous :       6      22aa0600
Timing :
0.00user 0.03system 0:00.41elapsed 7%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking jedi hash]-------------

Collisions : 589
Dangerous :       1      fffda00
Timing :
0.00user 0.02system 0:00.42elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking jedi2 hash]-------------

Collisions : 0
Dangerous :       1      ffff80
Timing :
0.00user 0.00system 0:00.40elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking lookup2 hash]-------------

Collisions : 2041
Dangerous :       1      fffdc00
Timing :
0.00user 0.01system 0:00.40elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

********************** Real names test **********************

find: /proc/31112/fd/4: No such file or directory
Trying with   94836 files

-------------[Benchmarking tea hash]-------------

Collisions : 235
Dangerous :       1      fff5e80
Timing :
0.00user 0.00system 0:00.20elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking rupasov hash]-------------

Collisions : 2016
Dangerous :       1      fffab80
Timing :
0.01user 0.00system 0:00.46elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking r5 hash]-------------

Collisions : 495
Dangerous :       1      fff8780
Timing :
0.00user 0.00system 0:00.17elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking r7 hash]-------------

Collisions : 8162
Dangerous :       1      fff580
Timing :
0.00user 0.02system 0:00.17elapsed 11%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking jedi hash]-------------

Collisions : 331
Dangerous :       1      ffe400
Timing :
0.00user 0.00system 0:00.17elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking jedi2 hash]-------------

Collisions : 341
Dangerous :       1      ffe400
Timing :
0.00user 0.00system 0:00.17elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-------------[Benchmarking lookup2 hash]-------------

Collisions : 298
Dangerous :       1      fffb700
Timing :
0.00user 0.00system 0:00.17elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (83major+13minor)pagefaults 0swaps

-Petru

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to