On Fri, Feb 8, 2019 at 7:42 PM Michael Jones <michael.jo...@gmail.com> wrote:
> clustering:
> https://www.cs.cornell.edu/courses/cs3110/2014fa/lectures/13/lec13.html
>
> careful hash functions often treat short inputs specially.
>
> iterated shift-xor alone is weak in expanding the "changed bits(s)" signal, 
> at least by comparison to a) large prime multiply, b) good s-boxes, c) 
> introduction of keyed material.
Hm, thanks. I would like to try a particular version of this prime
multiplication idea wih utf8 strings.
I wrote for loops to find collisions (attached). It takes 3 seconds
and finds no collision for strings with length < 4.
The next step (including length=4, commented in the code) will take
13+ min and 32+ gb ram, so I appreciate if anyone with sufficient RAM
could try that and report the result ;P
Thanks..

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
package main

import "fmt"

func Txt2int(s string) uint64 {
	x := uint64(len(s))
	for i := len(s) - 1; i >= 0; i-- {
		x *= 11400714819323198549
		x += uint64(s[i])
	}
	return x
}

const N = 16646656 // 4244897281

var mp = make(map[uint64]bool, N)
var bf = []byte("abcd")

func main() {
	mp[0] = true

	for i := 255; i > 0; i-- {
		bf[0] = byte(i)
		mp[Txt2int(string(bf[:1]))] = true

		for k := 255; k > 0; k-- {
			bf[1] = byte(k)
			mp[Txt2int(string(bf[:2]))] = true

			for r := 255; r > 0; r-- {
				bf[2] = byte(r)
				mp[Txt2int(string(bf[:3]))] = true

				/*	for s := 255; s > 0; s-- {
					bf[3] = byte(s)
					mp[Txt2int(string(bf[:4]))] = true
				} */
			}
		}
	}
	fmt.Println(N - len(mp))
}

Reply via email to