=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= <wulc...@wulczer.org> writes: > [ e of ] s/2 or s/3 look reasonable.
The examples in the LC paper seem to all use e = s/10. Note the stated assumption e << s. > So, should I just write a patch that sets the bucket width and pruning > count using 0.07 as the assumed frequency of the most common word and > epsilon equal to s/2 or s/3? I'd go with s = 0.07 / desired-MCE-count and e = s / 10, at least for a first cut to experiment with. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers