Could you index the reverse of the string so the unique part appears first?
On May 14, 2011, at 11:20, InterRob <rob.mar...@gmail.com> wrote: > Dear list, > > I would be pleased if you could share some thoughts with me on the following: > say I wish to maintain a table with all distinct character sequences > (variable length) showing series with strong similarities. Example: > "abbbabacccdef" > "abbbabaccdcdf" > "abbbabaccdcgf" > ... > "qwtrhdffdd" > ... > "qwtrhdffdds" > ... > "qwtrhdffddsspp" > "qwtrhdffddsspf" > "qwtrhdffddssph" > "qwtrhdffddsspL" > etc. > > Think of them as ordered values (array-like), as a set having many values in > common, in the same "elements" (that is: positions; in my application, each > position has some particular meaning -- the sequence represents a set of > particular settings) > > CREATE TABLE textseqs(txtseq TEXT) > > What would be an efficient approach in enforcing a UNIQUE constraint? > > I was thinking of using hashbuckets in a b-tree: > CREATE UNIQUE INDEX ON textseqs USING BTREE( hashtext(txtseq), txtseq ) > > This index would "cache" hashes for each row. Upon inserting of a new row, > traversing the index involves the comparison of two single integers for each > node in the b-tree, until the actual hash value (if it exists) was reached. > Then, only within that bucket (the hashes won't be unique), the more > expensive string comparing is required; involving a sequencial comparison of > (potentially maaaany) characters. Yet, within that bucket, the character > series may be expected to show stronger differences than a plain sorted list > of all values in the table would have, indexed by a b-tree index. Wouldn't > traversing such a plain (non-composite, single column, on: "txtseq") b-tree > index involve a sequencial comparison of (potentially many) characters *at > EACH NODE* of the tree ? Or am I mistaken that each node is filled with > actual values from the txtseq column? > > Thank you for your input! > > Cheers, > Rob -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general