Could you index the reverse of the string so the unique part appears first?

On May 14, 2011, at 11:20, InterRob <rob.mar...@gmail.com> wrote:

> Dear list,
> 
> I would be pleased if you could share some thoughts with me on the following: 
> say I wish to maintain a table with all distinct character sequences 
> (variable length) showing series with strong similarities. Example:
> "abbbabacccdef"
> "abbbabaccdcdf"
> "abbbabaccdcgf"
> ...
> "qwtrhdffdd"
> ...
> "qwtrhdffdds"
> ...
> "qwtrhdffddsspp"
> "qwtrhdffddsspf"
> "qwtrhdffddssph"
> "qwtrhdffddsspL"
> etc.
> 
> Think of them as ordered values (array-like), as a set having many values in 
> common, in the same "elements" (that is: positions; in my application, each 
> position has some particular meaning -- the sequence represents a set of 
> particular settings)
> 
> CREATE TABLE textseqs(txtseq TEXT)
> 
> What would be an efficient approach in enforcing a UNIQUE constraint?
> 
> I was thinking of using hashbuckets in a b-tree:
> CREATE UNIQUE INDEX ON textseqs USING BTREE( hashtext(txtseq), txtseq )
> 
> This index would "cache" hashes for each row. Upon inserting of a new row, 
> traversing the index involves the comparison of two single integers for each 
> node in the b-tree, until the actual hash value (if it exists) was reached. 
> Then, only within that bucket (the hashes won't be unique), the more 
> expensive string comparing is required; involving a sequencial comparison of 
> (potentially maaaany) characters. Yet, within that bucket, the character 
> series may be expected to show stronger differences than a plain sorted list 
> of all values in the table would have, indexed by a b-tree index. Wouldn't 
> traversing such a plain (non-composite, single column, on: "txtseq") b-tree 
> index involve a sequencial comparison of (potentially many) characters *at 
> EACH NODE* of the tree ? Or am I mistaken that each node is filled with 
> actual values from the txtseq column?
> 
> Thank you for your input!
> 
> Cheers,
> Rob

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to