I have a working version of a text search engine. I want to make it work
for Postgres (I will be releasing it GPL). It can literally find the
occurrence of a string of words within 5 million records in a few
milliseconds. It is very fast, it works similarly to many web search
engines.

I have tried many approaches to integrate the search system with
Postgres, but I can't find any method that isn't too slow or too
cumbersome.

The best I have been able to come up with is this:

create function textsearch(varchar) returns integer as
        '
        DECLARE
                handle  integer;
                count   integer;
                pos     integer;
        BEGIN
                handle = search_exec( \'localhost\', $1);
 
                count = search_count(handle);
 
                for pos in 0 .. count-1 loop
                        insert into search_result(key, rank)
                        values (search_key(handle,pos),
search_rank(handle,pos));
                end loop;
 
                return search_done(handle);
 
        END;
' language 'plpgsql';  

And this is used as:

create temp table search_result (key integer, rank integer);

select textsearch('bla bla');

select field from table where field_key = search_result.key order by
search_result.rank ;

drop table search_result ;  


The problems with this are, I can't seem to be able to create a table in
plpgsql. (I read about a patch, but have to find out what version it is
in), so I have to create a table outside the function.

I can only execute one text search, because I can't seem to use the name
of a table that has been passed in to the plpgsql environment, that
would allow multiple searches to be joined. As:

select textsearch(temp_tbl1, 'bla bla');
select textsearch(temp_tbl2, 'foo bar');

select field from table1, table2 where table1.field_key = temp_tbl1.key
and  table2.field_key = temp_tbl2.key;


This could be so sweet, but, right now, it is just a disaster and I am
pulling my hair out. Does anyone have any suggestions or tricks that
could make this easier/faster, or is Postgres just unable to do this
sort of thing.


-- 
http://www.mohawksoft.com

Reply via email to