Re: Lucene outperforms MySQL, BerkeleyDB, and PostgreSQL for genome map searches

2005-09-02 Thread Don Gilbert
Chris, The limitation on huge genome map ranges is for display software (putting all those features into an image a person can understand). 500Kb is about an average viewable size, though some uses will draw on 1-10 Mb of data. I used a real-world test case that will be directly applicable to h

Re: Lucene outperforms MySQL, BerkeleyDB, and PostgreSQL for genome map searches

2005-09-02 Thread Chris Lu
Another great write-up! One more question: "100 millions of possible locations." looks large, but 500Kb is relatively small compared to Lucene's capability. And I believe the performance difference between Lucene and MySQL, or others, is much more obvious at the 1GB~10GB level. I heard numbers like 10x fa

Lucene outperforms MySQL, BerkeleyDB, and PostgreSQL for genome map searches

2005-09-02 Thread Don Gilbert
Lucene outperforms MySQL, BerkeleyDB, and PostgreSQL for genome map database searches. GBrowse (Generic Genome Browser, http://www.gmod.org/) is a widely used program for displaying maps of genome data in biology/bioinformatics. One need it serves is helping biologists quickly and easily locate

Re: Secure Server index

2005-09-02 Thread Otis Gospodnetic
Hello, Valmir, I'm replying to the java-user@ list, as it's better for this type of question. Lucene has no built-in security features. Some people use Lucene's Filters to accomplish show/hide-type security. If you have a copy of the Lucene book, you'll see that covered in section 5.5.3 ( http://w

RE: Ideal Index Fragmentation

2005-09-02 Thread Chris Hostetter
: --OK, is there a preferred strategy for generating lists of distinct : attributes in the hit[]? I've seen Hoss' post about using QueryFilters, : but that assumes that you know what values you want to count; but I : won't know the domain of values to expect in every field... Can I get : creativ

Re: Did you mean?

2005-09-02 Thread Chris Lu
On 9/2/05, Paul Libbrecht <[EMAIL PROTECTED]> wrote: > Isn't this relatively easily done using current indexReader methods? > My 2p would be (I intended to do it): > - have each of your words get analyzed in each flavour (eg stemmed, > phonetic...) > - get the tokens in each flavour and compare to th

Phrase frequency

2005-09-02 Thread Fabio Cristiano dos Anjos
Hi, How can I get phrase frequency in an index? Thanks in advance!! -- Atenciosamente, Fábio Cristiano dos Anjos

Re: Did you mean?

2005-09-02 Thread Paul Libbrecht
Isn't this relatively easily done using current indexReader methods? My 2p would be (I intended to do it): - have each of your words get analyzed in each flavour (eg stemmed, phonetic...) - get the tokens in each flavour and compare to that - map back (that's the part I haven't done yet). This is

RE: Ideal Index Fragmentation

2005-09-02 Thread Friedland, Zachary (EDS - Strategy)
Follow-up questions below denoted with "--" Thanks, Zach -----Original Message----- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 31, 2005 12:25 PM To: java-user@lucene.apache.org Subject: Re: Ideal Index Fragmentation On Aug 30, 2005, at 9:53 PM, Friedland, Zachary (EDS

[ANN] Roosster personal search engine 0.5 released

2005-09-02 Thread Benjamin Reitzammer
Hi, I just released the first really usable version of roosster, a cross-platform open source personal search engine which uses Lucene to power its full-text search. Read the full announcement at http://roosster.org/dev/news/release05.html and check it out. Or take a look at the screenshots, to g