I Need System Design Suggestion. Please.

2005-11-04 Thread Victor Lee
Hi, I am going to use a MySQL database to store some data, use Lucene (Java) to index the data, and use Hibernate to map it. I was originally thinking of using PHP to insert the data that visitors enter into the MySQL database. But if I use PHP and issue MySQL statements directly, it may defeat the part of the pur

Re: Searching the contents

2005-11-04 Thread Chris Hostetter
: Is it possible to retrieve the data sorted in the order in which they were inserted into the
: index, but without putting any extra column in the document?
The first step anyone should take in understanding Lucene is to start by spending a lot of time reading over the javadocs. If you look at the Sort class,

Re: Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Ahmed El-dawy
Thanks for your reply. Yes, the problem is not with the QueryParser, I build the query with code. Of course the analyzer is involved in building the query, but I didn't mention it in the code for simplicity. I saw the MultiPhraseQuery. Its name in my version is PhrasePrefixQuery. Why don't we use t

Re: Searching the contents

2005-11-04 Thread Manoj Kr. Sheoran
Hi Hoss and All, I am happy with your suggestion. Is it possible to retrieve the data sorted in the order in which they were inserted into the index, but without putting any extra column in the document? Regards, Manoj - Original Message - From: "Chris Hostetter" <[EMAIL PROTECTED]> To: Sent: Friday, Nov

Re: SpanQuery parser? Update (ugly hack inside...)

2005-11-04 Thread Erik Hatcher
On 4 Nov 2005, at 18:32, Sean O'Connor wrote: I'm posting this primarily hoping to give back a tiny bit to a very helpful community. More likely however, someone else will open my eyes to an easier approach than what I outline below... I've come up with a very ugly conversion approach from

Re: SpanQuery parser? Update (ugly hack inside...)

2005-11-04 Thread Sean O'Connor
I'm posting this primarily hoping to give back a tiny bit to a very helpful community. More likely however, someone else will open my eyes to an easier approach than what I outline below... I've come up with a very ugly conversion approach from regular Query objects into SpanQuery objects. I t

Re: Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Chris Hostetter
I must admit, I have not tried running your test, but based on reading it, I think you are misunderstanding what's happening here. (Or perhaps I am.) You initially stated that you were having a problem because your Analyzer outputs multiple tokens at the same position, and your phrase queries we

Re: Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Ahmed El-dawy
This is source code that shows the problem I am talking about. In this example, a new analyzer is made that outputs all words at the same position (all but the first one have positionIncrement=0). To reproduce the problem, uncomment the only commented line. //---

Re: Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Erik Hatcher
On 4 Nov 2005, at 13:45, Daniel Naber wrote: On Friday 04 November 2005 11:33, Erik Hatcher wrote: This should have been fixed one year ago with Daniel and myself. Really? It works in this OR kind of fashion with tokens in 0-incremented positions? Yes, this test case shows it (multi

Re: Scoring formula

2005-11-04 Thread Otis Gospodnetic
The formula should also be in the javadoc for the Similarity class, if it was there in 1.2.
Otis
--- Karl Koch <[EMAIL PROTECTED]> wrote:
> Hello group,
>
> the scoring formula for Lucene is well explained in "Lucene in
> Action".
> However, is this formula also valid for Lucene 1.2 (which I am
>

Scoring formula

2005-11-04 Thread Karl Koch
Hello group, the scoring formula for Lucene is well explained in "Lucene in Action". However, is this formula also valid for Lucene 1.2 (which I am using)? I need to know that for documentation purposes. If not, where can I find the correct formula, since I do not want to interpret it from the code

Re: Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Daniel Naber
On Friday 04 November 2005 11:33, Erik Hatcher wrote:
> > This should have been fixed one year ago with Daniel and myself.
>
> Really? It works in this OR kind of fashion with tokens in
> 0-incremented positions?
Yes, this test case shows it (multi will be turned into multi and multi2, both a

Re: Searching the contents

2005-11-04 Thread Chris Hostetter
: > Lucene searching system. How do you manage the iterator and what
: > is the
: > method callback at query execution time (a broader view).
:
: There really isn't any method callback, at least not in the way I'm
: thinking of it. When you search you get back Hits. Hits is an
If you want a lo

[OT ANN] Roomity v 1.5 w/ video, social networking and html editor + JDNC article

2005-11-04 Thread netsql
Roomity.com v 1.5 is a web 2.01/RiA poster child community webapp. This new version adds broadcast video, social networking features such as favorite authors, and an HTML editor. It likely already has groups and content you are already using but aggregated and safer, including technology, Java, etc., but it o

Re: Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Erik Hatcher
On 4 Nov 2005, at 03:53, Pierrick Brihaye wrote: Hi Ahmed and all, Ahmed El-dawy wrote: Hello, My analyzer sometimes gives multiple terms for the same word, so they are generated at the same position. When I use PhraseQuery to search for this term, it matches documents with all these

Re: Searching + Sorting in 3 milion documents

2005-11-04 Thread Erik Hatcher
On 4 Nov 2005, at 02:50, Manoj Kr. Sheoran wrote: - state of the index (optimized vs. unoptimized) Which one will be best for this sort of scenario? Optimized? Optimized is always best for better search efficiency. But practically speaking, how optimized an index is depends on how you ne

Re: Searching the contents

2005-11-04 Thread Erik Hatcher
On 4 Nov 2005, at 05:23, Manoj Kr. Sheoran wrote: Hi Erik and all, Thanks for the concern. We will test it and let you know as well. We would really appreciate it if any of you could tell us more about the architecture of the Lucene searching system. How do you manage the iterator and what is the

Re: Searching the contents

2005-11-04 Thread Manoj Kr. Sheoran
Hi Erik and all, Thanks for the concern. We will test it and let you know as well. We would really appreciate it if any of you could tell us more about the architecture of the Lucene searching system. How do you manage the iterator and what is the method callback at query execution time (a broader view)? R

Re: Searching the contents

2005-11-04 Thread Erik Hatcher
I certainly recommend testing this to see what kind of response times you get for the first and successive searches after the caches are built - be sure to use the same IndexReader for all searches to benefit from caching :) Sorting on 4-5 columns seems kind of extreme. Sorting uses up RAM

Re: Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Pierrick Brihaye
Hi Ahmed and all, Ahmed El-dawy wrote: Hello, My analyzer sometimes gives multiple terms for the same word, so they are generated at the same position. When I use PhraseQuery to search for this term, it matches documents with all these terms at the same position (as if it were an AND). I wa

Multiple terms with the same position in PhraseQuery

2005-11-04 Thread Ahmed El-dawy
Hello, My analyzer sometimes gives multiple terms for the same word, so they are generated at the same position. When I use PhraseQuery to search for this term, it matches documents with all these terms at the same position (as if it were an AND). I want it to match documents with at least ONE