realtime indexing

2007-11-15 Thread John Wang
Hi: It was interesting hearing about the need for real time indexing at the BirdsOfAFeather round table. We also needed to solve this problem. We took this approach: A large disk index that indexes in batch, e.g. sleeps for some time while requests queue up, then wakes up and updates the index. While large disk
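
The batch step described in this message (queue requests, sleep, wake up, index the batch) could be sketched roughly as follows. This is a plain-Java illustration only, not code from the thread: `BatchIndexer`, `submit`, `flushBatch`, and the in-memory list standing in for the large disk index are all hypothetical names, and a real setup would write to a Lucene index instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for a batch indexer; not Lucene code. */
public class BatchIndexer {
    // Requests that arrive while the indexer is asleep wait here.
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    // Assumption: a simple list stands in for the large disk index.
    private final List<String> diskIndex = new ArrayList<>();

    /** Queue a document; it is not indexed until the next batch runs. */
    public void submit(String doc) {
        pending.add(doc);
    }

    /** Drain everything queued so far into the index; returns the batch size. */
    public synchronized int flushBatch() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        diskIndex.addAll(batch);
        return batch.size();
    }

    /** Sleep, wake up, index the queued batch; repeat for a fixed number of rounds. */
    public void run(long sleepMillis, int rounds) throws InterruptedException {
        for (int i = 0; i < rounds; i++) {
            Thread.sleep(sleepMillis);
            flushBatch();
        }
    }

    /** Number of documents that have made it into the index. */
    public synchronized int indexedCount() {
        return diskIndex.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BatchIndexer indexer = new BatchIndexer();
        indexer.submit("doc-1");
        indexer.submit("doc-2");
        System.out.println("indexed before batch: " + indexer.indexedCount());
        indexer.run(50, 1);
        System.out.println("indexed after batch: " + indexer.indexedCount());
    }
}
```

The gap between `submit` and the next `flushBatch` is exactly the latency that makes a batch-only index unsuitable for real-time search.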

RE: Stack Trace - RE: gdata server

2007-11-15 Thread Lyth, Christopher [USA]
Well, as far as I could tell the config xml files should work out of version control. All I have done so far is check out the project and try to build it. The ant files seem to need some attention. Anyway, I was hoping to find someone that was using it or had ever set it up. This project is very

Re: gdata server

2007-11-15 Thread Grant Ingersoll
Hmm, we're not sure. We were just asking if anyone uses it b/c we are thinking of archiving it. I guess, though, you are saying that you would like to use it. :-) -Grant On Nov 15, 2007, at 5:31 PM, Lyth, Christopher [USA] wrote: Is anyone on this list using the gdata server? I have been

Re: Stack Trace - RE: gdata server

2007-11-15 Thread Grant Ingersoll
Is there a configuration file you have to setup? Can you give info on what commands you ran? I have never used GData, but this error looks like it is trying to configure something and it is not getting the class it expects. -Grant On Nov 15, 2007, at 7:46 PM, Lyth, Christopher [USA] wrot

Stack Trace - RE: gdata server

2007-11-15 Thread Lyth, Christopher [USA]
Nov 15, 2007 7:40:39 PM org.apache.lucene.gdata.server.registry.GDataServerRegistry registerScopeVisitor INFO: Register scope visitor -- class org.apache.lucene.gdata.server.registry.ProvidedServiceConfig Nov 15, 2007 7:40:39 PM org.apache.commons.digester.Digester endElement SEVERE: End event thre

gdata server

2007-11-15 Thread Lyth, Christopher [USA]
Is anyone on this list using the gdata server? I have been trying to get it working and have been running into some problems.

Re: Indexing Problem

2007-11-15 Thread Sirish
Wow!! Thanks dude... that works... I have spent almost a day figuring out the issue... I appreciate it!! Erick Erickson wrote: > > Your problem is probably that, by default, Lucene stops after > 10,000 terms. See IndexWriter.setMaxFieldLength > > Best > Erick > > On Nov 15, 2007 1:42 PM, Siris

Re: 答复: how to effeciently implement the stastical scores like pagerank?

2007-11-15 Thread Michael Busch
John Wang wrote: > Would payload work? > -John > > Yes, if you used payloads instead of stored fields your performance should be much better. Try indexing one special term per document (e.g. score:pagerank), and index one position with a payload for each doc. Then when you retrieve hits open

Re: Indexing Problem

2007-11-15 Thread Erick Erickson
Your problem is probably that, by default, Lucene stops indexing a field after 10,000 terms. See IndexWriter.setMaxFieldLength. Best, Erick On Nov 15, 2007 1:42 PM, Sirish <[EMAIL PROTECTED]> wrote: > > The following is my code snippet for indexing the text: > > document.add(Field.Text(IFIELD_TEXT, billMeasureDoc.
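
To see why long documents appear to "lose" their tail under a per-field term cap, here is a minimal plain-Java sketch. It is not Lucene's implementation: `FieldLengthDemo` and `indexTerms` are hypothetical names, and whitespace splitting stands in for a real analyzer. In Lucene of this era the cap would be raised on the writer itself, with something like `writer.setMaxFieldLength(Integer.MAX_VALUE)`.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of a per-field term cap; not Lucene's implementation. */
public class FieldLengthDemo {
    /** Split on whitespace and keep at most maxFieldLength terms. */
    public static List<String> indexTerms(String text, int maxFieldLength) {
        List<String> kept = new ArrayList<>();
        for (String term : text.trim().split("\\s+")) {
            if (kept.size() >= maxFieldLength) {
                break; // every term past the cap is silently dropped
            }
            kept.add(term);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Build a 12,000-term document: w0 w1 ... w11999
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 12000; i++) {
            sb.append("w").append(i).append(' ');
        }
        String text = sb.toString();

        List<String> capped = indexTerms(text, 10000);
        List<String> uncapped = indexTerms(text, Integer.MAX_VALUE);

        // The last term is only findable when the cap is lifted.
        System.out.println("capped finds w11999:   " + capped.contains("w11999"));
        System.out.println("uncapped finds w11999: " + uncapped.contains("w11999"));
    }
}
```

Short documents never reach the cap, which matches the report that indexing works for short text and fails only for very long text.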

Indexing Problem

2007-11-15 Thread Sirish
The following is my code snippet for indexing the text: document.add(Field.Text(IFIELD_TEXT, billMeasureDoc.getText())); Whenever the text is short, it works perfectly. But in a few cases, if the text is too lengthy, i.e. around 1000 lines or more, it causes a problem. The prob

Re: get original term for synonym

2007-11-15 Thread Matthijs Bierman
Hi Mark, I have solved it in another way now. I've created my own implementation of StandardAnalyzer (which I've called AdvancedAnalyzer). This analyzer keeps the word "zone-indeling" together, so users can simply search for this term and it will be highlighted exactly as is. These compound wo

Re: 答复: how to effeciently implement the stastical scores like pagerank?

2007-11-15 Thread John Wang
Would payload work? -John On 11/15/07, Zhou Qi <[EMAIL PROTECTED]> wrote: > > Thank you, my score is fixed score from the properties of the page, but at > first we need to adjust the score for a promising result. > I have tried one way of manually re-ranking all the documents by the > search resul

答复: how to effeciently implement the stastical scores like pagerank?

2007-11-15 Thread Zhou Qi
Thank you. My score is a fixed score from the properties of the page, but at first we need to adjust the score for a promising result. I have tried one way of manually re-ranking all the documents by the search results. But it needs to iterate all the retrieved results and fetch the re-ranking sco

Re: lucene datatypes

2007-11-15 Thread Grant Ingersoll
Solr provides semantics on Lucene fields for handling other data types, and there are some tools (DateTools, NumberTools) for converting some types to Strings for searching. But yeah, Strings are pretty much the only thing Lucene cares about when it comes to searching. -Grant On Nov 15,
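
The idea behind converting other types to Strings for searching can be illustrated with a small sketch: encode a number so that plain string comparison agrees with numeric order. This is an assumption-laden illustration, not the actual format used by NumberTools or DateTools; `SortableStrings`, `encode`, and `decode` are hypothetical names, and negative values are deliberately left out.

```java
import java.util.Locale;

/** Illustrative zero-padding scheme; NOT the encoding Lucene's NumberTools uses. */
public class SortableStrings {
    /** Zero-pad a non-negative long to 19 digits so string order equals numeric order. */
    public static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        // Long.MAX_VALUE has 19 digits, so 19 is wide enough for any non-negative long.
        return String.format(Locale.ROOT, "%019d", value);
    }

    /** Recover the original number from its padded form. */
    public static long decode(String encoded) {
        return Long.parseLong(encoded);
    }

    public static void main(String[] args) {
        // Unpadded strings sort "10" before "2"; padded ones sort numerically.
        System.out.println("raw:    " + ("2".compareTo("10") < 0));
        System.out.println("padded: " + (encode(2).compareTo(encode(10)) < 0));
        System.out.println("round trip: " + decode(encode(12345)));
    }
}
```

Because the index only compares Strings, a fixed-width encoding like this is what lets range queries and sorting behave as expected on numeric data.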