RE: how to statistics categories amount

2008-07-02 Thread lutan
Can anyone explain Solr's facet function? Thanks! How can the same be achieved using Lucene? > From: [EMAIL PROTECTED] > To: java-user@lucene.apache.org > Subject: Re: how to statistics categories amount > Date: Sat, 28 Jun 2008 05:36:12 -0400 > > > On Jun 28, 2008, at 3:57 AM, lutan wrote: > > if I searc

Re: Lucene Error : java.io.FileNotFoundException

2008-07-02 Thread yugana
I haven't set the path in the configuration file. I have hardcoded the locations:
    // the directory that stores html files
    private final String dataDir = "d:\\dataDir";
    // the directory that is used to store the lucene index
    private final String indexDir = "d:\\indexDir";
saikrishna

Re: Lucene Error : java.io.FileNotFoundException

2008-07-02 Thread saikrishna venkata pendyala
Please check the path set for lucene-index in the configuration file. On Thu, Jul 3, 2008 at 10:11 AM, yugana <[EMAIL PROTECTED]> wrote: > > Hi, > > I am indexing content and searching using lucene. It is working fine when I > use the simple servlet and jsp mechanism. I am able to search on the > ind

Lucene Error : java.io.FileNotFoundException

2008-07-02 Thread yugana
Hi, I am indexing content and searching using lucene. It is working fine when I use the simple servlet and jsp mechanism. I am able to search on the indexed content. I tried to implement the same using JBoss Portal. When I try to run the search, I get the below error: Please help me to resolve th

Re: Match all documents with non empty field

2008-07-02 Thread Mark Miller
Daniel Noll wrote: Patrick wrote: Hi, Can't seem to wrap my head around how to go about it. I want to retrieve all documents where a certain field is not empty. What would be the best way to do it? The most trivial way would be to use a PrefixQuery with an empty string. It won't be efficie

Re: Match all documents with non empty field

2008-07-02 Thread Daniel Noll
Patrick wrote: Hi, Can't seem to wrap my head around how to go about it. I want to retrieve all documents where a certain field is not empty. What would be the best way to do it? The most trivial way would be to use a PrefixQuery with an empty string. It won't be efficient unless you wrap i

Re: Match all documents with non empty field

2008-07-02 Thread Erick Erickson
You can certainly use a filter and MatchAllDocs. You can also index a special value for the field in question (nothere) and combine MatchAllDocs with a NOT field:nothere or some such. Best Erick On Wed, Jul 2, 2008 at 5:25 PM, Patrick <[EMAIL PROTECTED]> wrote: > Hi, > > Can't seem to wrap my he

Match all documents with non empty field

2008-07-02 Thread Patrick
Hi, Can't seem to wrap my head around how to go about it. I want to retrieve all documents where a certain field is not empty. What would be the best way to do it? Should I search with a MatchAllDocsQuery and a Filter? Should I go through all terms in the field and create a TermQuery with it?

Re: Do Lucene Deletes delete the physical file? If yes, is there a way not to?

2008-07-02 Thread Karl Wettin
On 2 Jul 2008 at 19:59, David Lee wrote: Is it possible to delete a document from the index, but not the physical file? And also I'm wondering what the functions are that will alter the physical files being indexed. Documents are not deleted until you optimize the index. Perhaps they are d

Do Lucene Deletes delete the physical file? If yes, is there a way not to?

2008-07-02 Thread David Lee
Is it possible to delete a document from the index, but not the physical file? And also I'm wondering what the functions are that will alter the physical files being indexed. On a side note: what is the best way to look up information like this so I don't have to bug the java-user mailing list for

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Michael McCandless
OK I opened this one: https://issues.apache.org/jira/browse/LUCENE-1325 Mike Shalin Shekhar Mangar wrote: That's great. Thanks! On Wed, Jul 2, 2008 at 6:04 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: OK I think that makes sense. I'll take it. I'll add an isOptimized() to In

Re: Incorrect Token Offset when using multiple fieldable instance

2008-07-02 Thread Michael McCandless
Toph wrote: Michael McCandless-2 wrote: We could alternatively extend TokenStream so you could query it for the final offset, then fix indexing to use that value instead of the endOffset of the last token that it saw. Querying the TokenStream for the final offset would be good, but then w

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Shalin Shekhar Mangar
That's great. Thanks! On Wed, Jul 2, 2008 at 6:04 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > OK I think that makes sense. I'll take it. I'll add an isOptimized() to > IndexCommit. > > Mike > > Shalin Shekhar Mangar wrote: > >> Ok, so there is no reliable way which can work across rele

Re: Incorrect Token Offset when using multiple fieldable instance

2008-07-02 Thread Toph
Michael McCandless-2 wrote: > > > This would actually be a fairly large change: it's a change to the > index format and all APIs that handle offsets during indexing & > searching/retrieving. > > For now I just changed the offset calculation in DocumentWriter as specified here by the OP:

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Michael McCandless
OK I think that makes sense. I'll take it. I'll add an isOptimized() to IndexCommit. Mike Shalin Shekhar Mangar wrote: Ok, so there is no reliable way which can work across releases. Actually, we are implementing a replication feature for Solr (SOLR-561) and we'd like the user to configure

Re: Incorrect Token Offset when using multiple fieldable instance

2008-07-02 Thread Michael McCandless
This would actually be a fairly large change: it's a change to the index format and all APIs that handle offsets during indexing & searching/retrieving. We could alternatively extend TokenStream so you could query it for the final offset, then fix indexing to use that value instead of the

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Shalin Shekhar Mangar
Ok, so there is no reliable way which can work across releases. Actually, we are implementing a replication feature for Solr (SOLR-561) and we'd like the user to configure a replication/snapshoot on commit or only on optimize. We want to rely on IndexDeletionPolicy to avoid copying index as snapshot

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Michael McCandless
Well ... that heuristic is not quite general enough, because the completion of a merge would also decrease the # files and +1 the generation number (if a commit had occurred). You could check for *.cfs and if there is only one, declare the index optimized? This still isn't always correct
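The *.cfs check suggested in this message could be sketched roughly as below. This is only an illustration: `OptimizedHeuristic`, `countCompoundFiles` and `looksOptimized` are hypothetical names, not Lucene API, and the input is assumed to be a plain collection of file names such as the one returned by IndexCommit.getFileNames(). As the message itself warns, the result is a guess and is not always correct.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper (not part of Lucene): guesses from a commit's file
// names whether the index consists of a single compound-file segment.
class OptimizedHeuristic {

    // Counts the compound segment files (*.cfs) among the given names.
    static int countCompoundFiles(Collection<String> fileNames) {
        int count = 0;
        for (String name : fileNames) {
            if (name.endsWith(".cfs")) {
                count++;
            }
        }
        return count;
    }

    // Heuristic from the thread: exactly one *.cfs file is taken to mean
    // "optimized". It is only a heuristic and can give the wrong answer.
    static boolean looksOptimized(Collection<String> fileNames) {
        return countCompoundFiles(fileNames) == 1;
    }

    public static void main(String[] args) {
        // File lists as quoted elsewhere in this thread.
        System.out.println(looksOptimized(
                Arrays.asList("segments_4", "_0.cfs", "_1.cfs", "_2.cfs"))); // prints false
        System.out.println(looksOptimized(
                Arrays.asList("segments_5", "_4.cfs")));                      // prints true
    }
}
```

The isOptimized() method on IndexCommit proposed later in the thread would make a file-name guess like this unnecessary.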

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Shalin Shekhar Mangar
Hi Michael, Thanks for the response. Looking at the general way the filenames are organized:
IndexCommit.getFileNames() without optimize (after IW.close()): [segments_4, _0.cfs, _1.cfs, _2.cfs]
IndexCommit.getFileNames() after optimize+close: [segments_5, _4.cfs]
We can compare the latest commit
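Going by the reply to this message, the comparison appears to look at the number of files and the generation number of two commits. A minimal sketch of such a comparison follows, assuming the two file lists above as input; `CommitComparison`, `generationOf` and `fewerFilesAndNextGeneration` are hypothetical names, not Lucene API. The generation is the suffix of the segments_N file name, which Lucene writes in base 36. The reply points out that a completed merge followed by a commit produces the same pattern, so this check cannot tell an optimize apart from an ordinary merge.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Lucene): compares two commits by their
// file names only, as a rough illustration of the heuristic discussed here.
class CommitComparison {

    // Extracts the generation from the segments_N file name.
    // Lucene encodes N in base 36 (Character.MAX_RADIX).
    static long generationOf(List<String> fileNames) {
        for (String name : fileNames) {
            if (name.startsWith("segments_")) {
                String suffix = name.substring("segments_".length());
                return Long.parseLong(suffix, Character.MAX_RADIX);
            }
        }
        throw new IllegalArgumentException("no segments_N file in " + fileNames);
    }

    // True when the latest commit has fewer files than the previous one and
    // its generation is exactly one higher. A finished merge plus a commit
    // also satisfies this, so it does not prove an optimize happened.
    static boolean fewerFilesAndNextGeneration(List<String> previous, List<String> latest) {
        return latest.size() < previous.size()
                && generationOf(latest) == generationOf(previous) + 1;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("segments_4", "_0.cfs", "_1.cfs", "_2.cfs");
        List<String> after = Arrays.asList("segments_5", "_4.cfs");
        System.out.println(generationOf(before));                       // prints 4
        System.out.println(generationOf(after));                        // prints 5
        System.out.println(fewerFilesAndNextGeneration(before, after)); // prints true
    }
}
```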