Re: Possible to find min and max values for a Date field?

2005-05-30 Thread Chris Hostetter
: I guess I could use TermEnum to do a binary search until I get a hit but : this seems a bit kludgy. min is easy, it's the first term in the enum. max ... well, the simplest way I can think of to find the max of a string field is to use a StringIndex, something like... String[] s = FieldCach
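A minimal sketch of the TermEnum approach, assuming a Lucene 1.4-era API and a date field whose values were indexed as lexicographically sortable strings (e.g. via DateField). Terms for a field come back in sorted order, so the first term is the minimum and the last term still belonging to the field is the maximum; the class and method names below are invented for illustration.

```java
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

public class DateRange {
    /** Returns {min, max} term text for the given field, or nulls if the field has no terms. */
    public static String[] minMax(IndexReader reader, String field) throws IOException {
        // terms(Term) positions the enumeration at the first term >= the given term
        TermEnum terms = reader.terms(new Term(field, ""));
        String min = null;
        String max = null;
        try {
            do {
                Term t = terms.term();
                if (t == null || !t.field().equals(field)) {
                    break;              // ran past the end of this field's terms
                }
                if (min == null) {
                    min = t.text();     // first term for the field = minimum
                }
                max = t.text();         // keep overwriting; last one seen = maximum
            } while (terms.next());
        } finally {
            terms.close();
        }
        return new String[] { min, max };
    }
}
```

The FieldCache/StringIndex variant mentioned above (cut off in the preview) gets at the same information through an in-memory sorted array of the field's terms rather than an enumeration.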

Possible to find min and max values for a Date field?

2005-05-30 Thread Kevin Burton
Is it possible to find the minimum and maximum values for a date field with a given reader? I guess I could use TermEnum to do a binary search until I get a hit but this seems a bit kludgy. Thoughts? I don't see any APIs for doing this and a google/grep of the source doesn't help Kevin -

managing docids for ParallelReader (was Augmenting an existing index)

2005-05-30 Thread Matt Quail
I have a similar problem, for which ParallelReader looks like a good solution -- except for the problem of creating a set of indices with matching document numbers. I have wondered about this as well. Are there any *sure fire* ways of creating (and updating) two indices so that doc numbers in
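No sure-fire guarantee is given in the thread, but the recipe usually suggested looks roughly like the sketch below. Assumptions: both indices are built from scratch in one pass, documents are added in exactly the same order, nothing is ever deleted from either index, and both are optimized the same way, so each writer hands out the same sequential document numbers. All directory, field, and variable names are invented.

```java
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class BuildParallelIndices {
    public static void main(String[] args) throws IOException {
        // Hypothetical input: element i feeds document number i in *both* indices.
        String[] bodies     = { "first document body", "second document body" };
        String[] categories = { "catA", "catB" };

        IndexWriter main = new IndexWriter("main-index", new StandardAnalyzer(), true);
        IndexWriter aux  = new IndexWriter("aux-index",  new StandardAnalyzer(), true);

        for (int i = 0; i < bodies.length; i++) {
            Document m = new Document();
            m.add(Field.Text("body", bodies[i]));             // large, rarely changing field
            main.addDocument(m);                              // becomes doc i in the main index

            Document a = new Document();
            a.add(Field.Keyword("category", categories[i]));  // small, frequently rebuilt field
            aux.addDocument(a);                               // becomes doc i in the aux index
        }
        // Keep both indices free of deletions and optimize them the same way,
        // so that segment merging cannot end up ordering documents differently.
        main.optimize(); main.close();
        aux.optimize();  aux.close();
    }
}
```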

Re: How to build 1.9-rc1 sandbox?

2005-05-30 Thread Daniel Naber
On Monday 30 May 2005 22:57, Andrew Boyd wrote: > In trunk/java I've built ant compile-core Just calling "ant" (in the top-level directory) without any options should build this file: build/lucene-core-1.9-rc1-dev.jar -- http://www.danielnaber.de -

How to build 1.9-rc1 sandbox?

2005-05-30 Thread Andrew Boyd
In trunk/java I've built ant compile-core. Changing to trunk/java/contrib, the only target is build-tree, but I get a lot of compiler errors such as package org.apache.lucene.analysis does not exist. Anyone know what I'm doing wrong? Thanks, Andrew ---

Augmenting an existing index (was: ACLs and Lucene)

2005-05-30 Thread Sebastian Marius Kirsch
Hello, I have a similar problem, for which ParallelReader looks like a good solution -- except for the problem of creating a set of indices with matching document numbers. I want to augment the documents in an existing index with information that can be extracted from the same index. (Basically,

RE: Indexing multiple keywords in one field?

2005-05-30 Thread Doug Hughes
Ok, so more than one keyword can be stored in a keyword field. Interesting! Thanks, Doug -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Monday, May 30, 2005 3:39 PM To: java-user@lucene.apache.org Subject: Re: Indexing multiple keywords in one field? On May 3
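For reference, the point being confirmed looks like this in code (a sketch only; the "link" field name and the URLs are invented): several Field.Keyword values can be added under the same field name, and a TermQuery on any one of them will match the document.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class MultiKeywordExample {
    /** Builds a document with three keyword values stored under the same field name. */
    public static Document makeDoc() {
        Document doc = new Document();
        doc.add(Field.Keyword("link", "http://example.com/a.html"));
        doc.add(Field.Keyword("link", "http://example.com/b.html"));
        doc.add(Field.Keyword("link", "http://example.com/c.html"));
        // A TermQuery on new Term("link", "http://example.com/b.html") matches this document.
        return doc;
    }
}
```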

Re: Indexing multiple keywords in one field?

2005-05-30 Thread Erik Hatcher
On May 30, 2005, at 2:06 PM, Doug Hughes wrote: Hoss, I see what you're saying, but that seems primarily beneficial when you know the structure and size of your data ahead of time. For instance, any of the HTML documents I'm indexing can have any number of links, from 0 to 100, realist

Indexing problem

2005-05-30 Thread Falko Guderian
Hi, I indexed 20 documents. I want to evaluate my Lucene index. That's why I extract all terms with their frequencies in each document. This code has helped a lot. - try { TermEnum terms = indexReader.terms(new Term("content", ""));
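The quoted snippet is cut off in the archive; a sketch of how such a walk can be completed (assuming the field is named "content" and a Lucene 1.4-era API): enumerate the field's terms, and for each term enumerate the documents containing it together with the within-document frequency.

```java
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;

public class DumpTermFrequencies {
    public static void dump(IndexReader indexReader) throws IOException {
        TermEnum terms = indexReader.terms(new Term("content", ""));
        TermDocs termDocs = indexReader.termDocs();   // re-used for every term
        try {
            do {
                Term term = terms.term();
                if (term == null || !term.field().equals("content")) {
                    break;                            // left the "content" field
                }
                termDocs.seek(term);
                while (termDocs.next()) {
                    System.out.println(term.text() + "\tdoc=" + termDocs.doc()
                            + "\tfreq=" + termDocs.freq());
                }
            } while (terms.next());
        } finally {
            termDocs.close();
            terms.close();
        }
    }
}
```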

RE: Indexing multiple keywords in one field?

2005-05-30 Thread Doug Hughes
Hoss, I see what you're saying, but that seems primarily beneficial when you know the structure and size of your data ahead of time. For instance, any of the HTML documents I'm indexing can have any number of links, from 0 to 100, realistically. If I place each of those links in separate keywo

Preserving original HTML file offsets for highlighting, need HTMLTokenizer?

2005-05-30 Thread Fred Toth
Hi all, Those of you who have read and responded to my recent posts know that we are working on highlighting the entire document after a search. (Not fragments in a results list.) It appears that one of the key tools to assist with this is the ability of Lucene to store file offsets of terms as
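For context, a sketch of the offset-storing machinery being referred to, using the Field/term-vector API on the 1.9 trunk (the field name is invented). Note the assumption: the recorded offsets are the character offsets the analyzer saw, so they only correspond to positions in the original HTML file if the tokenizer was fed the raw HTML, which is where a dedicated HTML tokenizer would come in.

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermPositionVector;
import org.apache.lucene.index.TermVectorOffsetInfo;

public class OffsetExample {

    /** Indexing side: keep positions and character offsets in the term vector. */
    public static Document makeDoc(String htmlText) {
        Document doc = new Document();
        doc.add(new Field("body", htmlText,
                Field.Store.NO, Field.Index.TOKENIZED,
                Field.TermVector.WITH_POSITIONS_OFFSETS));
        return doc;
    }

    /** Retrieval side: print each term with its start/end character offsets. */
    public static void printOffsets(IndexReader reader, int docId) throws IOException {
        TermPositionVector tpv =
                (TermPositionVector) reader.getTermFreqVector(docId, "body");
        String[] terms = tpv.getTerms();
        for (int i = 0; i < terms.length; i++) {
            TermVectorOffsetInfo[] offsets = tpv.getOffsets(i);
            for (int j = 0; j < offsets.length; j++) {
                System.out.println(terms[i] + " [" + offsets[j].getStartOffset()
                        + "," + offsets[j].getEndOffset() + ")");
            }
        }
    }
}
```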

Stemming at Query time

2005-05-30 Thread Andrew Boyd
Hi All, Now that the QueryParser knows about position increments, has anyone used this to do stemming at query time and not at indexing time? I suppose one would need a reverse stemmer. Given the query breath, it would need to inject breathe, breathes, breathing etc. One benefit is that if you
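Lucene does not ship a reverse stemmer as far as I know, but the position-increment mechanism referred to would be used roughly as sketched below, with a hard-coded word list standing in for the reverse stemmer: a TokenFilter emits extra variant tokens with a position increment of 0, so they are treated as occupying the same position as the original query term.

```java
import java.io.IOException;
import java.util.LinkedList;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

/** Sketch: inject hard-coded variants of "breath" at the same token position. */
public class VariantInjectingFilter extends TokenFilter {
    private final LinkedList pending = new LinkedList();

    public VariantInjectingFilter(TokenStream input) {
        super(input);
    }

    public Token next() throws IOException {
        if (!pending.isEmpty()) {
            return (Token) pending.removeFirst();   // emit queued variants first
        }
        Token token = input.next();
        if (token == null) {
            return null;
        }
        if (token.termText().equals("breath")) {    // a real filter would consult a reverse stemmer here
            String[] variants = { "breathe", "breathes", "breathing" };
            for (int i = 0; i < variants.length; i++) {
                Token v = new Token(variants[i], token.startOffset(), token.endOffset());
                v.setPositionIncrement(0);          // same position as the original term
                pending.add(v);
            }
        }
        return token;
    }
}
```

Wrapping this filter into the query-time Analyzer leaves the index unstemmed while still letting the query match the injected variants.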

Adding to the termFreqVector

2005-05-30 Thread Ryan Skow
How would one go about adding additional terms to a field which is not stored literally, but instead has a termFreqVector? For example: If DocumentA was indexed originally with: myTermField: red green blue termFreqVector would look like: freq {myTermField: red/1, green/1, blue/1} Now,

Re: ACLs and Lucene

2005-05-30 Thread Markus Wiederkehr
On 5/30/05, Robichaud, Jean-Philippe <[EMAIL PROTECTED]> wrote: > What about: > http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/luce > ne/index/ParallelReader.java?rev=169859&view=markup Thank you, this seems to be exactly what I am looking for. One thing I don't quite und

RE: ACLs and Lucene

2005-05-30 Thread Robichaud, Jean-Philippe
What about: http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/luce ne/index/ParallelReader.java?rev=169859&view=markup Jp -Original Message- From: Bruce Ritchie [mailto:[EMAIL PROTECTED] Sent: Monday, May 30, 2005 11:26 AM To: java-user@lucene.apache.org Subject: RE:
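For context, a sketch of how ParallelReader is used once the two indices exist (directory names invented); the hard part, as discussed in the other threads on this page, is keeping the document numbers of the two indices aligned.

```java
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.ParallelReader;
import org.apache.lucene.search.IndexSearcher;

public class AclSearchExample {
    public static IndexSearcher open() throws IOException {
        // Hypothetical layout: "content-index" holds the document bodies,
        // "acl-index" holds only the (frequently rebuilt) ACL field.
        ParallelReader reader = new ParallelReader();
        reader.add(IndexReader.open("content-index"));
        reader.add(IndexReader.open("acl-index"));
        // Document N in the combined view carries the fields of document N from both indices.
        return new IndexSearcher(reader);
    }
}
```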

RE: ACLs and Lucene

2005-05-30 Thread Bruce Ritchie
Markus, > I am working on a Document Management System where every > document has an Access Control List attached to it. Obviously > a search result should only consist of documents that may be > viewed by the currently logged in user. > > I can think of three strategies to accomplish this goa

Clustering Carrot2 vs TermVector Analysis

2005-05-30 Thread Andrew Boyd
Hi All, By using the carrot demo: http://www.newsarch.com/archive/mailinglist/jakarta/lucene/user/msg03928.html I was able to easily cluster search results based on the fields used by carrot (url, title, and summary). However, I was wondering if there was a way to do something similar using t
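For the term-vector side of the question, a sketch of pulling a document's term vector as raw clustering features (assuming term vectors were stored for a field named "content"; the class name is invented):

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;

public class TermVectorFeatures {
    /** Returns term -> frequency for one document, or an empty map if no vector was stored. */
    public static Map termFrequencies(IndexReader reader, int docId) throws IOException {
        Map features = new HashMap();
        TermFreqVector tfv = reader.getTermFreqVector(docId, "content");
        if (tfv != null) {
            String[] terms = tfv.getTerms();
            int[] freqs = tfv.getTermFrequencies();
            for (int i = 0; i < terms.length; i++) {
                features.put(terms[i], new Integer(freqs[i]));
            }
        }
        return features;
    }
}
```

Such maps, one per hit, could then be handed to a clustering algorithm in place of the url/title/summary strings the Carrot2 demo uses.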

RE: ACLs and Lucene

2005-05-30 Thread Max Pfingsthorn
Hi! I've got exactly the same problem. Maybe it is possible to extend the previously discussed patch to fragment the fields of one document into separate files to actually allow updating only one fragment? Then, updating frequently changing fields (like ACLs or other meta data, maybe even a Pag

ACLs and Lucene

2005-05-30 Thread Markus Wiederkehr
I am working on a Document Management System where every document has an Access Control List attached to it. Obviously a search result should only consist of documents that may be viewed by the currently logged in user. I can think of three strategies to accomplish this goal: 1) using Filter and
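The list of strategies is cut off in the preview; for the Filter-based one, a sketch of a common approach under the assumption that each document carries its ACL entries in an indexed keyword field (here called "acl", an invented name): collect the documents whose ACL field contains one of the current user's principals into a BitSet and pass the filter to the search.

```java
import java.io.IOException;
import java.util.BitSet;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.search.Filter;

/** Sketch: allow only documents whose "acl" keyword field contains one of the user's principals. */
public class AclFilter extends Filter {
    private final String[] principals;   // e.g. the user name plus the user's group names

    public AclFilter(String[] principals) {
        this.principals = principals;
    }

    public BitSet bits(IndexReader reader) throws IOException {
        BitSet allowed = new BitSet(reader.maxDoc());
        TermDocs termDocs = reader.termDocs();
        try {
            for (int i = 0; i < principals.length; i++) {
                termDocs.seek(new Term("acl", principals[i]));
                while (termDocs.next()) {
                    allowed.set(termDocs.doc());   // this doc is visible to the current user
                }
            }
        } finally {
            termDocs.close();
        }
        return allowed;
    }
}
```

It would be used as searcher.search(query, new AclFilter(principals)); the resulting filter can also be cached per principal set as long as the index does not change.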