Stefan Raspl/Germany/IBM is out of the office.

2006-06-02 Thread Stefan Raspl
I will be out of the office starting 06/03/2006 and will not return until 06/26/2006. I will respond to your message when I return.

Re: Per-token weighting / attribute data in index

2006-06-02 Thread Marvin Humphrey
On Fri, Jun 02, 2006 at 03:47:10PM -0700, Chris Hostetter wrote: > You may want to check out the java-dev list ... there's been some talk > among the people who really understand the low levels of Lucene's file > formats about adding arbitrary "payload" data with each term/doc pair .. a > proposal t

Re: Per-token weighting / attribute data in index

2006-06-02 Thread Scott Davies
Dang, that's what I was afraid of. Good to hear they're actively considering extensions that'd fix the issue, though. In the meantime I guess I'll try limping along without 'em. Thanks! -- Scott On 6/2/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: : : A simple example would be indexing and

Re: Per-token weighting / attribute data in index

2006-06-02 Thread Chris Hostetter
: : A simple example would be indexing and scoring the hyperlink text from : other web pages that point to the page P that I'm indexing/scoring. I : might have some metric saying how much I "trust" each of the pages or : sites with hyperlinks to P, and want to use that metric to increase or Hmmm.

Re: Per-token weighting / attribute data in index

2006-06-02 Thread Scott Davies
A simple example would be indexing and scoring the hyperlink text from other web pages that point to the page P that I'm indexing/scoring. I might have some metric saying how much I "trust" each of the pages or sites with hyperlinks to P, and want to use that metric to increase or decrease how mu

Re: Per-token weighting / attribute data in index

2006-06-02 Thread Chris Hostetter
I may be misunderstanding your goal. It sounds like what you want to do is say that, for certain documents (which you trust), matching on the title is "worth more" than matching on the title of other documents (which you don't trust). If that's the case, then at index time you can add field boost
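
A minimal sketch of the index-time boost idea described in this reply, assuming a deliberately simplified score of term frequency in the title multiplied by a per-document title boost. It does not call the Lucene API, and real Lucene scoring involves further factors. The class `FieldBoostSketch`, its methods, and the boost values 2.0 and 0.5 are hypothetical names and numbers chosen for illustration.

```java
import java.util.Locale;

class FieldBoostSketch {
    // Count how often the term occurs in the title (lowercased, whitespace-tokenized).
    static int titleTermFreq(String title, String term) {
        int count = 0;
        String wanted = term.toLowerCase(Locale.ROOT);
        for (String token : title.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    // Simplified score: raw title frequency scaled by a boost fixed when the document was indexed.
    static double score(String title, double titleBoost, String term) {
        return titleTermFreq(title, term) * titleBoost;
    }

    public static void main(String[] args) {
        // Same title text, different trust levels (assumed values).
        System.out.println(score("Lucene scoring", 2.0, "lucene")); // trusted document
        System.out.println(score("Lucene scoring", 0.5, "lucene")); // untrusted document
    }
}
```

Because the boost is attached to the whole title field of a document, this approach cannot weight individual tokens differently, which is the limitation the original question is about.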

Per-token weighting / attribute data in index

2006-06-02 Thread Scott Davies
Hi...reasonably experienced web search programmer but total Lucene newbie here. After poking through Lucene for a while, I still haven't figured out a decent way to tweak the scoring based on per-token data. For example, as far as I can tell so far, the only reasonable way to have words in the t

Re: Integrity of Lucene

2006-06-02 Thread Daniel Naber
On Friday 02 June 2006 15:46, Dan Wiggin wrote: > Every time that I add or delete elements in my index I don't have > to close my searcher and reopen to update this. See IndexModifier for how to mix deletions and adds. But you still need to re-open your searchers. Regards Daniel -- ht
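
A minimal sketch of why a searcher has to be re-opened, assuming a searcher behaves as a point-in-time view of the index taken when it was opened. It does not use the Lucene API; `SnapshotSearcherSketch`, `Index`, and `Searcher` are hypothetical stand-ins for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotSearcherSketch {
    // Hypothetical stand-in for an index that accepts new documents.
    static class Index {
        private final List<String> docs = new ArrayList<>();

        void add(String doc) {
            docs.add(doc);
        }

        // Opening a searcher copies the current contents: a point-in-time view.
        Searcher openSearcher() {
            return new Searcher(new ArrayList<>(docs));
        }
    }

    // Hypothetical searcher that only ever sees the snapshot it was opened on.
    static class Searcher {
        private final List<String> snapshot;

        Searcher(List<String> snapshot) {
            this.snapshot = snapshot;
        }

        int countMatches(String term) {
            int hits = 0;
            for (String doc : snapshot) {
                if (doc.contains(term)) {
                    hits++;
                }
            }
            return hits;
        }
    }

    // Returns {hits seen by the stale searcher, hits seen by a re-opened searcher}.
    static int[] staleVersusReopened() {
        Index index = new Index();
        index.add("lucene index");
        Searcher stale = index.openSearcher();
        index.add("lucene searcher"); // added after the first searcher was opened
        Searcher reopened = index.openSearcher();
        return new int[] { stale.countMatches("lucene"), reopened.countMatches("lucene") };
    }

    public static void main(String[] args) {
        int[] hits = staleVersusReopened();
        System.out.println("stale=" + hits[0] + " reopened=" + hits[1]); // prints "stale=1 reopened=2"
    }
}
```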

Re: Krishnendra Nandi is out of the office.

2006-06-02 Thread gekkokid
not again plz - Original Message - From: "Krishnendra Nandi" <[EMAIL PROTECTED]> To: Sent: Friday, June 02, 2006 2:34 PM Subject: Krishnendra Nandi is out of the office. Regarding your message: Re: Num of a term in a Doc I will be out of the office starting 01-Jun-2006 and will n

Re: Ontologies in Lucene???

2006-06-02 Thread adasal
gobe, do you subscribe to the SemWeb and ONTAC mailing lists? Reading these lists, and from other sources, it is apparent that there is a lot of interest in building out ontologies and ontological usages from the small and immediate. There is much discussion of folksonomies and their impact. What

Integrity of Lucene

2006-06-02 Thread Dan Wiggin
I read about concurrency in Lucene, but I'm not sure I understand it well. I can't do delete and add operations simultaneously. If I have a writer that I'm using to add new docs, I can't delete anything in the Lucene index until I close my open writer. Or perhaps Did not close my writer? Every time tha

Krishnendra Nandi is out of the office.

2006-06-02 Thread Krishnendra Nandi
Regarding your message: Re: Num of a term in a Doc I will be out of the office starting 01-Jun-2006 and will not return until 04-Jun-2006. I will respond to your message when I return.

Re: Num of a term in a Doc

2006-06-02 Thread Grant Ingersoll
You can do this a couple of different ways (at least): 1. Use term vectors. See http://www.cnlp.org/apachecon2005 for an intro, search on this list, or look in Erik and Otis's book Lucene in Action. This will be the fastest way, but will require more space in your index to store the term vector
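
A minimal sketch of the idea behind term vectors as they relate to the question, assuming a lowercase, whitespace-based tokenizer. A term vector is in essence a per-document map from term to frequency, so a lookup is cheap once the map exists, at the cost of storing it. This does not use the Lucene API; `TermFreqSketch` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class TermFreqSketch {
    // Build a term -> frequency map for one document's text (a stand-in for a stored term vector).
    static Map<String, Integer> buildTermVector(String text) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    // Look up how often a term occurs in the text; 0 if it never occurs.
    static int termFreq(String text, String term) {
        return buildTermVector(text).getOrDefault(term.toLowerCase(Locale.ROOT), 0);
    }

    public static void main(String[] args) {
        System.out.println(termFreq("Lucene in Action covers Lucene term vectors", "lucene")); // prints 2
    }
}
```

In a real index the map would be built once per document at index time rather than on every lookup, which is where the extra index space mentioned above comes from.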

Re: Problems with Lucene

2006-06-02 Thread Alberto Marqués
I can't make it work, not even by invoking the main method of the demo's SearchFiles class. Has nobody worked with JSF + Lucene? It fails as soon as any Lucene class is used, for example an IndexReader via reader = IndexReader.open(index). >> Problems with Lucene executing from Web with JSF. I do no

Num of a term in a Doc

2006-06-02 Thread Mary S
Hi, I want to get the freq of a term in a Doc. public int termFreq( int docID, String termName ) { IndexReader reader = IndexReader.open(directory); Document doc = reader.document(docID); int FreqForTerm = doc ??? return FreqForTerm; } I didn't find what I want in the archives.

A question on caching filters and architecture

2006-06-02 Thread Marc Dauncey
Hi everyone, just wondered if I could get people's opinions on a design issue. I am designing a system that uses a broker / search server pattern. Each search server will be responsible for searching a particular application that will consist of multiple indexes. Initially, these "servers" will r