Re: Lucene 2.2, NFS, Lock obtain timed out

2007-07-04 Thread Patrick Kimber
Hi Michael Yes, there are many lines in the logs saying: hit FileNotFoundException when loading commit "segment_X"; skipping this commit point ...so it looks like the new code is working perfectly. I am sorry to be vague... but how do I check which segments file is opened when a new writer is cr

Lucene Indexing and searching - help

2007-07-04 Thread emmettwalsh
Hi, Lucene documentation seems to be very confusing... here is my predicament: I have an object like the following: public class PropertyImpl implements Property { String id; List names = new ArrayList(); String address = ""; String city = ""; String street = "";

problems with deleteDocuments

2007-07-04 Thread Nick Johnson
I'm having several problems with deleting documents with Lucene 2.2. Via the IndexWriter, I can successfully delete a document by its primary key via a Term, but ONLY if the field was stored as Field.Index.UN_TOKENIZED. If it was stored as TOKENIZED, the debug output says it is deleting the do

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-07-04 Thread Michael McCandless
"Patrick Kimber" <[EMAIL PROTECTED]> wrote: > Yes, there are many lines in the logs saying: > hit FileNotFoundException when loading commit "segment_X"; skipping > this commit point > ...so it looks like the new code is working perfectly. Super! > I am sorry to be vague... but how do I check wh

Re: Modify search results

2007-07-04 Thread Robert Mullin
Hoss, Thanks for your reply. You are correct. I am working with the Lucene Demo and trying to get some traction. But without much luck. Most postings on the list are way beyond me. I continue to research the literature in order to find something that will bring me gently forward so that I ca

Index partitioning by term

2007-07-04 Thread Ndapa Nakashole
I am considering using Lucene in my mini Grid-based search engine. I would like to partition my index by term as opposed to partition by document. From what I have read in the mailing list so far, it seems like partition by term is impossible with Lucene. Am I right to conclude this? I know Nutch

Re: Pagination

2007-07-04 Thread markharw00d
It looks like we may have different cases. I was hoping to answer the original question which was how to retrieve pages of matching documents from a Lucene index (no database mentioned). >>So far worked just fine. I have 5000 rows of items and I think will still work fine later when I'd have

Re: problems with deleteDocuments

2007-07-04 Thread Yonik Seeley
Nick, are you opening a new IndexSearcher after you close the IndexWriter? -Yonik On 7/4/07, Nick Johnson <[EMAIL PROTECTED]> wrote: I'm having several problems with deleting documents with Lucene 2.2. Via the IndexWriter, I can successfully delete a document by its primary key via a Term, but

Re: Lucene Indexing and searching - help

2007-07-04 Thread Erick Erickson
A couple of things would help us help you 1> tell us what you're trying to do. What's the point of your code? Offhand, I can't tell what it is you're really after. 2> post an example of query.toString(); along with your sample for one of the offending queries. 3> Post the query stri

Re: Index partitioning by term

2007-07-04 Thread Mathieu Lecarme
Ndapa Nakashole wrote: > I am considering using Lucene in my mini Grid-based search engine. I would like to partition my index by term as opposed to partition by document. From what I have read in the mailing list so far, it seems like partition by term is impossible with Lucene. am

Re: Modify search results

2007-07-04 Thread Erick Erickson
First, get Luke (google lucene, luke). Use it to open the index created by the demo (I confess I don't know if the index is a RAMDirectory or an FSDirectory. If it's a RAMDirectory, find the code in the demo that opens it and change it to an FSDirectory). This is important as it'll give you a clue about the structure of an

Re: problems with deleteDocuments

2007-07-04 Thread Erick Erickson
This is exactly the behavior I'd expect. Consider what would happen otherwise. Say you have documents with the following values for a field (call it blah): "some data", "some data I put in the index", "lots of data", "data". Then I don't want deleting on the term blah:data to remove all of them. Which seems

Re: problems with deleteDocuments

2007-07-04 Thread Nick Johnson
I am. On Wed, 4 Jul 2007, Yonik Seeley wrote: > Nick, are you opening a new IndexSearcher after you close the IndexWriter? -- "Courage isn't just a matter of not being frightened, you know. It's being afraid and doing what you have to do anyway." Doctor Who - Planet of the Daleks This messa

Re: problems with deleteDocuments

2007-07-04 Thread Nick Johnson
I think I follow you. I don't have a problem with storing something like a primary key as UN_TOKENIZED, though I'm a bit baffled about why it didn't work as TOKENIZED, since the _only_ thing in that field is the value of the primary key (ie, the string value of some integer). It seems like it

Re: problems with deleteDocuments

2007-07-04 Thread Erick Erickson
See below On 7/4/07, Nick Johnson <[EMAIL PROTECTED]> wrote: I think I follow you. I don't have a problem with storing something like a primary key as UN_TOKENIZED, though I'm a bit baffled about why it didn't work as TOKENIZED, since the _only_ thing in that field is the value of the primary

Re: problems with deleteDocuments

2007-07-04 Thread Nick Johnson
A little more digging and I found the problem (amazing what coffee can do). It was a bad assertion in my unit test. Basically I was checking to see that the article was indexed after the update, but didn't check to see whether it was indexed BEFORE the update. It wasn't. Or rather, it was,

Re: product based term combination for BooleanQuery?

2007-07-04 Thread Tim Sturge
:-) The use of wikipedia data here is no secret; it's all over www.freebase.com. I just hoped to avoid being sucked into a "what is the best way to index wikipedia with Lucene?" discussion, which I believe several other groups are already tackling. At index time, I used a per document boost (o

Re: product based term combination for BooleanQuery?

2007-07-04 Thread Erick Erickson
It looks like you may already be aware of this, but if not, this is something Chris H. posted quite a while ago that I found useful <<>> Erick On 7/4/07, Tim Sturge <[EMAIL PROTECTED]> wrote: :-) The use of wikipedia data here is no secret; it's all over www.freebase.com. I just hoped to

Re: product based term combination for BooleanQuery?

2007-07-04 Thread Mike Klaas
On 3-Jul-07, at 4:43 PM, Tim Sturge wrote: Here's the explain output I currently get for "George Bush" "George W Bush", "John Kerry" "John Denver" and "John Bush". (there are others in between, but they follow very much the same pattern; an enormous score for one of "John" or "Bush" and a v

Re: Index partitioning by term

2007-07-04 Thread Mike Klaas
On 4-Jul-07, at 5:31 AM, Ndapa Nakashole wrote: I am considering using Lucene in my mini Grid-based search engine. I would like to partition my index by term as opposed to partition by document. From what I have read in the mailing list so far, it seems like partition by term is impossible