RE: Clustering Lucene with 40 Servers

2006-12-27 Thread Adam Fleming
Hello, I saw that Doug Cutting had an interesting solution for his Technorati website: http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg12709.html It sounds like it's a single-writer, many readers type of system, but quite robust and efficient. Cheers, Adam ---

Re: Paging Lucene Results

2006-12-27 Thread Erik Hatcher
On Dec 28, 2006, at 12:02 AM, Peter W. wrote: I'm trying to iterate or page through Lucene document hits results. Before reinventing this, is there an existing solution out there or in Solr? There really isn't much wheel to reinvent... you can "page" through Hits by simply starting at any
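The paging idea in the reply above comes down to offset arithmetic over an already ranked result list. Below is a minimal sketch of that arithmetic, assuming an in-memory `List` as a stand-in for the search results; `HitPager` and `page` are hypothetical names, not part of Lucene. With the `Hits` class of that era, the same loop bounds would presumably be applied to `hits.length()` and `hits.doc(i)`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: slices a ranked result list into fixed-size pages.
final class HitPager {

    /** Returns the results for a zero-based page, or an empty list past the end. */
    static <T> List<T> page(List<T> results, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so large page numbers cannot overflow.
        long start = (long) pageIndex * pageSize;
        if (start >= results.size()) {
            return new ArrayList<>();
        }
        int end = (int) Math.min(start + pageSize, results.size());
        return new ArrayList<>(results.subList((int) start, end));
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            results.add("doc-" + i);
        }
        // Third page of ten: only the last five results remain.
        System.out.println(page(results, 2, 10));
    }
}
```

Re-running the search for each page and slicing at the requested offset keeps the server stateless, at the cost of repeating the query.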

Re: Nested Queries

2006-12-27 Thread Kapil Chhabra
Hi Steve, Thanks for the response. Actually I am not looking for a query language. My question is, whether Lucene supports Nested Queries or self joins? As per http://lucene.apache.org/java/docs/api/org/apache/lucene/queryParser/QueryParser.html In BNF, the query grammar is: Query ::= ( Cl

Paging Lucene Results

2006-12-27 Thread Peter W.
Hello, I'm trying to iterate or page through Lucene document hits results. Before reinventing this, is there an existing solution out there or in Solr? Thanks in advance, Peter

Re: Clustering Lucene with 40 Servers

2006-12-27 Thread Steve Harris
Some quick questions/points: What is the update rate? The number of nodes you described is no problem, the query rate would be no problem too (because they use read locks and act independently). Do all nodes do updates or just 1? How often do these updates occur? Probably best thing to do is g

Re: Clustering Lucene with 40 Servers

2006-12-27 Thread Chris Lu
Simply using NAS as just another file directory will cause these locks. You need to use your own logic to control when to re-open the index reader. I think you can look into Nutch's distributed file system to see whether that can help. -- Chris Lu - Instant Full-Text Searc

Re: Clustering Lucene with 40 Servers

2006-12-27 Thread Levent Bayindir
I am new to this too, but my plan is something like this: I will use an online and an offline index. The online index will be presented to search engine users and the offline index will be updated continuously. From time to time the offline index will be written over the online index. (When update is considered to
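The online/offline scheme described above can be sketched as a snapshot swap. This is only an illustration under stated assumptions: the "index" here is a plain in-memory map from term to document ids, and `SwappableIndex`, `add`, `publish` and `search` are hypothetical names, not Lucene API. With real Lucene indexes the publish step would instead copy index files and re-open the reader.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for an online/offline index pair.
final class SwappableIndex {
    // Snapshot that searches read; replaced as a whole on publish().
    private final AtomicReference<Map<String, Set<Integer>>> online =
            new AtomicReference<>(Collections.<String, Set<Integer>>emptyMap());
    // Working copy that receives updates continuously.
    private final Map<String, Set<Integer>> offline = new HashMap<>();

    /** Adds a document to the offline copy only; searches do not see it yet. */
    synchronized void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                offline.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Copies the offline state and makes the copy the new online snapshot. */
    synchronized void publish() {
        Map<String, Set<Integer>> copy = new HashMap<>();
        for (Map.Entry<String, Set<Integer>> e : offline.entrySet()) {
            copy.put(e.getKey(), Collections.unmodifiableSet(new TreeSet<>(e.getValue())));
        }
        online.set(Collections.unmodifiableMap(copy));
    }

    /** Searches the online snapshot without taking any lock. */
    Set<Integer> search(String term) {
        return online.get().getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        SwappableIndex index = new SwappableIndex();
        index.add(1, "clustering lucene");
        System.out.println(index.search("lucene")); // nothing published yet
        index.publish();
        System.out.println(index.search("lucene")); // now visible to searches
    }
}
```

Because readers only ever see a complete, immutable snapshot, they never contend with the writer, which is the property the single-writer designs in this thread are after.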

Re: Modelling Relational Lucene Index

2006-12-27 Thread Erick Erickson
One other note. If you do NOT store the article text, you can still search it but your index size for storing the text data will be MUCH smaller. This requires that you have access to the actual text somewhere in order to be able to return it to the user, but it's a possibility. The scenario runs

Re: help finding docs, creating analyzer objects

2006-12-27 Thread Grant Ingersoll
On Dec 26, 2006, at 11:57 PM, Erik Hatcher wrote: A definition of vocabulary fits perfectly on the wiki. I started http://wiki.apache.org/jakarta-lucene/ConceptsAndDefinitions which is linked to from the main Wiki page. I started w/ the basics that Eric asked about and are very brief a

Re: Nested Queries

2006-12-27 Thread Steven Rowe
Hi Kapil, Kapil Chhabra wrote: > Just to mention, I have tokenized FIELD2 on "," and indexed it. > > FIELD2:3 should return 1,2 > FIELD2:(FIELD2:3) should return something like the output of: > > *FIELD2: 1 OR FIELD2: 2 Given your data table, I assume you mean: FIELD1:3 should return 1,2

RE: Clustering Lucene with 40 Servers

2006-12-27 Thread Biggy
Well try having say 30 servers try to write in the index at the same time and 10 others to read. You'll get enough locks to make a grown man cry. :) Scott Sellman wrote: > > Sorry if this seems naïve (I am new to Lucene), but why not keep one copy > of the Lucene index on a NAS and have it sha

RE: Clustering Lucene with 40 Servers

2006-12-27 Thread Scott Sellman
Sorry if this seems naïve (I am new to Lucene), but why not keep one copy of the Lucene index on a NAS and have it shared by all servers? -Original Message- From: Biggy [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 27, 2006 7:57 AM To: java-user@lucene.apache.org Subject: Cluster

Modelling Relational Lucene Index

2006-12-27 Thread Harini Raghavan
Hi Erick, Thank you for the detailed response. First I would like to mention that my application has an index with company id & name indexed for article for the following reasons: 1. A search interface where we search across articles and companies. 2. Paging - I need to page the results after

Re: toomanyclauses exception

2006-12-27 Thread Paul Elschot
On Wednesday 27 December 2006 16:53, Erick Erickson wrote: ... > 3> Look over the SrndQuery classes. I don't fully understand these, but they > certainly behave much differently in this area. Note that SrndQuery limits > wildcards to having at least three non-wildcard characters. In Lucene, the li

Clustering Lucene with 40 Servers

2006-12-27 Thread Biggy
I'm currently investigating the best ways of clustering Lucene. I've heard of both Solr and Terracotta but do not know how well they scale. Their examples talk of a 4 node cluster. This is way too small for my needs. I have 30x JVMs each handling 3 requests/sec and each having their own Lucene index

Re: toomanyclauses exception

2006-12-27 Thread Erick Erickson
Also, see the thread on this list titled "I just don't get wildcards at all" to see an extensive discussion of this issue, as well as wildcards in general. You might also search the archive for wildcards. The short form is that any wildcard (including prefix queries) expands under the covers to cr
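To illustrate why wildcard and prefix terms can trigger this exception, here is a rough simulation of the expansion: every indexed term that matches the prefix becomes one clause, and the query fails once a clause limit is exceeded. The sketch assumes a sorted in-memory term set; `PrefixExpansion` and `expand` are hypothetical names, and the limit of 1024 mirrors what is, to my knowledge, the default maximum clause count of Lucene's `BooleanQuery`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical simulation of prefix-query expansion into one clause per term.
final class PrefixExpansion {
    // Assumed to match the default clause limit of BooleanQuery.
    static final int MAX_CLAUSES = 1024;

    /** Collects every term starting with the prefix, failing past maxClauses. */
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // Matching terms are contiguous in sorted order, starting at the prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() == maxClauses) {
                throw new IllegalStateException("too many clauses: more than " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
                "event", "events", "eventual", "exhibit", "sale", "sales"));
        System.out.println(expand(terms, "event", MAX_CLAUSES));
    }
}
```

A short prefix over a large vocabulary can match thousands of terms, so a query with several such prefixes reaches the limit quickly; longer prefixes or a higher limit are the usual trade-offs.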

Re: toomanyclauses exception

2006-12-27 Thread Paul Elschot
Chris, On Wednesday 27 December 2006 15:42, Chris Salem wrote: > Hi All, > > I'm getting a 'TooManyClauses' Exception and I'm not sure how to fix this. Here's a sample query that I'm using: > > +(+freeform_text:exhibit* +(+freeform_text:dispaly +freeform_text:event*) +(+freeform_text:sale* +f

toomanyclauses exception

2006-12-27 Thread Chris Salem
Hi All, I'm getting a 'TooManyClauses' Exception and I'm not sure how to fix this. Here's a sample query that I'm using: +(+freeform_text:exhibit* +(+freeform_text:dispaly +freeform_text:event*) +(+freeform_text:sale* +freeform_text:sells +freeform_text:develop*) +(+freeform_text:trade +freef

Re: Nested Queries

2006-12-27 Thread Grant Ingersoll
Hi Kapil, I am not sure exactly what you are asking, could you give an example of the correct response? Also, are you truly using numbers or are they just substitutes for text? And are they part of a bigger problem requiring Lucene? If it is just numbers, maybe a DB might be the better way