Re: Replicating Lucene Index with out SOLR

2008-08-27 Thread rahul_k123
Can I make use of the Solr scripts for this purpose? The snapinstaller runs on the slave after a snapshot has been pulled from the master. This signals the local Solr server to open a new index reader, then auto-warming of the cache(s) begins (in the new reader), while other requests continue to be

Re: Re: Replicating Lucene Index with out SOLR

2008-08-27 Thread Chris Hostetter
This is the 30th or so message sent to the java-user mailing list by your Out Of Office Auto-Reply system. I suspect at least 30 more will be sent before you get back to the office on Sept 2nd. Please temporarily unsubscribe yourself from Lucene mailing lists the next time you go on vacation

Re: Re: Re: Replicating Lucene Index with out SOLR

2008-08-27 Thread tom
AUTOMATIC REPLY Tom Roberts is out of the office till 2nd September 2008. LUX reopens on 1st September 2008

Re: Replicating Lucene Index with out SOLR

2008-08-27 Thread Otis Gospodnetic
You don't need to copy the whole index every time if you do incremental indexing/updates and don't optimize the index before copying. If you use rsync for copying the index, only the new/modified files will be copied. This is what Solr replication scripts do, too. Otis -- Sematext -- http://semate
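
A minimal sketch of the incremental-copy idea above, written in Python rather than as an rsync invocation. It mimics rsync's default quick check (copy a file only when it is missing at the destination or its size or modification time differs). The function names `files_to_copy` and `sync` are made up for illustration, and unlike `rsync --delete` the sketch never removes files that have disappeared from the source.

```python
import os
import shutil
import tempfile


def files_to_copy(src_dir, dst_dir):
    """Return names of files in src_dir that are missing from dst_dir
    or differ from the destination copy in size or modification time."""
    pending = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue
        dst = os.path.join(dst_dir, name)
        if not os.path.exists(dst):
            pending.append(name)
            continue
        s, d = os.stat(src), os.stat(dst)
        if s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime):
            pending.append(name)
    return pending


def sync(src_dir, dst_dir):
    """Copy only the pending files; copy2 preserves modification times."""
    copied = files_to_copy(src_dir, dst_dir)
    for name in copied:
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return copied
```

With an unoptimized index, an incremental update mostly adds new files, so a second `sync` call after such an update should return only those new names.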

Re: Replicating Lucene Index with out SOLR

2008-08-27 Thread Kent Fitch
Check out this recipe for using rsync by Doug Cutting: http://www.mail-archive.com/[EMAIL PROTECTED]/msg12709.html Kent Fitch On Thu, Aug 28, 2008 at 1:38 PM, rahul_k123 <[EMAIL PROTECTED]> wrote: > > I have the following requirement > > Right now we have multiple indexes serving our web applica

Replicating Lucene Index with out SOLR

2008-08-27 Thread rahul_k123
I have the following requirement. Right now we have multiple indexes serving our web application. Our indexes are around 30 GB in size. We want to replicate the index data so that we can use them to distribute the search load. This is what we need ideally. A – (supports writes and reads) A1 –Rep

Re: Combining Wildcard and Term Queries?

2008-08-27 Thread Chris Hostetter
: > That sounds like what I'm after - but how do I get hold of the : > IndexReader so I can call IndexReader.terms(Term) ? : > The code where I am doing this work is getFieldQuery(String field, : > String queryText) of my custom query parser ... : : QueryParser indeed doesn't know about IndexSear

Re: Lucene sample code and api documentation

2008-08-27 Thread Otis Gospodnetic
Sithu, Old emails: markmail.org Sample code: Lucene in Action has free downloadable code -- manning.com/hatcher2 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: "Sudarsan, Sithu D." <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Se

Re: Replicating Lucene Index with out SOLR

2008-08-27 Thread Otis Gospodnetic
Hi, You may want to ask on the java-user list (more subscribers), which I'm CC-ing, so we can continue discussion there. I think you will have to implement your own logic that runs on A and does something like this: - stop adding new docs - call commit on the IndexWriter - copy the index - res
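
A rough sketch of the steps visible in the message above (stop adding docs, commit, copy the index), in Python with a toy stand-in for the writer. Everything here is hypothetical: `IndexOwner`, its methods, and the `part_N.txt` file naming are illustration only and are not Lucene APIs. The lock simply models "no adds while the copy runs".

```python
import os
import shutil
import tempfile
import threading


class IndexOwner:
    """Hypothetical stand-in for the process on A that owns the writer."""

    def __init__(self, index_dir):
        self.index_dir = index_dir
        self._lock = threading.Lock()  # held while adding docs or snapshotting
        self._buffered = []            # docs not yet flushed to disk
        self._generation = 0

    def add(self, doc):
        with self._lock:
            self._buffered.append(doc)

    def _commit(self):
        # Stand-in for committing the writer: flush buffered docs to a new file.
        if not self._buffered:
            return
        self._generation += 1
        path = os.path.join(self.index_dir, "part_%d.txt" % self._generation)
        with open(path, "w") as f:
            f.write("\n".join(self._buffered))
        self._buffered = []

    def snapshot(self, snapshot_dir):
        # Holding the lock blocks add(), so no docs arrive during the copy.
        with self._lock:
            self._commit()
            shutil.copytree(self.index_dir, snapshot_dir)
        # The lock is released here, so adds can proceed again.
```

The snapshot directory can then be shipped to the replicas with whatever copy mechanism is in use.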

Re: Can TermDocs.skipTo() go backwards

2008-08-27 Thread Antony Bowesman
Michael McCandless wrote: Ahh right, my short term memory failed me ;) I now remember this thread. Excused :) I expect you have real work to occupy your mind! Yes, though LUCENE-1231 (column stride stored fields) should help this. I see from JIRA that MB has started working on this - It's

Clarity: Is there a Query boosting 50-50 over 1000-1 ?

2008-08-27 Thread Shi Hui Liu
Hi, I think I should clarify my question a little bit. I'm using BooleanQuery to combine TermQuery(A) and TermQuery(B). But I'm not satisfied with its scoring algorithm. Are there other queries that can boost the documents with 50 of A and 50 of B above documents with 1000 of A and 1 of B? An

Re: Can TermDocs.skipTo() go backwards

2008-08-27 Thread Michael McCandless
Antony Bowesman wrote: Michael McCandless wrote: TermDocs.skipTo() only moves forwards. Can you use a stored field to retrieve this information, or do you really need to store it per-term-occurrence in your docs? I discussed my use case with Doron earlier and there were two options, eith

Re: Case Sensitivity

2008-08-27 Thread Michael McCandless
OK I'll open an issue to do this renaming in 3.0, which actually means we do the renaming in 2.4 or 2.9 (deprecating the old ones) and then remove the old ones in 3.0. Mike On Aug 27, 2008, at 11:08 AM, Otis Gospodnetic wrote: Nah, I think the names are fine, I simply forgot. I looked at t

We need Java Developer with Lucene experience in SSFO, CA

2008-08-27 Thread Neetu
Contractor position in SSF (South San Francisco). We are looking for an excellent Java developer who loves solving difficult problems. The Project itself is innovative and fun. Required Qualifications: - At least 5 years in J2EE development, in real demo-able projects - Computer Scien

Re: Can TermDocs.skipTo() go backwards

2008-08-27 Thread Antony Bowesman
Michael McCandless wrote: TermDocs.skipTo() only moves forwards. Can you use a stored field to retrieve this information, or do you really need to store it per-term-occurrence in your docs? I discussed my use case with Doron earlier and there were two options, either to use payloads or stor

Re: lucene 3.0 feature list?

2008-08-27 Thread Grant Ingersoll
See http://wiki.apache.org/lucene-java/BackwardsCompatibility Generally speaking 3.0 will be the same as 2.9 minus the deprecated methods (and, in this case, the upgrade to JDK 1.5). That is not to say that the file formats, etc. under the hood won't change and that there won't be new meth

Re: lucene 3.0 feature list?

2008-08-27 Thread Darren Govoni
I understand that the API will be 2.4 compatible. That's not really a feature. I can go hunt through JIRA, but was wondering if there is a clean bulleted list of 'Hey! Here's what's great about 3.0'. Things like maybe, performance related, new analyzers, query semantics, searchers, writers, etc. Ju

Lucene sample code and api documentation

2008-08-27 Thread Sudarsan, Sithu D.
Hi All, I'm new to Lucene. 1. Could you please tell me where we can see the old emails (even one day old), not as an archived file but as a mailing list? 2. Where do we look for sample code? Or detailed tutorials? 3. I found one at LuceneTutorial.com, but it is only for command line. Not

Re: lucene 3.0 feature list?

2008-08-27 Thread Andre Rubin
So, you mean you're gonna be removing the deprecated methods from the API? Andre On Tue, Aug 26, 2008 at 3:59 PM, Karl Wettin <[EMAIL PROTECTED]> wrote: > > On 27 Aug 2008 at 00:52, Darren Govoni wrote: > > Hi, >> Sorry if I missed this somewhere or maybe it's not released yet, but I >> was anxiou

Is there a Query boosting 50-50 over 1000-1 ?

2008-08-27 Thread Shi Hui Liu
Hi, Say I have a query with two terms, A + B. I want to return the documents with 50 of A and 50 of B above documents with 1000 of A and 1 of B. Is there an existing Query class that can handle this case, or do I have to implement a new Query? Thank you, Shi Hui
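
A small worked example, in Python, of why an additive combination ranks the 1000-1 document first, and of one possible alternative. It assumes only the square-root term-frequency damping used by Lucene's DefaultSimilarity and ignores idf, norms, coord and boosts. `balanced_score` (a geometric mean) is a hypothetical combination for illustration, not an existing Query class.

```python
import math


def tf(freq):
    # Square-root term-frequency damping, as in DefaultSimilarity.tf().
    return math.sqrt(freq)


def sum_score(freq_a, freq_b):
    # Additive combination: roughly how a two-clause BooleanQuery
    # adds up the scores of its clauses.
    return tf(freq_a) + tf(freq_b)


def balanced_score(freq_a, freq_b):
    # Hypothetical alternative: the geometric mean rewards documents
    # that contain both terms in quantity.
    return math.sqrt(tf(freq_a) * tf(freq_b))


for a, b in [(50, 50), (1000, 1)]:
    print(a, b, round(sum_score(a, b), 2), round(balanced_score(a, b), 2))
```

Under these simplifications the sum gives about 14.14 for 50-50 and 32.62 for 1000-1, while the geometric mean gives about 7.07 and 5.62, so only the latter puts the balanced document first.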

Re: Case Sensitivity

2008-08-27 Thread Otis Gospodnetic
Nah, I think the names are fine, I simply forgot. I looked at the javadocs, it clearly says NO_NORMS doesn't get passed through an Analyzer. Maybe in 3.0 we can switch to NOT_ANALYZED, as suggested, to reflect reality more closely. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - N

Re: lucene 3.0 feature list?

2008-08-27 Thread Grant Ingersoll
On Aug 26, 2008, at 6:59 PM, Karl Wettin wrote: On 27 Aug 2008 at 00:52, Darren Govoni wrote: Hi, Sorry if I missed this somewhere or maybe it's not released yet, but I was anxiously curious about Lucene 3.0's expected features/improvements. Is there a list yet? If everything goes as planne

Re: Case Sensitivity

2008-08-27 Thread Michael McCandless
Or ... split the two notions apart so that you have Field.Index.[UN_]ANALYZED and, separately, Field.Index.[NO_]NORMS which could then be combined together in all 4 combinations (we'd have to fix the Parameter class to let you build up a new Parameter by combining existing ones...). I t
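
The idea of two independent options can be sketched as follows, in Python rather than against the real Field.Index API. `IndexOptions` and its toy "analyzer" (lowercase, split on whitespace) are assumptions made for illustration. The point is only that analysis and norms are orthogonal, which yields four combinations.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class IndexOptions:
    analyzed: bool  # does the value pass through an analyzer?
    norms: bool     # are norms stored for the field?

    def terms(self, value):
        # Toy analyzer: lowercase and split on whitespace.
        # An un-analyzed field indexes the whole value as a single term.
        return value.lower().split() if self.analyzed else [value]


# All four combinations of the two independent flags.
ALL_COMBINATIONS = [IndexOptions(a, n) for a, n in product((True, False), repeat=2)]

for opts in ALL_COMBINATIONS:
    print(opts, opts.terms("Hello World"))
```

In this picture, the existing NO_NORMS constant discussed in this thread corresponds to `IndexOptions(analyzed=False, norms=False)`: the value is indexed verbatim, so matching against it is case-sensitive.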

Re: Index types

2008-08-27 Thread John Patterson
It is a blurry line between the need to use a DBMS and Lucene. For now Lucene works fine for my search needs and I use Terracotta to persist application state. So no need for a DBMS at all currently - although under the hood Terracotta uses BDB JE. Does Solr's range implementation use the large

Re: Index types

2008-08-27 Thread Karsten F.
Hi John, about "integration other index implementation": Sounds like you need a DBMS with some lucene features. There was a post about using lucene in Oracle: http://www.nabble.com/Using-lucene-as-a-database...-good-idea-or-bad-idea--to18703473.html#a18741137 and http://www.nabble.com/Oracle-and-

Re: Case Sensitivity

2008-08-27 Thread Daniel Naber
On Wednesday, 27 August 2008, Michael McCandless wrote: > Probably we should rename it to Field.Index.UN_TOKENIZED_NO_NORMS? I think it's enough if the API doc explains it, no need to rename it. What's more confusing is that (UN_)TOKENIZED should actually be called (UN_)ANALYZED IMHO. Regards

RE: Case Sensitivity

2008-08-27 Thread Dino Korah
Thanks Otis & Mike. Probably we should keep it the way it is now. It would be better to include more information on the various combinations of these options and their effect on the final result (the set of terms that get to the index). It would be nicer if we could mention the search scenario as well. To be

Re: Index types

2008-08-27 Thread Karl Wettin
On 27 Aug 2008 at 11:11, John Patterson wrote: Hi, I know that Lucene uses an inverted index which makes range queries and greater-than/less-than type queries very slow for continuous data types like times, latitude, etc. Last time I looked they were converted into huge OR queries and so ha

Re: Upgrading from v2.2.0 to v2.3.2

2008-08-27 Thread Michael McCandless
Mark Lassau wrote: Mike, Thanks for the prompt response. Michael McCandless wrote: Mark Lassau wrote: I am a developer on the JIRA Issue tracker, and we are considering upgrading our Lucene version from v2.2.0 to v2.3.2. I have been charged with doing the risk analysis, and project work

Re: Can TermDocs.skipTo() go backwards

2008-08-27 Thread Michael McCandless
TermDocs.skipTo() only moves forwards. Can you use a stored field to retrieve this information, or do you really need to store it per-term-occurrence in your docs? Mike Antony Bowesman wrote: I have a custom TopDocsCollector and need to collect a payload from each final document hit. Th
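
Because the iterator only moves forwards, one way to fetch per-document data after collection is to visit the final hits in ascending document order. The Python sketch below uses a toy postings list. `ForwardOnlyPostings` and `payloads_for_hits` are hypothetical names, and `skip_to` is simplified compared with the real TermDocs.skipTo() (for example, it stays put when already positioned on the target).

```python
class ForwardOnlyPostings:
    """Toy postings list for one term: (doc_id, payload) pairs in doc order."""

    def __init__(self, entries):
        self._entries = sorted(entries)
        self._pos = -1

    def skip_to(self, target):
        """Advance to the first entry with doc_id >= target; never moves back."""
        self._pos = max(self._pos, 0)
        while (self._pos < len(self._entries)
               and self._entries[self._pos][0] < target):
            self._pos += 1
        return self._pos < len(self._entries)

    def doc(self):
        return self._entries[self._pos][0]

    def payload(self):
        return self._entries[self._pos][1]


def payloads_for_hits(postings, hit_doc_ids):
    """Visit hits in ascending doc order so a single forward pass suffices."""
    found = {}
    for doc_id in sorted(hit_doc_ids):
        if postings.skip_to(doc_id) and postings.doc() == doc_id:
            found[doc_id] = postings.payload()
    return found
```

Hits usually come back in score order, so sorting them by document number first is what makes the single pass possible.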

Re: Case Sensitivity

2008-08-27 Thread Michael McCandless
Actually, as confusing as it is, Field.Index.NO_NORMS means Field.Index.UN_TOKENIZED plus field.setOmitNorms(true). Probably we should rename it to Field.Index.UN_TOKENIZED_NO_NORMS? Mike Otis Gospodnetic wrote: Dino, you lost me half-way through your email :( NO_NORMS does not mean the

Index types

2008-08-27 Thread John Patterson
Hi, I know that Lucene uses an inverted index which makes range queries and greater-than/less-than type queries very slow for continuous data types like times, latitude, etc. Last time I looked they were converted into huge OR queries and so had a maximum clause limit. I was wondering if any wor
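
To illustrate the expansion described above, here is a Python sketch that turns a range into one OR clause per indexed term and fails once the clause limit is exceeded. The function `expand_range` and the local `TooManyClauses` exception are stand-ins for illustration. The limit of 1024 matches BooleanQuery's default maximum clause count, and the terms are zero-padded strings because index terms sort lexicographically.

```python
import bisect

MAX_CLAUSES = 1024  # BooleanQuery's default maximum clause count


class TooManyClauses(Exception):
    """Raised when a range expands to more clauses than allowed."""


def expand_range(sorted_terms, lower, upper, max_clauses=MAX_CLAUSES):
    """Return the indexed terms in [lower, upper], i.e. the OR clauses."""
    start = bisect.bisect_left(sorted_terms, lower)
    end = bisect.bisect_right(sorted_terms, upper)
    clauses = sorted_terms[start:end]
    if len(clauses) > max_clauses:
        raise TooManyClauses(
            "%d terms exceed the limit of %d" % (len(clauses), max_clauses))
    return clauses
```

The cost grows with the number of distinct terms inside the range, which is why continuous values such as timestamps hit the limit quickly unless they are indexed at a coarser granularity.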

Can TermDocs.skipTo() go backwards

2008-08-27 Thread Antony Bowesman
I have a custom TopDocsCollector and need to collect a payload from each final document hit. The payload comes from a single term in each hit. When collecting the payload, I don't want to fetch the payload during the collect() method as it will make fetches which may subsequently be bumped fro