RE: Java Indexer + DotLucene + IIS question

2005-10-31 Thread Wesley MacDonald
Hi, I think it would. Wes. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: October 31, 2005 4:06 PM To: java-user@lucene.apache.org Subject: Re: Java Indexer + DotLucene + IIS question My mistake...he sent this: http://www.nsisoftware.com/ -Original M

Re: Lucene and Sax

2005-10-31 Thread MALCOLM CLARK
Karl, Thanks for your tips. I have considered DOM processing but it seemed to take a hell of a long time to process all the documents (12,125). Malcolm Clark

Re: Lucene and SAX

2005-10-31 Thread MALCOLM CLARK
Grant, Thanks for your tips. I have considered DOM processing but it seemed to take a hell of a long time to process all the documents (12,125).

Re: Lucene and Sax

2005-10-31 Thread MALCOLM CLARK
Grant, Thanks for your help with the problem I was experiencing. I split it all down and realised the problem was the location of the IndexWriting (it was not in the correct place within the SAX processing) and also because of some poor error handling on my part. Kind thanks, Malcolm

Re: Java Indexer + DotLucene + IIS question

2005-10-31 Thread msftblows
My mistake...he sent this: http://www.nsisoftware.com/ -Original Message- From: [EMAIL PROTECTED] To: java-user@lucene.apache.org Sent: Mon, 31 Oct 2005 15:59:27 -0500 Subject: Re: Java Indexer + DotLucene + IIS question My webmaster sent me this: http://www.lyonware.co.uk/DoubleTake/D

Re: Java Indexer + DotLucene + IIS question

2005-10-31 Thread msftblows
My webmaster sent me this: http://www.lyonware.co.uk/DoubleTake/DoubleTake.htm He has used this...can I assume this would work just as well? -Original Message- From: Wesley MacDonald <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Mon, 31 Oct 2005 15:50:46 -0500 Subject: RE:

RE: Java Indexer + DotLucene + IIS question

2005-10-31 Thread Wesley MacDonald
Hi, Replistor makes a byte-to-byte copy from the main PC to the slave machines. We use it to replicate a SQL Server database to a standby PC, and it also syncs our web application changes across the web farm. http://www.legato.com/products/replistor/ Wes. -Origi

Re: complex search

2005-10-31 Thread Volodymyr Bychkoviak
Thanks for the idea... Chris Hostetter wrote: : I want to implement search which in SQL equivalent looks like : select itemId, min(price) from : where : groupBy itemId : Is it possible to achieve? Not easily. The most straightforward approach I can think of is to write your own HitColl

Re: Indexing dates

2005-10-31 Thread Erik Hatcher
Looks like another strange classpath issue. There most certainly is a method with that signature: $ javap -classpath lucene-1.4.3.jar org.apache.lucene.document.Field Compiled from "Field.java" public final class org.apache.lucene.document.Field extends java.lang.Object implements java.io.S

Re: Java Indexer + DotLucene + IIS question

2005-10-31 Thread msftblows
Never seen this product...but in my case I would have the indexer running on one web server, and the second it creates a file, Replistor would take care of getting it to all the other machines...but will there be any issues with compressing etc.? I have come across cases where .cfs files were missing etc.

RE: Java Indexer + DotLucene + IIS question

2005-10-31 Thread Wesley MacDonald
Hi, You might want to use Replistor in a case like this and only have one indexer running, let Replistor manage the copies. Wes. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: October 31, 2005 2:28 PM To: java-user@lucene.apache.org Subject: Java Inde

Re: Indexing

2005-10-31 Thread Chris Hostetter
: I've 4 fields in a document ie. id, URL, modified date, contents. id is : unique for each document. I wanted to know if I index a document with : the same id again , will the previous document (in the index) be : overwritten or do I have to delete the index for that document first and : then re

Re: complex search

2005-10-31 Thread Chris Hostetter
: I want to implement search which in SQL equivalent looks like : select itemId, min(price) from : where : groupBy itemId : Is it possible to achieve? Not easily. The most straightforward approach I can think of is to write your own HitCollector that builds up a Hash of itemId => pri
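A minimal sketch of the HitCollector approach described above, against the Lucene 1.4.x API. The field names "itemId" and "price" are assumptions, and reading stored fields inside collect() is slow for large result sets; this shows the shape of the solution, not a tuned implementation.

import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.document.Document;
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.IndexSearcher;

public class MinPriceCollector extends HitCollector {
  private final IndexSearcher searcher;
  private final Map minPrices = new HashMap();  // itemId -> lowest price seen so far

  public MinPriceCollector(IndexSearcher searcher) {
    this.searcher = searcher;
  }

  public void collect(int doc, float score) {
    try {
      Document d = searcher.doc(doc);
      String itemId = d.get("itemId");
      Float price = Float.valueOf(d.get("price"));
      Float current = (Float) minPrices.get(itemId);
      if (current == null || price.floatValue() < current.floatValue()) {
        minPrices.put(itemId, price);
      }
    } catch (Exception e) {
      throw new RuntimeException(e.getMessage());
    }
  }

  public Map getMinPrices() {
    return minPrices;
  }
}

Usage would be searcher.search(query, new MinPriceCollector(searcher)), then read the itemId-to-price map from getMinPrices().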

Re: Multiple Analyzers

2005-10-31 Thread Chris Hostetter
: understand your recommendations. Is there a way that I can just : incorporate Stem, Soundex, and Standard into one search. In other words, : don't toggle anything. Just index using custom analyzer that contains : Stem, Soundex, and Standard analyzers at once. And search using the custom : an

Java Indexer + DotLucene + IIS question

2005-10-31 Thread msftblows
Hey - I have the following situation and I am looking for any suggestions... First, here is my current configuration: 1. A Java indexer (Windows service) created to index data from a SQL Server database...3 indexes are created. 2. DotLucene is used on the front-end to search my index files...

Indexing dates

2005-10-31 Thread anushri kumar
Hi, I was trying to index dates. I wrote Document doc = new Document(); doc.add(Field.Keyword("indexdate", new Date())); but while running the program it gave me the following error. Exception in thread "main" java.lang.NoSuchMethodError: org.apache.lucene.document.Field.Keyword(Ljava/lang/St
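As Erik notes above, Field.Keyword(String, Date) does exist in lucene-1.4.3.jar, so a NoSuchMethodError at runtime usually points to an older Lucene jar earlier on the classpath. A minimal sketch that compiles against 1.4.3 (the index directory name is an assumption):

import java.util.Date;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class DateIndexExample {
  public static void main(String[] args) throws Exception {
    // create a new index in the "index" directory
    IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);
    Document doc = new Document();
    // Field.Keyword(String, Date) stores the date in DateField's lexicographic form,
    // so it can be used in range queries and sorting
    doc.add(Field.Keyword("indexdate", new Date()));
    writer.addDocument(doc);
    writer.close();
  }
}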

RE: Indexing

2005-10-31 Thread Peter Kim
You need to delete the document from the index and reindex it. This is in the LuceneFAQ: http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-917dd4fc904aa20a34ebd23eb321125bdca1dea2 (or #24 under 3. Indexing) Peter > -Original Message- > From: anushri kumar [mailto:[EMAIL PROTECTED]

Re: Indexing

2005-10-31 Thread Volodymyr Bychkoviak
You have to delete the doc with that id and re-add it. If the index is updated continuously, you should batch your updates (first delete all the old docs, then add the new docs). anushri kumar wrote: Hi, I've 4 fields in a document ie. id, URL, modified date, contents. id is unique for each document. I w
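A minimal sketch of that delete-then-re-add pattern with the 1.4.x API, assuming the "id" field was indexed as a Keyword (untokenized) so the Term matches exactly:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class UpdateById {
  public static void update(String indexDir, String id, Document newDoc) throws Exception {
    // 1. delete every existing document carrying this id
    IndexReader reader = IndexReader.open(indexDir);
    reader.delete(new Term("id", id));
    reader.close();

    // 2. re-add the fresh version; in a continuous-update setup, batch many
    //    deletes first and then many adds rather than alternating per document
    IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), false);
    writer.addDocument(newDoc);
    writer.close();
  }
}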

Indexing

2005-10-31 Thread anushri kumar
Hi, I've 4 fields in a document, i.e. id, URL, modified date, contents. id is unique for each document. I wanted to know if I index a document with the same id again, will the previous document (in the index) be overwritten or do I have to delete the index for that document first and then re in

RE: Help requested

2005-10-31 Thread Peter Kim
I just wanted to clarify... I don't believe the following statement is accurate: > > The "contents" field searches on the entire document, > including all indexes. There is no default field named "contents" that automatically combines the contents of all your fields. As Erik mentioned, you nee
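Two common ways to get that behaviour yourself, sketched against the 1.4.x API; the field names "title" and "body" are illustrative assumptions. Either build your own catch-all field at index time, or search several fields at once with MultiFieldQueryParser.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.search.Query;

public class ContentsFieldExamples {
  // option 1: concatenate the individual fields into an indexed-but-not-stored catch-all
  public static Document withCatchAll(String title, String body) {
    Document doc = new Document();
    doc.add(Field.Text("title", title));
    doc.add(Field.Text("body", body));
    doc.add(Field.UnStored("contents", title + " " + body));
    return doc;
  }

  // option 2: expand the user's query across several fields at search time
  public static Query acrossFields(String userQuery) throws Exception {
    String[] fields = { "title", "body" };
    return MultiFieldQueryParser.parse(userQuery, fields, new StandardAnalyzer());
  }
}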

Re: Multiple Analyzers

2005-10-31 Thread Erik Hatcher
Yes, you can certainly do all three in one shot. Look at the source code to StandardAnalyzer, build a custom analyzer that uses its core tokenization, then filters through the SnowballFilter, and then through a custom soundex filter (or metaphone or similar). That would make for some mighty fu
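A rough sketch of that kind of analyzer against the 1.4.x API. It substitutes the core PorterStemFilter for the Snowball filter, and the soundex/metaphone stage is left as a hypothetical custom TokenFilter; use the same analyzer at both index and query time.

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class StemSoundexAnalyzer extends Analyzer {
  public TokenStream tokenStream(String fieldName, Reader reader) {
    TokenStream result = new StandardTokenizer(reader);   // StandardAnalyzer's core tokenization
    result = new StandardFilter(result);
    result = new LowerCaseFilter(result);
    result = new StopFilter(result, StopAnalyzer.ENGLISH_STOP_WORDS);
    result = new PorterStemFilter(result);                // stemming; SnowballFilter would slot in here
    // result = new SoundexFilter(result);                // hypothetical custom soundex/metaphone filter
    return result;
  }
}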

Re: List of removed stop words?

2005-10-31 Thread jian chen
Hi, In case you are using StandardAnalyzer, there is a stop word list. I have used StandardAnalyzer.STOP_WORDS, which is a String[]. Cheers, Jian On 10/31/05, Rob Young <[EMAIL PROTECTED]> wrote: > > Hi, > > Is there an easy way to list stop words that were removed from a string? > I'm using th
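A small sketch of using that array to tell users which stop words were dropped, assuming a plain whitespace/lowercase split of the raw query string (a real implementation might run the analyzer itself and diff the tokens):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class RemovedStopWords {
  private static final Set STOP = new HashSet(Arrays.asList(StandardAnalyzer.STOP_WORDS));

  // returns the words in the user's query that StandardAnalyzer's stop list would drop
  public static List find(String query) {
    List removed = new ArrayList();
    StringTokenizer tok = new StringTokenizer(query);
    while (tok.hasMoreTokens()) {
      String word = tok.nextToken().toLowerCase();
      if (STOP.contains(word)) {
        removed.add(word);
      }
    }
    return removed;
  }
}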

Re: Multiple Analyzers

2005-10-31 Thread Daniel . Clark
Thanks Erik. So you're saying that my approach won't work, right? I understand your recommendations. Is there a way that I can just incorporate Stem, Soundex, and Standard into one search? In other words, don't toggle anything. Just index using a custom analyzer that contains Stem, Soundex, and

Re: Sentence boundary storage

2005-10-31 Thread Grant Ingersoll
Inline below Chris Hostetter wrote: : Actually, I was thinking of writing something along the lines of : Span*BoundaryQuery where it would be more explicit than what was : described below. You could say SpanSentence and say you want the terms I'm not clear on how such a SpanSentence class wou

Re: List of removed stop words?

2005-10-31 Thread Erik Hatcher
On 31 Oct 2005, at 07:02, Rob Young wrote: Is there an easy way to list stop words that were removed from a string? I'm using the standard analyzer on users' search strings and I would like to let them know when stop words have been removed (a la Google). Any ideas? Nothing automatic with t

List of removed stop words?

2005-10-31 Thread Rob Young
Hi, Is there an easy way to list stop words that were removed from a string? I'm using the standard analyzer on users' search strings and I would like to let them know when stop words have been removed (a la Google). Any ideas? Cheers Rob ---

complex search

2005-10-31 Thread Volodymyr Bychkoviak
Hi all. I have indexed a table from a database into the index and it looks like: itemId is not unique. I want to implement a search which in SQL equivalent looks like select itemId, min(price) from where groupBy itemId. Is it possible to achieve? -- regards, Volodymyr Bychkoviak --

Re: StandardTokenizer throws extra exceptions

2005-10-31 Thread Rob Young
Roxana Angheluta wrote: I had the same problem. I solved it by manually editing the file ParseException.java every time after modifying the .jj file: import java.io.*; public class ParseException extends IOException { It's not the most elegant way to do it; I'm also interested in a more scalable
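The hand edit described above amounts to changing the superclass of the JavaCC-generated class, roughly like this (a sketch of the file header only; the generated constructors and fields stay as-is). Extending IOException keeps the generated exception compatible with TokenStream.next(), which declares only IOException.

import java.io.*;

// ParseException.java as generated by JavaCC from the .jj file, then edited by hand
// so that it extends IOException instead of Exception
public class ParseException extends IOException {
  // ... generated constructors and fields unchanged ...
}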

Re: StandardTokenizer throws extra exceptions

2005-10-31 Thread Roxana Angheluta
Rob Young wrote: Hi, I'm trying to create another, slightly changed, version of StandardAnalyzer. I've copied out the source, edited the .jj file and re-built the StandardTokenizer class. The problem I am facing is, when I have all this in Eclipse it's telling me that the ParseException is

StandardTokenizer throws extra exceptions

2005-10-31 Thread Rob Young
Hi, I'm trying to create another, slightly changed, version of StandardAnalyzer. I've copied out the source, edited the .jj file and re-built the StandardTokenizer class. The problem I am facing is, when I have all this in Eclipse it's telling me that the ParseException is not compatible wi

Re: Lucene and SAX

2005-10-31 Thread Karl Øie
Hi there Malcolm! I can't see anywhere in your source where you add the document id of the document you are parsing. startDocument() should at least add a sys-id field for the XML document being parsed: public void startDocument() { mDocument = new Document(); mDocument.add(new Field(
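A fuller sketch of the kind of handler Karl is describing, with the IndexWriter call placed in endDocument() (the placement issue Malcolm mentions above). The field names and the sys-id value are illustrative assumptions.

import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlIndexHandler extends DefaultHandler {
  private final IndexWriter writer;
  private final String sysId;        // e.g. the file name of the XML document being parsed
  private Document mDocument;
  private StringBuffer text;

  public XmlIndexHandler(IndexWriter writer, String sysId) {
    this.writer = writer;
    this.sysId = sysId;
  }

  public void startDocument() {
    mDocument = new Document();
    mDocument.add(Field.Keyword("sys-id", sysId));   // identifies which XML file this came from
    text = new StringBuffer();
  }

  public void characters(char[] ch, int start, int length) {
    text.append(ch, start, length);
  }

  public void endDocument() throws SAXException {
    mDocument.add(Field.Text("contents", text.toString()));
    try {
      writer.addDocument(mDocument);   // one Lucene Document per XML file, added once parsing ends
    } catch (IOException e) {
      throw new SAXException(e);
    }
  }
}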

Re: Sentence boundary storage

2005-10-31 Thread Chris Hostetter
: Actually, I was thinking of writing something along the lines of : Span*BoundaryQuery where it would be more explicit than what was : described below. You could say SpanSentence and say you want the terms I'm not clear on how such a SpanSentence class would work -- the index must contain info