Re: merge factor and real time indexing

2006-05-30 Thread Otis Gospodnetic
Erick, Otis: What are the rules for reading an index that's being modified? Here's the sequence that I'm unclear about. open reader0 open writer add document 1 ***at this point, am I right that document1 is not visible to reader0? OG: correct. close reader0 open reader1 ***Now, is document 1 vis

Re: merge factor and real time indexing

2006-05-30 Thread Erick Erickson
John: I realize this is test code, but is it really measuring something meaningful? If you really need to open/modify/close for each document, then I've got to assume that the rate they're coming in is slow enough that the time it takes for this operation is not very important. And if they *are*

Re: merge factor and real time indexing

2006-05-30 Thread John Wang
Hi Otis: Thanks for your reply. The problem is the setting of maxBufferedDocs (see the code snippet from my previous email). I wouldn't think it would affect things but it does. -John On 5/30/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: John, I can't spot anything wrong in the code.

RE: Changing the scoring (newest doc date first)

2006-05-30 Thread Halsey, Stephen
Hi, I'm interested in getting a date ordered search on a very large index too, as we are having some scaling issues with the Sort object and its regeneration, and so was interested in your question and the answers above. Aviran mentioned using a boost in the query to get a rough sort on dates

Re: Lucene search optimization

2006-05-30 Thread Chris Hostetter
: Fuzzy searching against this property takes around 3 seconds, which is : way too much for what I plan to do, so I am considering the possible whenever anyone has a question about how to speed up a search, and the current amount of time the search takes is more than a second, there are a few que

Re: Lucene search optimization

2006-05-30 Thread Sami Dalouche
Hi, I didn't want to bother you with the exact details of my document, but since you're asking.. :-) So, I have the list of all world cities, and would like to let the users search for their city, allowing them to make small mistakes. Additionally, since cities sometimes have different names, spel

Document design, analyzer questions?

2006-05-30 Thread Michael J. Prichard
Hello, I am working on a system that will index emails and their attachments. I have all the pieces working that parse the documents and I am now working on the actual indexing part. I would like to have synonym searching as well. Question is two fold. One, here is the layout I was thinki

Re: Can we search data stored in oracle database ??

2006-05-30 Thread Alexey Sorokin
http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-109358021acbfc89456e446740dc2bbf9049950f 2006/5/30, Amaresh Kumar Yadav <[EMAIL PROTECTED]>: Hi All, Can we search data which is stored in oracle database, by lucene search engine anybody has idea ?? Infact i have s

Re: Enforcing Primary key uniqueness in lucene index

2006-05-30 Thread Karel Tejnora
You can use jdbm.sf.net to hold the your_id to lucene_id relation in a transactional hashtable on disk. Also, as Yonik will say, Solr at incubator.apache.org/solr has this constraint check implemented.
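
A minimal sketch of the key-to-document mapping idea from the reply above. This is not jdbm or Solr code: an in-memory `HashMap` stands in for the on-disk transactional hashtable, and the class name `PrimaryKeyMap` and its methods are hypothetical names chosen for illustration. The point is only that a key lookup can reject a duplicate before anything is added to the index, without opening a new searcher.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks which primary keys have already been indexed.
// An in-memory map is an assumption made to keep the sketch self-contained;
// the thread suggests a persistent, transactional store instead.
final class PrimaryKeyMap {
    private final Map<String, Integer> keyToDocId = new HashMap<>();
    private int nextDocId = 0;

    // Registers the key and returns its new document id,
    // or returns -1 if the key is already present (duplicate rejected).
    int addIfAbsent(String primaryKey) {
        if (keyToDocId.containsKey(primaryKey)) {
            return -1;
        }
        int docId = nextDocId++;
        keyToDocId.put(primaryKey, docId);
        return docId;
    }

    // Returns the document id recorded for the key, or null if unknown.
    Integer docIdFor(String primaryKey) {
        return keyToDocId.get(primaryKey);
    }
}
```

In use, the indexing code would call `addIfAbsent(key)` first and only add the document when the result is not -1. Keeping such a map consistent with the index across crashes is the part a transactional store is meant to cover, and this sketch does not attempt it.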

Enforcing Primary key uniqueness in lucene index

2006-05-30 Thread Prasenjit Mukherjee
I want to enforce the concept of a unique primary key in a Lucene index by having a field whose values have to be unique across all Lucene documents. One way is to do a search just before indexing, but that seems to consume a lot of time, as you have to create a new IndexSearcher every time you want to

Can we search data stored in oracle database ??

2006-05-30 Thread Amaresh Kumar Yadav
Hi all, can we search data stored in an Oracle database with the Lucene search engine? Does anybody have an idea? In fact, I have some data stored in a table in an Oracle database; at present I search the data through a query. Now I want to optimize it with the Lucene search engine.

RE: return document name as null Please help

2006-05-30 Thread Amaresh Kumar Yadav
Thanks Pasha, I got success. Regards.. Amaresh -Original Message- From: Pasha Bizhan [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 30, 2006 6:47 PM To: java-user@lucene.apache.org Subject: RE: return document name as null Please help Hi, > From: Amaresh Kumar Yadav [mailto:[EMAIL PRO

Re: Lucene search optimization

2006-05-30 Thread mark harwood
Take a look at "FuzzyLikeThisQuery" in contrib\queries. I use it for name searches on large indexes. Unlike FuzzyQuery it: a) limits the number of query terms produced b) provides better ranking (disables idf factor which otherwise boosts rare misspellings) The cost of running a query is strongl

Re: Maybe a bug of lucene 1.9

2006-05-30 Thread Erik Hatcher
On May 29, 2006, at 6:34 AM, hu andy wrote: I indexed a collection of Chinese documents. I use a special segmentation api to do the analysis, because the segmentation of Chinese is different from English. I'll second Otis' request about the special segmentation api. If it is open source

Re: Lucene search optimization

2006-05-30 Thread Erik Hatcher
Sami, You're on to the right approach seeking something other than FuzzyQuery. FuzzyQuery is rarely generally useful and there are other ways to achieve the same sort of thing (soundex, metaphone) in an efficient manner. If you could share some details about these properties and how you
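
A minimal sketch of the soundex alternative mentioned above, written as plain Java with no Lucene dependency. It implements the standard American Soundex code. The class name `Soundex` and the idea of indexing the code as an extra exact-match field are assumptions for illustration, not something the thread spells out.

```java
// Illustrative American Soundex encoder (class and method names are hypothetical).
final class Soundex {
    // Digit for each letter A..Z; '0' marks letters that are not coded.
    private static final String CODES = "01230120022455012623010202";

    static String encode(String name) {
        StringBuilder out = new StringBuilder();
        char lastCode = '0';
        for (int i = 0; i < name.length() && out.length() < 4; i++) {
            char c = Character.toUpperCase(name.charAt(i));
            if (c < 'A' || c > 'Z') {
                continue; // ignore anything that is not a basic Latin letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);       // the first letter is kept as-is
                lastCode = code;
            } else if (c == 'H' || c == 'W') {
                // H and W are skipped and do not separate equal codes
            } else if (code == '0') {
                lastCode = '0';      // a vowel separates equal codes
            } else {
                if (code != lastCode) {
                    out.append(code);
                }
                lastCode = code;
            }
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0');         // pad to the fixed length of four
        }
        return out.toString();
    }
}
```

Names that sound alike collapse to the same key, so `Soundex.encode("Robert")` and `Soundex.encode("Rupert")` both give `R163`. Storing that key in its own field at index time would turn a fuzzy lookup into an exact term match, at the cost of coarser matching than edit distance.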

Lucene search optimization

2006-05-30 Thread Sami Dalouche
Hi, I have 2 million documents, with a name property. (~15 to 20 characters). Fuzzy searching against this property takes around 3 seconds, which is way too much for what I plan to do, so I am considering the possible optimizations. I can add a property to each of the documents, that could partiti

Re: Maybe a bug of lucene 1.9

2006-05-30 Thread Otis Gospodnetic
Andy, Since you said things work with Lucene 2.0, but not 1.9, I imagine this is due to some bug in 1.9 that was fixed for 2.0. Which one, I don't know. Since 2.0 is out now, this may no longer be an issue for you. Your tool for picking terms from the .tis file sounds interesting. If you wan

Re: fastest way to get raw hit count

2006-05-30 Thread Erick Erickson
How did you store the data when you indexed it? If you didn't store it, it won't be available for retrieval (although it will be available for searching if you indexed it with, say Field.Index.TOKENIZED). See Field.Store.YES. These are for the Document.add call. Best Erick P.S. It would be bett
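
A toy model of the stored-versus-indexed distinction described above. This is plain Java and not the Lucene API: `ToyIndex` and its methods are hypothetical names, and the whitespace tokenizer is a simplification. It only shows why a field can match a search and still come back as null when it was indexed but not stored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: every field is searchable, only chosen fields are stored.
final class ToyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<Map<String, String>> stored = new ArrayList<>();

    // Indexes all fields for search; keeps original values only for storedFields.
    int addDocument(Map<String, String> fields, Set<String> storedFields) {
        int docId = stored.size();
        Map<String, String> kept = new HashMap<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            for (String token : e.getValue().toLowerCase().split("\\s+")) {
                if (!token.isEmpty()) {
                    postings.computeIfAbsent(e.getKey() + ":" + token,
                            k -> new TreeSet<>()).add(docId);
                }
            }
            if (storedFields.contains(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        stored.add(kept);
        return docId;
    }

    // Returns ids of documents whose field contains the term.
    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term.toLowerCase(),
                Collections.emptySet());
    }

    // Returns the stored value, or null if the field was not stored.
    String get(int docId, String field) {
        return stored.get(docId).get(field);
    }
}
```

A document added with only `title` in the stored set can still be found by a term in `body`, but `get(docId, "body")` returns null, which matches the symptom reported in the thread.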

Re: merge factor and real time indexing

2006-05-30 Thread Otis Gospodnetic
John, I can't spot anything wrong in the code. I assume you are certain there are no IOException, no issues with locks, and that your writer is really getting close()d. I haven't used IndexModifier, and since that's a layer on top of the real IndexWriter with the same API exposed, I would just

RE: return document name as null Please help

2006-05-30 Thread Pasha Bizhan
Hi, > From: Amaresh Kumar Yadav [mailto:[EMAIL PROTECTED] > > I am using attached source code for indexing on Windows platform. > after compiling and making war file, build folder is created I guess you use the Lucene demo for files, not for HTML files. In this case documents don't contain the

merge factor and real time indexing

2006-05-30 Thread John Wang
Hi folks: I am working on an application that requires real time indexing, i.e. for every insert, I open the writer, add a document and then close the writer. I want to control the number of files created, and according to the documentation, a small mergeFactor is desired. However, I am e
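
A rough simulation of why a smaller mergeFactor tends to mean fewer segments, and so fewer files, on disk. This is a deliberately simplified model and not Lucene's actual merge code: it assumes every added document is flushed as its own one-document segment, and that segments are merged whenever the last `mergeFactor` segments have the same size. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified merge model (assumption: one flushed segment per added document).
final class MergeSimulation {
    // Returns the segment sizes left after adding docCount documents.
    static List<Integer> segmentsAfter(int docCount, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> segments = new ArrayList<>();
        for (int d = 0; d < docCount; d++) {
            segments.add(1);
            // Cascade: keep merging while the tail holds mergeFactor equal segments.
            while (tailIsMergeable(segments, mergeFactor)) {
                int size = segments.get(segments.size() - 1);
                for (int i = 0; i < mergeFactor; i++) {
                    segments.remove(segments.size() - 1);
                }
                segments.add(size * mergeFactor);
            }
        }
        return segments;
    }

    private static boolean tailIsMergeable(List<Integer> segments, int mergeFactor) {
        if (segments.size() < mergeFactor) {
            return false;
        }
        int size = segments.get(segments.size() - 1);
        for (int i = segments.size() - mergeFactor; i < segments.size(); i++) {
            if (segments.get(i) != size) {
                return false;
            }
        }
        return true;
    }
}
```

Under this model, 25 documents leave 7 segments with a mergeFactor of 10 but only 3 with a mergeFactor of 2. The trade-off is that the smaller value merges far more often, which is extra work on every add.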

RE: return document name as null Please help

2006-05-30 Thread Pasha Bizhan
Hi, > From: Amaresh Kumar Yadav [mailto:[EMAIL PROTECTED] > > do i need specify Store.YES on comond prompt ??? > > How, please make it more clear ? Your source code for indexing contains something like this: - doc.add(new Field(filedName,

RE: return document name as null Please help

2006-05-30 Thread Amaresh Kumar Yadav
Thanks Pasha, do I need to specify Store.YES on the command prompt? How? Please make it more clear. Regards, Amaresh -Original Message- From: Pasha Bizhan [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 30, 2006 4:18 PM To: java-user@lucene.apache.org Subject: RE: return document name a

RE: return document name as null Please help

2006-05-30 Thread Pasha Bizhan
Hi, > From: Amaresh Kumar Yadav [mailto:[EMAIL PROTECTED] > > do we need some setting in any jsp or other file for document ??? You need to specify the Stored (Store.YES) attribute for a field during indexing. Pasha Bizhan

RE: return document name as null Please help

2006-05-30 Thread Amaresh Kumar Yadav
do we need some setting in any jsp or other file for document ??? Regards.. Amaresh -Original Message- From: Pasha Bizhan [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 30, 2006 2:39 PM To: java-user@lucene.apache.org Subject: RE: return document name as null Please help Hi, > From: Am

Re: Maybe a bug of lucene 1.9

2006-05-30 Thread hu andy
2006/5/29, hu andy <[EMAIL PROTECTED]>: I indexed a collection of Chinese documents. I use a special segmentation api to do the analysis, because the segmentation of Chinese is different from English. A strange thing happened. With lucene 1.4 or lucene 2.0, it will be all right to retrieve

RE: return document name as null Please help

2006-05-30 Thread Pasha Bizhan
Hi, > From: Amaresh Kumar Yadav [mailto:[EMAIL PROTECTED] >when i search for a particular text by lucene search > engine. I get correct number of document for that word but > its document name and summary is return as null. Are your document's name and summary Stored (Store.YES)?

return document name as null Please help

2006-05-30 Thread Amaresh Kumar Yadav
Hi all, when I search for a particular text with the Lucene search engine, I get the correct number of documents for that word, but the document name and summary are returned as null. The displayed message is: Document Summary null null null null null null null null if someone has any idea about this

RE: fastest way to get raw hit count

2006-05-30 Thread Amaresh Kumar Yadav
Hi all, when I search for a particular text with the Lucene search engine, I get the correct number of documents for that word, but the document name and summary are returned as null. The displayed message is: Document Summary null null null null null null null null if someone has any idea about this