Re: escaping special characters

2008-08-06 Thread Aravind . Yarram
Can I escape built-in Lucene keywords like OR and AND as well? Regards, Aravind R Yarram Chris Hostetter <[EMAIL PROTECTED]> 08/06/2008 07:05 PM Please respond to java-user@lucene.apache.org To java-user@lucene.apache.org cc Subject Re: escaping special characters : String escapedKe

Re: bad index by batch indexing

2008-08-06 Thread yanyanzeng
Hi, thank you very much for your reply. I am using the latest version, Lucene 2.3.2. I will try using two arguments and post my result later. yanyan Anshum-2 wrote: > > This really seems like an issue with the batching mechanism (one of those > errors > which seem trivial on discovery :) ). I

Re: LineDocMaker usage

2008-08-06 Thread Anshum
Hi, How about just opening a file and parsing through it, doing a doc.add on each newline? That should be pretty straightforward. I'm just writing the snippet here, though it might have issues as I didn't try to compile it. IndexWriter writer = new IndexWriter(indexDir, new Standard
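
The line-per-document idea can be sketched in plain Java as below. This is an illustration only and is not the truncated snippet from the message: the class name LinePerDocSketch, the "body" field name, and the use of a Map as a stand-in for a Lucene Document are all assumptions made so the example runs without Lucene on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LinePerDocSketch {
    // Hypothetical stand-in for building one Lucene Document per line:
    // each "document" here is just a map with a single "body" field.
    static List<Map<String, String>> readDocs(Reader in) {
        List<Map<String, String>> docs = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines instead of indexing empty documents
                }
                Map<String, String> doc = new LinkedHashMap<>();
                doc.put("body", line);
                // With Lucene, this is roughly where writer.addDocument(doc) would go.
                docs.add(doc);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs =
                readDocs(new StringReader("first line\n\nsecond line\n"));
        System.out.println(docs.size() + " documents");
    }
}
```

In a real indexer the Reader would wrap the input file, and the map would be replaced by a Document with whatever fields the application needs.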

Re: bad index by batch indexing

2008-08-06 Thread Mark Miller
Only one Writer should be active on an index at any given time. You don't want to batch unless you need to see the docs in the index as you build it; it's slower overall. You can add from multiple threads at the same time, but use the same Writer. Sent from my iPhone On Aug 6, 2008, at 8
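
A minimal sketch of the "many threads, one Writer" pattern follows. FakeWriter is a hypothetical stand-in, not a Lucene class, and its synchronized methods are an assumption of this sketch rather than a description of how IndexWriter is implemented. The point is only the shape: every thread shares the single writer instance instead of opening its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedWriterSketch {
    // Hypothetical stand-in for an index writer; not part of Lucene.
    static class FakeWriter {
        private final List<String> docs = new ArrayList<>();

        synchronized void addDocument(String doc) {
            docs.add(doc);
        }

        synchronized int docCount() {
            return docs.size();
        }
    }

    // Adds docsPerThread documents from each of `threads` workers,
    // all going through the same writer, and returns the final count.
    static int indexConcurrently(int threads, int docsPerThread) {
        FakeWriter writer = new FakeWriter(); // the only writer for this "index"
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    writer.addDocument("doc-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("indexing timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return writer.docCount();
    }

    public static void main(String[] args) {
        System.out.println(indexConcurrently(4, 25) + " documents added");
    }
}
```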

Re: bad index by batch indexing

2008-08-06 Thread Anshum
This really seems like an issue with the batching mechanism (one of those errors which seem trivial on discovery :) ). I work with batched indexing and it works absolutely fine on much larger volumes of data. You could try calling the IndexWriter without the 3rd argument and see if it helps.

bad index by batch indexing

2008-08-06 Thread yanyanzeng
Hi, I am building a search engine for text transcript documents from the database of an enterprise messaging system, and have designed a batch processing job to incrementally build the index, because the production database is huge, around 10 GB. Now I am still testing in DEV env

Re: Lucene Concurrency Issue

2008-08-06 Thread Mark Miller
Do a little research to learn the rules and you will figure out how to make those classes cooperate. You might start by looking at LUCENE-1026, which is a simple set of classes that allows for what you want. You can use it, use it to make your own, or even look at its father issue - the origina

Lucene Concurrency Issue

2008-08-06 Thread Alex Wang
Hi all, To allow multiple users to concurrently add and delete docs and at the same time search the same index, what should I watch out for in terms of initializing IndexReader, IndexWriter and IndexSearcher? My application is getting various IOExceptions (seek failed, permission denied, etc.) when con

Re: escaping special characters

2008-08-06 Thread Chris Hostetter
: String escapedKeywords = QueryParser.escape(keywords); : Query query = new QueryParser("content", new : StandardAnalyzer()).parse(escapedKeywords); : : this works with most of the special characters like * and ~ except \ . I : can't do a search for a keyword like "ho\w" and get results. : am I
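
For readers unfamiliar with what escaping does here, the following is a hand-rolled sketch of backslash-escaping query syntax characters. It is not Lucene's QueryParser.escape and does not reproduce its exact behavior in any given version: the class name, method name, and the SPECIAL character list are assumptions for illustration. It also does not address operator keywords such as OR and AND.

```java
public class QueryEscapeSketch {
    // Assumed list of characters with special meaning in the query syntax.
    // The backslash itself is included so a literal backslash gets doubled.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefixes every special character with a backslash.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Java literal "ho\\w" is the four-character keyword ho\w.
        System.out.println(escape("ho\\w"));
        System.out.println(escape("wild*card~"));
    }
}
```

Note that escaping only gets a character past the query parser. Whether the term then matches still depends on how the analyzer tokenized the text at index time.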

Re: LineDocMaker usage

2008-08-06 Thread Michael McCandless
The format is: title date body But this is normally only used to create documents as part of an algorithm that you run under contrib/benchmark. Mike On Aug 6, 2008, at 4:12 PM, Brittany Jacobs wrote: Hello, I am new to all this. I need to read in a text file and have each line

Re: How to get unique values for field1 where search is on field2?

2008-08-06 Thread Chris Hostetter
What you are describing sounds an awful lot like you want to "facet" on the Author field ... if you search the mailing list archives you'll find quite a bit of discussion on various approaches to this problem. -Hoss

Re: Strict Ordering of Boosted results?

2008-08-06 Thread Chris Hostetter
: word: termA^10.0 word: termB^2.0 : : I want ALL termA results (ordered by score) to come before ANY termB : results (also ordered by score). Is there a way to do this in the query : syntax? Or is this simple multiple queries? the syntax lets you specify the query boost on each clause, but th

Re: Document path in lucene index

2008-08-06 Thread Chris Hostetter
: I can print the index terms but I don't know if there is any possibility to : print the corresponding paths; I can just print the docid, but I need to print : the paths as is possible in the searcher (query). If you index or store the path in a field, then you can get it back out -- for the type

Re: Re-Search Hits

2008-08-06 Thread Chris Hostetter
: For performance reasons I cache Hits for a certain Query in memcached, for : things like pagination etc. My question right now is whether it is possible to : re-search such a cached Hits set. That would be great for features like : live search and so on. Does Lucene support that? You can hang on t

CustomScoreQuery and BooleanQuery

2008-08-06 Thread Jay
Hi, The new addition of the class CustomScoreQuery is very useful and powerful in customizing the score of one query using the indexed field values. Another feature that I am looking for in Lucene is the ability to combine the scores of multiple (sub)queries in a way different from the Boolean

LineDocMaker usage

2008-08-06 Thread Brittany Jacobs
Hello, I am new to all this. I need to read in a text file and have each line in the file be a document. The LineDocMaker seems to be intended for this purpose. But I can't figure out how to read the data into it. Any examples would be greatly appreciated.

Re: Urgent Help Please: "Resource temporarily unavailable"

2008-08-06 Thread Grant Ingersoll
On Aug 6, 2008, at 3:06 PM, Alex Wang wrote: Sorry about the double posting. After sending the first email I got a delivery failure notice from [EMAIL PROTECTED] I resent it just to be sure. Unfortunately there is no stack trace in the log. The error object was passed to log4j.error(...),

RE: Urgent Help Please: "Resource temporarily unavailable"

2008-08-06 Thread Alex Wang
Sorry about the double posting. After sending the first email I got a delivery failure notice from [EMAIL PROTECTED] I resent it just to be sure. Unfortunately there is no stack trace in the log. The error object was passed to log4j.error(...), but no stack trace was printed out. I thought both I

Re: Urgent Help Please: "Resource temporarily unavailable"

2008-08-06 Thread Grant Ingersoll
What's the full exception? We don't even know that the exception is in Lucene from what you've described. So, w/o more info, it will be pretty hard to help, but if I had to guess, it sounds like you've got some threading problems, but who knows. Also, no need to send the exact same email

Urgent Help Please: "Resource temporarily unavailable"

2008-08-06 Thread Alex Wang
Hi Everyone, We have an application built using Lucene 1.9. The app allows incremental updating of the index while other users are searching the same index. Today, some searches suddenly return nothing when we know they should return some hits. This does not happen all the time. Sometimes the sea

Urgent Help Please: "Resource Tempararily Unavailable"

2008-08-06 Thread Alex Wang
Hi Everyone, We have an application built using Lucene 1.9. The app allows incremental updating of the index while other users are searching the same index. Today, some searches suddenly return nothing when we know they should return some hits. This does not happen all the time. Sometimes the sear

Re: failed to open an indexer after about 20 queries

2008-08-06 Thread Marcus Herou
And I'm sure you have a construct like: try { reader = open(); } finally { if (reader != null) reader.close(); } right? As Grant says, you should hold the reader open as long as possible, since it caches a lot of stuff. Look at a SearcherCache: http://dev.tailsweep.com/svn/core-lucene/trunk/s

Re: search with accent not match

2008-08-06 Thread Mark Miller
You certainly can - just create your own Analyzer starting with a copy of the French one you are using. Then you just plug in the filter in the order you want it applied: result = new ISOLatin1AccentFilter(result); You have to decide for yourself where it will come - if you put it before the

Re: search with accent not match

2008-08-06 Thread Christophe from paris
Actually, in my FrenchAnalyser I have: TokenStream result = new StandardTokenizer(reader); result = new StandardFilter(result); result = new StopFilter(result, stoptable); result = new FrenchStemFilter(result, excltable); result = new LowerCaseFilter(result); I can use ISOLati

Re: search with accent not match

2008-08-06 Thread Mark Miller
Check out org.apache.lucene.analysis.ISOLatin1AccentFilter It will strip diacritics - just be sure to use it at index time and query time to get what you want. Also, you will no longer be able to differentiate between the two in your searching (rarely that important in my opinion, but others c
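
The "same folding at index time and query time" point can be illustrated without Lucene. The sketch below uses java.text.Normalizer as a stand-in for the accent filter; it is not ISOLatin1AccentFilter and may differ from it on some characters. The class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

public class AccentFoldSketch {
    // Decomposes accented letters (NFD), drops the combining marks,
    // then lowercases. Stand-in for an accent-stripping token filter.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "\u00c9t\u00e9" is the word Ete written with two acute accents.
        String indexedTerm = fold("\u00c9t\u00e9"); // applied when indexing
        String queryTerm = fold("ete");             // applied when searching
        System.out.println(indexedTerm.equals(queryTerm));
    }
}
```

Because both sides pass through the same fold step, an accented term in the index and an unaccented query term end up identical. The trade-off mentioned above also shows here: after folding, the accented and unaccented spellings can no longer be told apart.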

search with accent not match

2008-08-06 Thread Christophe from paris
Hello, I use FrenchAnalyzer for indexing: IndexWriter writer = new IndexWriter(pathOfIndex, new FrenchAnalyzer(), true); Document doc = new Document(); doc.add(new Field("TXT_CHARACT_VALUE",word.toLowerCase(),Field.Store.YES,Field.Index.TOKENIZED)); writer.addDocument(doc); And search IndexReader re

Re: Per user data store

2008-08-06 Thread Karsten F.
Hi, I agree with the advice to use only one index, and I want to add two reasons: 1. Sorting and caching work with the Lucene document numbers. In Lucene, "warming up" means that a lot of int arrays and bitsets are stored in main memory. If you are using different MultiReader