Re: incremental update of index

2009-05-14 Thread komali
If you want to re-index content that was already indexed, just set the create flag to false. ChadDavis wrote: > > In the FAQ it says that you have to do a manual incremental update: > > How do I update a document or a set of documents that are already indexed? >> >> There is no direct update proc
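A hedged sketch of the delete-then-re-add update the FAQ describes, assuming a 2.4-era API and a unique "id" field; the names dir, analyzer, and updatedDoc are illustrative, not from the thread:

    // Opening with create=false appends to the existing index instead of wiping it.
    IndexWriter writer = new IndexWriter(dir, analyzer, /* create */ false,
                                         IndexWriter.MaxFieldLength.UNLIMITED);
    writer.updateDocument(new Term("id", "42"), updatedDoc); // deletes the old doc, adds the new one
    writer.close();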

Re: Getting an IndexReader from a committed IndexWriter

2009-05-14 Thread Jason Rutherglen
Hi Shay, I think IndexWriter.getReader from LUCENE-1516 in trunk is what you're talking about? It pools readers internally, so there's no need to call IndexReader.reopen; one simply calls IW.getReader to get new readers containing recent updates. -J BTW I replied to the message on java-u...@lucen
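A hedged sketch of the near-real-time pattern Jason refers to (IndexWriter.getReader, on trunk at the time); writer and doc are assumed to already exist:

    writer.addDocument(doc);
    IndexReader reader = writer.getReader();      // pooled reader; sees the add without a commit
    IndexSearcher searcher = new IndexSearcher(reader);
    // ... run searches against the up-to-date reader ...
    searcher.close();
    reader.close();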

Re: Issues with escaping special characters

2009-05-14 Thread Ari Miller
I buy your theory that StandardAnalyzer is breaking up the stream, and that this might be an indexing issue rather than a query issue. When I look at my index in Luke, as far as I can tell the literal (Parenth+eses is stored, not the broken-up tokens. Also, I can't seem to find an Analyzer that

Re: Issues with escaping special characters

2009-05-14 Thread Erick Erickson
I suspect that what's happening is that StandardAnalyzer is breaking your stream up on the "odd" characters. All escaping them on the query does is ensure that they're not interpreted by the parser as (in this case) the beginning of a group and a MUST operator. So, I claim it correctly feeds (Pare

Issues with escaping special characters

2009-05-14 Thread Ari Miller
Say I have a book title, literally: (Parenth+eses. How would I do a search to find exactly that book title, given the presence of the ( and + ? QueryParser.escape isn't working. I would expect to be able to search for (Parenth+eses [exact match] or (Parenth+e [partial match]. I can use QueryPars
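One hedged way to get the exact match asked about here: keep an untokenized copy of the title and look it up as a single term, bypassing the query-time analyzer entirely. The field name "title_exact" is illustrative; doc and searcher are assumed:

    doc.add(new Field("title_exact", "(Parenth+eses",
                      Field.Store.YES, Field.Index.NOT_ANALYZED));
    // at search time:
    Query exact = new TermQuery(new Term("title_exact", "(Parenth+eses"));
    Query partial = new PrefixQuery(new Term("title_exact", "(Parenth+e"));
    TopDocs hits = searcher.search(exact, 10);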

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
Yep, it was a problem with the Maven artifact publishing. Not an actual problem... jayson.minard wrote: > > Does look like the publishing from Solr ANT went to the wrong place and wasn't > being picked up by Maven for the main build. > > Confirming, but this is most likely a non-issue. > > --j > >

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
Does look like the publishing from Solr ANT went to the wrong place and wasn't being picked up by Maven for the main build. Confirming, but this is most likely a non-issue. --j jayson.minard wrote: > > clearing our maven repo and rebuilding to be sure. > > > Yonik Seeley-2 wrote: >> >> On Thu,

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
clearing our maven repo and rebuilding to be sure. Yonik Seeley-2 wrote: > > On Thu, May 14, 2009 at 2:01 PM, jayson.minard > wrote: >> >> When using the Solr trunk tip, get this error now when reading an index >> created by Solr with Lucene directly: >> >> java.lang.IndexOutOfBoundsException:

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
It is the reverse case: created with Solr, optimized, and then in the post-optimize listener the snapshot is opened with Lucene. Code used is this: Directory directory = new FSDirectory(new File(luceneIndexDir), null); reader = IndexReader.open(directory, true); // open directory read-only ... reader

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread Yonik Seeley
On Thu, May 14, 2009 at 2:16 PM, Yonik Seeley wrote: > Hmmm, OK... so you created the index with Lucene and are reading it with Solr. > What version of Lucene did you use to create? Oops, vice versa, right? -Yonik - To unsubscr

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
This is with an optimized index since the code runs just after optimization. jayson.minard wrote: > > When using the Solr trunk tip, get this error now when reading an index > created by Solr with Lucene directly: > > java.lang.IndexOutOfBoundsException: Index: 24, Size: 0 > at java.util

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread Yonik Seeley
On Thu, May 14, 2009 at 2:01 PM, jayson.minard wrote: > > When using the Solr trunk tip, get this error now when reading an index > created by Solr with Lucene directly: > > java.lang.IndexOutOfBoundsException: Index: 24, Size: 0 >        at java.util.ArrayList.RangeCheck(ArrayList.java:547) >    

Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
When using the Solr trunk tip, we now get this error when reading an index created by Solr with Lucene directly: java.lang.IndexOutOfBoundsException: Index: 24, Size: 0 at java.util.ArrayList.RangeCheck(ArrayList.java:547) at java.util.ArrayList.get(ArrayList.java:322) at org

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread Robert Muir
I would say in general, yes. When I say 'change Arabic text', I mean the Arabic analyzer will standardize and stem Arabic words, but it won't modify any of your English words. And no, there is no case in Arabic. This is why, if you are handling mixed Arabic/English text, I recommend creating a cust

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread weidong sun
For exactly the same reason, our assumption is that a user's profile information mixes only that particular language and English. So we'll have to use the analyzer for that particular language to do the indexing and searching. And that's the reason I asked this analyzer question. :-)

RE: Question wrt Lucene analyzer for different language

2009-05-14 Thread Uwe Schindler
> Thanks for the quick answer. :-) > > So can I say, for ArabicAnalyzer, generally it can tokenize the mixed > content with Arabic and English? :-) > > I am not really familiar with the Arabic language. What do you mean by > "change > Arabic tokens"? Does Arabic have something like upper/lower case

RE: Question wrt Lucene analyzer for different language

2009-05-14 Thread Uwe Schindler
There are two problems: a) Currently there is no such analyzer (I have the problem, too; I would also like to autodetect the language from a text like M$ Word does and switch the analyzers). b) If such an autodetect analyzer exists, you will have a problem on the searching side, because you should

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread weidong sun
Thanks for the quick answer. :-) So can I say, for ArabicAnalyzer, generally it can tokenize the mixed content with Arabic and English? :-) I am not really familiar with the Arabic language. What do you mean by "change Arabic tokens"? Does Arabic have something like upper/lower case as English does?

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread weidong sun
Thanks for the surprisingly quick response. :-) What I mean by "correctly" here is that the specific analyzer can tokenize text mixed with English and that specific language, for example, "12345 " or "Text???" (where '?' is a character of that specific language and "12345" and "Text" are English

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread Robert Muir
In the case of ArabicAnalyzer, it will only change Arabic tokens and will leave English words as-is (it will not convert them to lowercase or anything like that). So if you want to have good Arabic and English behavior, you would want to create a custom analyzer that looks like the Arabic analyzer but a
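A hedged sketch of the kind of custom analyzer Robert suggests: the Arabic filter chain plus a LowerCaseFilter so English tokens are lowercased as well. Class names follow the contrib Arabic analyzer of that era; treat this as an illustration rather than the exact code discussed.

    import java.io.Reader;
    import org.apache.lucene.analysis.*;
    import org.apache.lucene.analysis.ar.*;

    public class ArabicEnglishAnalyzer extends Analyzer {
      public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream ts = new ArabicLetterTokenizer(reader);
        ts = new LowerCaseFilter(ts);            // lowercases English; leaves Arabic untouched
        ts = new ArabicNormalizationFilter(ts);  // normalizes Arabic orthography
        ts = new ArabicStemFilter(ts);           // light Arabic stemming
        return ts;
      }
    }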

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread Erick Erickson
No. What is "correctly"? Are you stemming? In which case, using the same analyzer on different languages will not work. This topic has been discussed on the user list frequently, so if you search that archive (see: http://wiki.apache.org/lucene-java/MailingListArchives) you'd find a wealth of inf

Re: Getting a score of a specific document

2009-05-14 Thread Erick Erickson
Hmmm, come to think of it, if you pass the Filter to the search I *think* you don't get scores for that clause, but you may want to check it out... So I think you should consider implementing a HitCollector and collecting only the documents you care about. This is really very little extra work sin
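A hedged sketch of the HitCollector approach Erick suggests (2.4-era API), assuming searcher and query exist; the ids in wanted are placeholders for the internal Lucene doc ids of interest:

    final Set<Integer> wanted = new HashSet<Integer>(Arrays.asList(3, 17, 42));
    final Map<Integer, Float> scores = new HashMap<Integer, Float>();
    searcher.search(query, new HitCollector() {
      public void collect(int doc, float score) {
        if (wanted.contains(doc)) {
          scores.put(doc, score);   // keep scores only for the documents of interest
        }
      }
    });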

Question wrt Lucene analyzer for different language

2009-05-14 Thread weidong sun
Hello, I am a newbie in the Lucene world. I might ask some obvious questions which, unfortunately, I don't know the answers to. Please help me 'grow'. We have a project that intends to use the Lucene search engine to search some user info stored in our system. The user info might not be in English even though it will be sto

Re: Getting a score of a specific document

2009-05-14 Thread liat oren
Yes, I have a pre-defined list of documents that I care about. Then I can do the search on these, but it will take the statistics of the whole index, right? 2009/5/14 Erick Erickson > I don't know if I'm understanding what you want, but if you have a > pre-defined list of documents, couldn't y

Re: Help with phrase indexing

2009-05-14 Thread Asbjørn A. Fellinghaug
Seid Mohammed: > I need exactly this solution. > Can you please tell me how I could DO IT? > I am badly in need of it > > On Thu, May 14, 2009 at 5:58 AM, Ridzwan Aminuddin > > wrote: > > > >> Hi all, > >> > >> Is Lucene able to index phrases instead of individual terms? If it is, can > >> we also

Re: Getting a score of a specific document

2009-05-14 Thread Erick Erickson
I don't know if I'm understanding what you want, but if you have a pre-defined list of documents, couldn't you form a Filter? Then your results would only be the documents you care about. If this is irrelevant, perhaps you could explain a bit more about the problem you're trying to solve. Best Eri

Re: Help with phrase indexing

2009-05-14 Thread Seid Mohammed
I need exactly this solution. Can you please tell me how I could DO IT? I am badly in need of it. On 5/14/09, Anshum wrote: > If I'm interpreting your need correctly, you want to index untokenized > strings, is it? Even if you aren't looking for untokenized indexing, you > could always use/design
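A hedged sketch of the "untokenized" approach Anshum refers to: index the whole phrase as one term so it can be matched exactly. The field name and phrase are illustrative; a PhraseQuery over a normally tokenized field is the usual alternative when positional matching is enough.

    doc.add(new Field("phrase", "lucene in action",
                      Field.Store.YES, Field.Index.NOT_ANALYZED));
    // at search time the phrase is looked up as a single term:
    Query q = new TermQuery(new Term("phrase", "lucene in action"));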

Re: analysis filter wrapper

2009-05-14 Thread Joel Halbert
You can use your Analyzer to get a token stream from any text you give it, just like Lucene does. Something like: String text = "your list of words to analyze and tokenize"; TokenStream ts = YOUR_ANALYZER.tokenStream(null, new StringReader(text)); Token token = new Token(); while((ts.next(tok
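A complete, hedged version of the loop Joel sketches, using the pre-2.9 reusable-Token API; the analyzer is whatever your pipeline already uses:

    String text = "your list of words to analyze and tokenize";
    TokenStream ts = analyzer.tokenStream(null, new StringReader(text));
    final Token reusable = new Token();
    Token token;
    while ((token = ts.next(reusable)) != null) {
      System.out.println(token.term());   // the stemmed/lowercased term text
    }
    ts.close();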

analysis filter wrapper

2009-05-14 Thread Marek Rei
Hi, I'm rather new to Lucene and could use some help. My Analyzer uses a set of filters (PorterStemFilter, LowerCaseFilter, WhitespaceTokenizer). I need to replicate the effect of these filters outside of the normal Lucene pipeline. Basically I would like to input a String from one end and get a

Getting a score of a specific document

2009-05-14 Thread liat oren
Hi, I have a big index, and for a specific search I want to get only the scores of a list of documents. Is there a better way to get these scores than looping over the whole result set? Thanks, Liat

Re: IndexWriter stopped before commit

2009-05-14 Thread liat oren
Thanks Mike, I will do that. 2009/5/13 Michael McCandless > Unfortunately, no. > > If the JRE crashes/exits without IndexWriter.commit (or close) being > called, then the index will reflect none of the changes during that > session. > > There will be partial files in there (that's why you see so
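A hedged illustration of Mike's point: changes become durable only once commit() (or close()) returns, so anything added after the last successful commit is lost if the JVM dies. writer, doc, and anotherDoc are assumed:

    writer.addDocument(doc);
    writer.commit();                 // everything up to here survives a crash
    writer.addDocument(anotherDoc);
    // a crash at this point loses anotherDoc, but nothing committed above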

Re: Boosting query - debuging

2009-05-14 Thread liat oren
No. As I wrote above: for finlin, 6621468 * 6 and 5265266 * 12 (I use payloads for this), and for TTD, 6621468 * 3 (I use payloads for this). I search for 6621468 * 3, and finlin gets a higher score. 2009/5/13 Grant Ingersoll > > On May 13, 2009, at 3:04 AM, liat oren wrote: > > Thanks a lot, Grant