Re: Does Lucene search over memory too?

2007-05-28 Thread Doron Cohen
Antony Bowesman <[EMAIL PROTECTED]> wrote on 28/05/2007 22:48:41: > I read the new IndexWriter Javadoc and I'm unclear about this autocommit. In 2.1, I thought an IndexReader opened in an IndexSearcher does not "see" additions to an index made by an IndexWriter, i.e. maxDoc and numDocs re

Re: Does Lucene search over memory too?

2007-05-28 Thread Antony Bowesman
Michael McCandless wrote: The "autoCommit" mode for IndexWriter has not actually been released yet: you can only use it on the trunk. It actually serves a different purpose: it allows you to make sure your searchers do not see any changes made by the writer (even the ones that have been flushed)

Re: WhitespaceAnalyzer [was: Re: regarding Reader.terms()]

2007-05-28 Thread Mohammad Norouzi
Hi Chris, * It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F'). * It is '\u0009', HORIZONTAL TABULATION. * It is '\u000A', LINE FEED. * It is '\u000B', VERTICAL
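The list quoted in this message matches the Javadoc for java.lang.Character.isWhitespace. A minimal sketch in plain Java, assuming a whitespace tokenizer splits on exactly the characters that method accepts; the class and method names (WhitespaceSplitSketch, split) are hypothetical and are not Lucene API:

```java
import java.util.ArrayList;
import java.util.List;

class WhitespaceSplitSketch {
    // Hypothetical helper: splits text at every char for which
    // Character.isWhitespace returns true, dropping empty tokens.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(Character.isWhitespace(' '));      // true
        System.out.println(Character.isWhitespace('\t'));     // true
        System.out.println(Character.isWhitespace('\u00A0')); // false: non-breaking space
        System.out.println(Character.isWhitespace('\u2007')); // false: figure space
        // The non-breaking space does not split "c" from "d", so this yields three tokens.
        System.out.println(split("a b\tc\u00A0d").size());
    }
}
```

Under this assumption, text separated only by a non-breaking space stays in a single token, which can look like a tokenizer bug when the input comes from HTML or word processors.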

Re: maxDoc and arrays

2007-05-28 Thread Carlos Pita
Hi again, On 5/24/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: Currently, a deleted doc is removed when the segment containing it is involved in a segment merge. A merge could be triggered on any addDocument(), making it difficult to incrementally update anything. sorry but is the document

Re: Lucene javadoc not up-to-date?

2007-05-28 Thread Chris Hostetter
: For instance, the SegmentTermDocs class implements the TermDocs interface. : However, there is no information about this SegmentTermDocs class in the : javadoc. SegmentTermDocs is a package-private class, so its javadocs are not exposed (it cannot be directly used by clients, so there is no r

Lucene javadoc not up-to-date?

2007-05-28 Thread Tao Cheng
I've encountered a few discrepancies between the javadoc of Lucene and the source code. I use: http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/ as the most up-to-date javadoc reference. For instance, the SegmentTermDocs class implements the TermDocs interface. However, there is

Does Lucene search over memory too?

2007-05-28 Thread senthil kumaran
Hi, Does Lucene search the FSDirectory as well as the buffered in-memory docs when we call searcher.search(query)? I ask because I've indexed my docs with mergeFactor and maxBufferedDocs = 50, and I optimize and close the index only at midnight. Before optimization, my search gives partial

overdogg.com - Start Your Own Competition!

2007-05-28 Thread Sherman Monroe
Hi Friends, I'd like to announce the Beta release of my new service, overdogg.com - The Buyer's Marketplace: "overdogg is an online community centered around user-sponsored competitions for services. overdogg pioneers the online buyer's market by developing a Web-based community in which buyers

RE: Does Lucene search over memory too?

2007-05-28 Thread Michael McCandless
Just to clarify: the answer to the original question is "no". The searchers only see what's been flushed to the index directory, so they will not search docs still buffered in the writer's RAM. Calling writer.flush() and then recreating your searchers should work. The "autoCommit" mode f
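The visibility rule described in this reply can be illustrated with a toy model in plain Java. This is not Lucene code: Store, Writer and Searcher are hypothetical stand-ins for the index directory, IndexWriter and IndexSearcher, and the sketch assumes a searcher works from a snapshot taken when it is opened.

```java
import java.util.ArrayList;
import java.util.List;

class FlushVisibilitySketch {
    // Hypothetical stand-in for the index directory: holds flushed docs only.
    static final class Store {
        private final List<String> flushed = new ArrayList<>();
        synchronized List<String> snapshot() { return new ArrayList<>(flushed); }
        synchronized void append(List<String> docs) { flushed.addAll(docs); }
    }

    // Hypothetical writer: buffers docs in memory until flush() is called.
    static final class Writer {
        private final Store store;
        private final List<String> buffer = new ArrayList<>();
        Writer(Store store) { this.store = store; }
        void addDocument(String doc) { buffer.add(doc); }
        void flush() { store.append(buffer); buffer.clear(); }
    }

    // Hypothetical searcher: sees only what was flushed when it was opened.
    static final class Searcher {
        private final List<String> view;
        Searcher(Store store) { this.view = store.snapshot(); }
        int count() { return view.size(); }
    }

    // Returns {docs seen by a searcher opened before flush, docs seen after reopening}.
    static int[] demo() {
        Store store = new Store();
        Writer writer = new Writer(store);
        writer.addDocument("doc-1");
        writer.addDocument("doc-2");
        Searcher beforeFlush = new Searcher(store); // buffered docs are invisible
        writer.flush();
        Searcher reopened = new Searcher(store);    // recreated after the flush
        return new int[] { beforeFlush.count(), reopened.count() };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("before flush: " + counts[0]); // 0
        System.out.println("after reopen: " + counts[1]); // 2
    }
}
```

In this model the searcher opened before the flush keeps reporting 0 documents even after the flush, which mirrors the advice to recreate searchers rather than reuse the old ones.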

RE: Does Lucene search over memory too?

2007-05-28 Thread Ard Schrijvers
Hello, I think you can find your answer in the IndexWriter API: http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/index/IndexWriter.html The optional autoCommit argument to the constructors controls visibility of the changes to IndexReader instances reading

Re: synchronize hits variable?

2007-05-28 Thread Mark Miller
It turns out Hits will only cache 200 documents at most. It would be nice to be able to set that (I'd like to set it to 0). It would also be nice if you could control the number that triggers a re-search. I have made my own Hits class to drop both of those features, but it would be cool if t

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Mark Miller
FrenchAnalyzer has a stemmer built in. You are seeing the result of that stemmer in action. If you do not want stemming, you should take a look at the code for FrenchAnalyzer and copy it to make your own...just remove the FrenchStemFilter. - Mark Jolinar13 wrote: Finally, I use the st

Re: synchronize hits variable?

2007-05-28 Thread Mark Miller
You do not want to be using Hits. Frankly, the way pagination should normally be done, Hits caching means almost nobody really wants to be using Hits, but in your case it's even worse. Look into a Hit collector -- not only are you caching every single document in your search results, but you ar

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Jolinar13
In the end, I use the standard analyzer with some custom stop words: le,la,les,l',un,une,des,d',à,au,de,et,en,dans,se,sont,qui,a,est,il,pour,que,du,sa,par,mais,sur,avec,aux,ce,d,s,l,ou,pas,ses Thanks anyway Florian Jolinar13 wrote: > > It looks like it removes the final letter if the word ends wit

RE: synchronize hits variable?

2007-05-28 Thread John Powers
Thanks for the response. It's definitely the user search object's search(). I have to iterate through all the hits that come back to get all the categories used in the results, so the number that Hits gets really doesn't matter; I'll need them all. -----Original Message----- From: Otis Gospo

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Jolinar13
It looks like it removes the final letter if the word ends with an 'a', 'e' or 'i'. Femelles => all:femel Is this expected? How should I use FrenchAnalyzer? Thanks Florian Jolinar13 wrote: > > Some terms I tested : > vehicle => all:vehicl > vehiCle => all:vehicle > Vehicle => all:vehicl > VeHicle => a

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Jolinar13
Some terms I tested: vehicle => all:vehicl; vehiCle => all:vehicle; Vehicle => all:vehicl; VeHicle => all:vehicle; VEHICLE => all:vehicle; vehicles => all:vehicl; paris => all:par :S Jolinar13 wrote: > > Thanks to Luke, I realized my terms were not parsed correctly, and this > has nothing to do with

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Jolinar13
Thanks to Luke, I realized my terms were not parsed correctly, and this has nothing to do with upper case! It seems to happen when the word ends with "ni". For example, "giovanni" is parsed as "giovann". Any idea about this? Florian Jolinar13 wrote: > > Hello Mark! > Thank you a lot for your answe

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Jolinar13
Hello Mark! Thanks a lot for your answer. You are right about the Luke part. My Luke version was too old. My bad. But with Luke I still observe the problem I described. Any idea how to sort this out? Thank you Florian I got strange search results on strings in uppercase. (example : VEH

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Mark Miller
FrenchAnalyzer does lowercase, and using it would not in any way alter Luke's ability to read your index. - Mark Jolinar13 wrote: Hello Erick, Still no idea about my problem? Anybody here using the FrenchAnalyzer? Thanks, Florian Jolinar13 wrote: Hello, Thank you for your quick answer. I us

Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters

2007-05-28 Thread Jolinar13
Hello Erick, Still no idea about my problem? Anybody here using the FrenchAnalyzer? Thanks, Florian Jolinar13 wrote: > > Hello, > Thank you for your quick answer. > I use Luke to examine the index, but since I switched to FrenchAnalyzer, > it says 'Not a Lucene index'. > If I open the index fil

Does Lucene search over memory too?

2007-05-28 Thread SK R
Hi, Does Lucene search the FSDirectory as well as the buffered in-memory docs when we call searcher.search(query)? I ask because I've indexed my docs with mergeFactor and maxBufferedDocs = 50, and I optimize and close the index only at midnight. Before optimization, my search gives partial