Backwards compatibility issues

2019-10-08 Thread Jamie
doc values, is there an API configuration option to force Lucene 5.5 to use the old way of sorting when using indexes created by Lucene 4? How does one resolve this? Much appreciate Jamie - To unsubscribe, e-mail: java-use

upgrading lucene 4 to 6

2016-04-25 Thread Jamie
Hi Uwe Looking at the magnitude of the API changes from Lucene 4 to 6, I don't think we'll ever be able to upgrade. It seems the API has been modified to a large extent. Is there any chance that the bug fixes will be backported to the 4.0 branch? Jamie

Re: Pagination using searchAfter

2015-09-04 Thread jamie
Ganesh I would advise consulting the Lucene user group archives. I asked a similar question a while back, and it was addressed. Cheers Jamie On 2015/09/04 4:56 PM, Ganesh wrote: Hi I want to use the searchAfter API in IndexSearcher. This API takes ScoreDoc as argument. Do we need to store the

Re: Lucene TermsFilter lookup slow

2015-08-18 Thread jamie
have very little impact on performance. It is taking around 25 seconds to look up documents associated with murmur hash string IDs on an index of 10M records. Thanks in advance Jamie On 2015/08/10 2:46 PM, Michael McCandless wrote: OK, indeed, that version has the changes I was t

Filtered docs and positions enum

2015-08-14 Thread Jamie Johnson
First, sorry for posting both here and to the Solr list. I'm not sure where this is most appropriately asked, but since there has been no response there I figured I'd try here... I have what I believe to be a fairly unique use case (as I have not seen it mentioned before) that I'm looking for some thoughts on. I

Re: Lucene TermsFilter lookup slow

2015-08-09 Thread jamie
Mike Thank you kindly for the reply. I am using Lucene v4.10.4. Are the optimizations you refer to available in this version? We haven't yet upgraded to Lucene 5 as there appear to be many API changes. Jamie On 2015/08/08 5:13 PM, Michael McCandless wrote: Which version of Lucene ar

Lucene TermsFilter lookup slow

2015-08-08 Thread jamie
Much appreciate Jamie

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
Sorry to suggest that it was a Lucene bug. It is rare we encounter Lucene bugs - a testament to your code quality. Much appreciate! On 2014/07/16, 12:33 PM, Jamie wrote: Uwe Thank you. I think your earlier hint regarding precision steps solved it. I noticed that new Long was created with a

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
che/lucene/util/NumericUtils.html#PRECISION_STEP_DEFAULT>(4). Regards Jamie On 2014/07/16, 12:24 PM, Uwe Schindler wrote: Sorry I cannot give you any hints, except that handling of Filters and Queries and per-segment search changed dramatically in Lucene 4 (see migration guide). Mayb

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
Uwe The range query works with indexes created by 3.8.1, but not 4.0.0. To me, this indicates that Lucene 4.0.0 is not indexing these longs correctly, or something of the sort. Do you have any ideas on where to look? Jamie On 2014/07/16, 12:15 PM, Uwe Schindler wrote: Sorry, no you

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
Uwe Is there any way we can downgrade to 3.8.1 while being able to read indexes created from 4.0.0? On 2014/07/16, 11:45 AM, Uwe Schindler wrote: If you index as a long field, you have to use NumericRangeQuery to query. A simple TermRangeQuery as created by QueryParser does not work, because

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
2014/07/16, 12:07 PM, Jamie wrote: Uwe Thanks for the suggestion. When I inspect the query in Eclipse, I can clearly see that a NumericRangeQuery is constructed. Why would this query work in an older version of Lucene but not 4.0.0? Jamie

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
Uwe Thanks for the suggestion. When I inspect the query in Eclipse, I can clearly see that a NumericRangeQuery is constructed. Why would this query work in an older version of Lucene but not 4.0.0? Jamie On 2014/07/16, 11:45 AM, Uwe Schindler wrote: If you index as a long field, you have

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
te) value,DateUtil.minuteFormatFast); LongField docField = new LongField(indexFieldName, Long.parseLong(date), store); doc.add(docField); Any ideas? Jamie On 2014/07/16, 11:35 AM, Jamie wrote: Hi This query does not work either: QueryWrapperFilter(+archivedate:[20140708 TO 20140731]

Re: lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
bind? Jamie On 2014/07/16, 11:15 AM, Jamie wrote: Hi I have a situation whereby the following filter query works with Lucene 3.8.1 ChainedFilter: [QueryWrapperFilter(+receivedate:[20140701 TO 20140731] +cat:email) org.apache.lucene.sandbox.queries.DuplicateFilter@6a894d44 ] With

lucene 4.0.0 range query broken?

2014-07-16 Thread Jamie
. It doesn't work. Is this a bug? The problem seems to occur when receivedate range query is added. Jamie

Re: deleteDocument with NRT

2014-07-11 Thread Jamie
this is discouraged since it penalizes the unlucky queries that do the reopen. It's better to use a separate background thread, that periodically calls maybeReopen. Finally, be sure to call close once you are done. On Jul 10, 2014, at 01:56 PM, Jamie wrote: Hi I am using NRT searc

deleteDocument with NRT

2014-07-10 Thread Jamie
are still returned. What is the recommended way to obtain a near realtime search result that immediately reflects all deleted documents? Much appreciate Jamie

Re: frozen in PriorityQueue.downHeap for more than 25 minutes

2014-06-23 Thread Jamie
Toke On 2014/06/23, 2:08 PM, Toke Eskildsen wrote: On Mon, 2014-06-23 at 13:53 +0200, Jamie wrote: if (startIdx==0) { topDocs = indexSearcher.search(query,queryFilter,searchResult.getPageSize(), sort); } else { topDocs = indexSearcher.searchAfter(p.startScoreDoc, query, queryFilter

Re: frozen in PriorityQueue.downHeap for more than 25 minutes

2014-06-23 Thread Jamie
Toke How does one sort the results of a collector as opposed to the entire result set? Do I need to implement my own sort algorithm or is there a way to do this with Lucene? If so, which API functions do I need to call? Thanks Jamie On 2014/06/23, 1:43 PM, Toke Eskildsen wrote: On Mon

Re: frozen in PriorityQueue.downHeap for more than 25 minutes

2014-06-23 Thread Jamie
much difference. Regards Jamie On 2014/06/23, 1:43 PM, Toke Eskildsen wrote: On Mon, 2014-06-23 at 13:33 +0200, Jamie wrote: While running a search over several million documents, the YourKit profiler reports a deadlock on the following me

frozen in PriorityQueue.downHeap for more than 25 minutes

2014-06-23 Thread Jamie
.runWorker(ThreadPoolExecutor$Worker) java.util.concurrent.ThreadPoolExecutor$Worker.run() java.lang.Thread.run() Much appreciate Jamie

Re: search performance

2014-06-20 Thread Jamie
rives. We're using the following JRE parameters: -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:SurvivorRatio=3 -XX:+AggressiveOpts. Let me know if there is anything else we can try to obtain performance gains. Much appreciate Jamie On 2014/06/20, 9:51 AM, Jamie wrote: Hi All Thank you fo

Re: search performance

2014-06-20 Thread Jamie
impact of searching across multiple indexes? Am I correct that SearcherManager can't be used with a MultiReader and NRT? I would appreciate all suggestions on how to optimize our search performance further. Search time has become a usability issue. Much ap

Re: timing merges

2014-06-12 Thread Jamie
Erick We are not using Solr. We are using the latest version of Lucene directly. When I run it in a profiler, I can see all indexing threads blocked on merge for long stretches at a time. Regards Jamie On 2014/06/12, 4:39 PM, Erick Erickson wrote: What version of Solr/Lucene? Merging is

Re: timing merges

2014-06-12 Thread Jamie
d way to resolve this? Regards Jamie On 2014/06/12, 4:04 PM, Erick Erickson wrote: Michael is, of course, the Master of Merges... I have to ask, though, have you demonstrated to your satisfaction that you're actually seeing a problem? And that fewer merges would actually address th

timing merges

2014-06-11 Thread Jamie
ement a custom merge policy? Thank you in advance for your consideration. Jamie

Re: searching with stemming

2014-06-10 Thread Jamie
Done. https://issues.apache.org/jira/browse/LUCENE-5749 On 2014/06/10, 1:18 AM, Jack Krupansky wrote: Please do file a Jira. I'm sure the discussion will be interesting. -- Jack Krupansky

Re: searching with stemming

2014-06-09 Thread Jamie
Jack Thanks. I figured as much. I'm modifying each analyzer with constructors that take a Stem argument: public enum Stem { AGGRESSIVE, LIGHT, NONE }; This is obviously not ideal: 20 or more Lucene classes must be updated. I now need to maintain each analyzer. Regards Jamie On

Re: searching with stemming

2014-06-09 Thread Jamie
only thing different for each language will be the stop words, so you can have one analyzer class with a language parameter. On Jun 9, 2014 7:02 AM, "Jamie" wrote:

Re: searching with stemming

2014-06-09 Thread Jamie
I am not using Solr. I am using the default analyzers... On 2014/06/09, 12:59 PM, Benson Margulies wrote: Are you using Solr? If so you are on the wrong mailing list. If not, why do you need a non-anonymous analyzer at all? On Jun 9, 2014 6:55 AM, "Jamie" wrote: To me, it see

Re: searching with stemming

2014-06-09 Thread Jamie
To me, it seems strange that these default analyzers don't provide constructors that enable one to override stemming, etc. On 2014/06/09, 12:39 PM, Trejkaz wrote: On Mon, Jun 9, 2014 at 7:57 PM, Jamie wrote: Greetings Our app currently uses language specific analysers (e.g. EnglishAna

Re: searching with stemming

2014-06-09 Thread Jamie
Benson Yes, I can of course do this, but as far as I can see I would have to override each analyzer. This is a pain. Regards Jamie On 2014/06/09, 12:29 PM, Benson Margulies wrote: You should construct an analysis chain that does what you need. Read the source of the relevant analyzer and pick the

searching with stemming

2014-06-09 Thread Jamie
ation. Jamie

Re: search performance

2014-06-06 Thread Jamie
nearest cached position. Cheers Jamie On 2014/06/03, 3:24 PM, Jon Stewart wrote: With regards to pagination, is there a way for you to cache the IndexSearcher, Query, and TopDocs between user pagination requests (a lot of webapp frameworks have object caching mechanisms)? If so, you may have

Re: search performance

2014-06-03 Thread Jamie
Thanks Jon I'll investigate your idea further. It would be nice if, in future, the Lucene API could provide a searchAfter that takes a position (int). Regards Jamie On 2014/06/03, 3:24 PM, Jon Stewart wrote: With regards to pagination, is there a way for you to cache the IndexSea

Re: search performance

2014-06-03 Thread Jamie
Robert. Thanks, I've already done a similar thing. Results on my test platform are encouraging. On 2014/06/03, 2:41 PM, Robert Muir wrote: Reopening for every search is not a good idea. This will have an extremely high cost (not as high as what you are doing with "paging" but still not good).

Re: search performance

2014-06-03 Thread Jamie
Robert FYI: I've modified the code to utilize the experimental function.. DirectoryReader dirReader = DirectoryReader.openIfChanged(cachedDirectoryReader,writer, true); In this case, the IndexReader won't be opened on each search, unless absolutely necessary. Regards Jami

Re: search performance

2014-06-03 Thread Jamie
Robert Hmmm. why did Mike go to all the trouble of implementing NRT search, if we are not supposed to be using it? The user simply wants the latest result set. To me, this doesn't appear out of scope for the Lucene project. Jamie On 2014/06/03, 1:17 PM, Robert Muir wrote: No, yo

Re: search performance

2014-06-03 Thread Jamie
protected IndexReader initIndexReader() { List readers = new LinkedList<>(); for (Writer writer : writers) { readers.add(DirectoryReader.open(writer, true)); } return new MultiReader(readers,true); } Thank you for your ideas/suggestions. Regards Jamie On 2014/06/03, 12:29 PM,

Re: search performance

2014-06-03 Thread Jamie
ed to store all scoredocs for the entire result set. When there are 60M+ results, this can be problematic in terms of memory consumption. It would be far nicer if there was a searchAfter function that took a position as an integer. Regards

Re: search performance

2014-06-03 Thread Jamie
FYI: We are also using a multireader to search over multiple index readers. Search under a million documents yields good response times. When you get into the 60M territory, search slows to a crawl. On 2014/06/03, 11:47 AM, Jamie wrote: Sure... see below

Re: search performance

2014-06-03 Thread Jamie
"); } catch (Exception io) { throw new BlobSearchException("failed to execute search query {searchquery='"+ getSearchQuery() + "}", io, logger, ChainedException.Level.DEBUG); } } On 2014/06/03, 11:41 AM, Rob Audenaerde wrote: Hi Jamie, What is included

Re: search performance

2014-06-03 Thread Jamie
hits, takes 5 minutes to complete. Regards Jamie On 2014/06/03, 10:54 AM, Vitaly Funstein wrote: Something doesn't quite add up. TopFieldCollector fieldCollector = TopFieldCollector.create(sort, max,true, false, false, true); We use pagination, so only returning 1000 documents or so

Re: search performance

2014-06-03 Thread Jamie
Toke Thanks for the contact. See below: On 2014/06/03, 9:17 AM, Toke Eskildsen wrote: On Tue, 2014-06-03 at 08:17 +0200, Jamie wrote: Unfortunately, in this instance, it is a live production system, so we cannot conduct experiments. The number is definitely accurate. We have many different

Re: search performance

2014-06-02 Thread Jamie
integration code is fairly well optimized. I've requested access to the indexes so that we can perform further testing. Regards Jamie On 2014/06/03, 8:09 AM, Toke Eskildsen wrote: On Mon, 2014-06-02 at 08:51 +0200, Jamie wrote: [200GB, 150M documents] With NRT enabled, search speed is roug

Re: search performance

2014-06-02 Thread Jamie
ry using a MMapDirectory and see if that improves performance. Also, regarding the pagination, you said you're retrieving 1000 documents at a time. Does that mean that if a query matches 1 documents you want all of them retrieved ? On Mon, Jun 2, 2014 at 12:51 PM, Jamie wrote:

Re: search performance

2014-06-02 Thread Jamie
I was under the impression that NRTCachingDirectory will instantiate an MMapDirectory if a 64 bit platform is detected? Is this not the case? On 2014/06/02, 2:09 PM, Tincu Gabriel wrote: MMapDirectory will do the job for you. RamDirectory has a big warning in the class description stating that

Re: search performance

2014-06-02 Thread Jamie
the data? How frequent are your commits for updates while doing queries? Around ten to fifteen documents are being constantly added per second. Thanks again Jamie

Re: search performance

2014-06-02 Thread Jamie
index use? Not sure exactly what you are referring to here. We do have a lot of stored fields (to, from, bcc, cc, etc.). The body and attachments are analyzed. Regards Jamie

search performance

2014-06-01 Thread Jamie
earch project on the Lucene level that we could use. I realize this is a rather vague question, but are there any further suggestions on ways to improve search performance? We need cheap and dirty ideas, as well as longer term advice on a possible path forward. Mu

Re: writer.updateDocument() not working (possible bug?)

2014-05-19 Thread Jamie
Michael Thanks for the clarification. This is a hefty limitation of Lucene. One would expect that you would be able to update a specific field in the index without having to reindex the entire document. Regards Jamie On 2014/05/16, 11:34 PM, Michael McCandless wrote: You can

Re: writer.updateDocument() not working (possible bug?)

2014-05-16 Thread Jamie
Michael How do you update a document that resides in the index without having the original document? Jamie On 2014/05/13, 3:30 PM, Michael McCandless wrote: How did you produce the document that you are sending to updateDocument? Are you loading it from IndexReader.document() or

writer.updateDocument() not working (possible bug?)

2014-05-15 Thread Jamie
to call commit() and/or close() immediately after the update, but it makes no difference. This occurs both in Lucene 4.7.2 and 4.8. As far as we know, our code used to work with prior versions of Lucene. Has anyone encountered this? Regards Jamie

Re: writer.updateDocument() not working (possible bug?)

2014-05-13 Thread Jamie
. Can you confirm whether the full document will be returned when I call searcher.doc(scoreDoc.doc)? if not, what is the recommended way to get the original document from the index (if possible)? Perhaps, reader.document(i)? Thanks in advance Jamie On 2014/05/13, 3:30 PM, Michael McCandless

Re: Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Jamie
, the filter is ignored. I am busy trying to isolate the issue, since the code is running in a wider system among other complexities. Jamie On 2014/03/28, 4:08 PM, Steve Rowe wrote: Hi Jamie, What does EmailFilter do? Why is the expanded form "required for the UAX29URLEmailToke

Re: Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Jamie
back to you. Thank you. On 2014/03/28, 5:28 PM, Steve Rowe wrote: Jamie, UAX29URLEmailTokenizer does not emit email components as tokens; “john@mycompany.com.au” will be tokenized as “john@mycompany.com.au”, nothing more. That’s why I asked what EmailFilter does. If the filter

Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Jamie
applied it, but it makes no difference. I tried to downgrade Lucene, but it won't read the 4.6 indexes. Can anyone suggest a way forward? Thanks for your recommendations Jamie - public final class EmailAnalyzer extends StopwordAnalyzerBase { public st

Re: Limiting the fields a user can query on

2014-02-20 Thread Jamie Johnson
I would be fine with throwing a parse exception or excluding the particular clause. I will look at the StandardQueryNodeProcessorPipeline as well as Hoss' suggestion. Thank you very much! On Thu, Feb 20, 2014 at 4:20 AM, Trejkaz wrote: > On Thu, Feb 20, 2014 at 1:43 PM, Jamie Johnso

Limiting the fields a user can query on

2014-02-19 Thread Jamie Johnson
Is there a way to limit the fields a user can query by when using the standard query parser or a way to get all fields/terms that make up a query without writing custom code for each query subclass?

Re: Lucene 4.0: Custom Query Parser newTermQuery(Term term) override

2012-07-11 Thread Jamie
Yonik Thanks for the tip. However, from what I can see, I still need to return a TermQuery specific to each data type. Does anyone know how to convert a string value to TermQuery for each data type? Jamie On 2012/07/11 3:42 PM, Yonik Seeley wrote: On Wed, Jul 11, 2012 at 9:34 AM, Jamie

Lucene 4.0: Custom Query Parser newTermQuery(Term term) override

2012-07-11 Thread Jamie
(term .text(); for the various field type. A quick pointer would be most appreciated. Thanks Jamie public class CustomQueryParser extends QueryParser { @Override protected Query newTermQuery(Term term) { if (term.field().equals("uid")) { return super.ne

Re: Store a query in a database for later use

2012-05-17 Thread Jamie Johnson
I think you want to have a look at the QueryParser classes. Not sure which you're using to start with but probably the default QueryParser should suffice. On Thu, May 17, 2012 at 3:53 PM, Stefan Undorf wrote: > Hi, > > I want to store a query for later use in a database, like: > > 1. queryToPers

Re: combine results from multiple queries & sort

2012-03-14 Thread Jamie
y,idFilter,tfc); Regards Jamie On 2012/03/14 12:44 PM, Li Li wrote: It's a very common problem. Many of our users (including programmers who are familiar with SQL) have the same question. Compared with SQL, all queries in Lucene are based on an inverted index. Fortunately, when searching, we

combine results from multiple queries & sort

2012-03-14 Thread Jamie
ts are unsorted after they are combined into a single linked list. What is the best way to sort the combined results based upon any chosen field in the Lucene index? Is there a way to do this that would leverage Lucene's built-in sort abilities? Many thanks for your consideration Jamie

Re: query for documents WITHOUT a field?

2012-02-16 Thread Jamie Johnson
Another possible solution is to insert, while indexing, a custom token that cannot otherwise show up in the index, then filter based on that token. On Thu, Feb 16, 2012 at 4:41 PM, Uwe Schindler wrote: > As the documentation states: > Lucene is an inverted index that does not have p

effectiveness of compression

2012-02-15 Thread Jamie
ady exists in the index? I am worried about our index size growing too large when pursuing this strategy (i.e. one of creating a new Lucene document for every version of a file). Many thanks for your consideration. Jamie --

Re: comparing index fields within a query

2012-01-23 Thread Jamie
pabilities? Jamie On 2012/01/23 2:21 PM, Ian Lea wrote: I guess you could do it in a custom Collector. They get passed readers and docids so you could do the lookups and comparison. There will be performance implications that you may be able to minimise via FieldCache. Storing the result in

comparing index fields within a query

2012-01-23 Thread Jamie
my case, it is not possible to store the result of deleted_date>modified_date at the time of indexing. Thanks Jamie

Re: Lucene 4.0 Index Format Finalization Timetable

2011-12-07 Thread Jamie Johnson
s we need to stick w/3.x for now.  You might be in a > different situation if you really need the 4.0 changes.  Maybe you can just > stick w/the current trunk and take responsibility for patching critical > bugfixes, hoping you won't have to recreate your index too many times... > &g

Re: Lucene 4.0 Index Format Finalization Timetable

2011-12-06 Thread Jamie Johnson
p://8ball.tridelphia.net/ > > > On 12/06/2011 08:46 PM, Jamie Johnson wrote: >> >> Thanks Robert.  Is there a timetable for that?  I'm trying to gauge >> whether it is appropriate to push for my organization to move to the >> current lucene 4.0 implementation

Re: Lucene 4.0 Index Format Finalization Timetable

2011-12-06 Thread Jamie Johnson
rently on trunk. I'm not looking for anything hard, just trying to plan as much as possible understanding that this is one of the implications of using trunk. On Tue, Dec 6, 2011 at 6:48 PM, Robert Muir wrote: > On Tue, Dec 6, 2011 at 6:41 PM, Jamie Johnson wrote: >> Is there a time

Lucene 4.0 Index Format Finalization Timetable

2011-12-06 Thread Jamie Johnson
Is there a timetable for when it is expected to be finalized? I'm not looking for an exact date, just an approximation (next month, 2 months, 6 months, etc.)

Re: File Handle Leaks During Lucene 3.0.2 Merge

2010-09-30 Thread Jamie
to be expected when using NRT search? I am pretty certain that all Searchers are released at the end of every search. I double checked it at least twenty times. Jamie On 2010/09/30 11:56 PM, Michael McCandless wrote: On Thu, Sep 30, 2010 at 5:59 AM, Jamie wrote: Hi Michael / Uwe It&

Re: File Handle Leaks During Lucene 3.0.2 Merge

2010-09-30 Thread Jamie
e to delete the files. I think it's because our new code never closes the indexwriter, but rather uses the indexwriter.commit() method to apply the changes. Is this correct? Jamie

Re: File Handle Leaks During Lucene 3.0.2 Merge

2010-09-30 Thread Jamie
writer stays open, handles left by merge operations are never deleted. A solution is to close the index periodically to force the handles to be swept up by the OS. Jamie On 2010/09/30 10:55 AM, Uwe Schindler wrote: The finalize() thing does not work correctly, as the reader holds still

Re: File Handle Leaks During Lucene 3.0.2 Merge

2010-09-29 Thread Jamie
something wrong, but what? Jamie On 2010/09/29 8:21 PM, Uwe Schindler wrote: The "deleted" files are only freed by OS kernel if no longer an IndexReader accesses them. Did you get a new realtime reader after merging and *closed* the old one? - Uwe Schindler H.-H.-Meier-Allee 63,

File Handle Leaks During Lucene 3.0.2 Merge

2010-09-29 Thread Jamie
Hi There I am noticing file handle leaks appearing on Index files. I think the leaks occur during the Lucene merge operation. Lsof reports the following: java 28604 root 213r REG 8,33 1098681 57409621 /var/index/vol201009/_a4w.cfs (deleted) java 28604 roo

Re: [Fwd: Re: Lucene 3.0 Search Performance Stats]

2010-03-22 Thread Jamie
second?) - if you have any custom analyzers, optimize them for efficiency If you have realtime indexes - cache index readers for a few seconds at a time Regards, Jamie On 2010/03/22 03:12 PM, suman.hol...@zapak.co.in wrote: Hi , I am also using range based searches for dates .I am

Re: Lucene 3.0 Search Performance Stats

2010-03-22 Thread Jamie
Hi Everyone The stats I sent through earlier were erroneous due to the fact that the date range query selected fewer records than stated. The correct stats are: Lucene 3.0 Stats: Search conducted using Lucene's Realtime search feature (writer.getReader() for each search) Analyzer: Russian Analyzer

Re: Lucene 3.0 Search Performance Stats

2010-03-20 Thread Jamie
ory consumption goes through the roof. Using Numerics avoids this problem. There are a lot of other strategies we used to reduce memory consumption. Like, making sure that you are caching Searchers and IndexReaders, etc. Regards, Jamie On 2010/03/19 07:04 PM, Monique Monteiro wrote: Hi Jamie

Re: Lucene 3.0 Search Performance Stats

2010-03-19 Thread Jamie
I forgot to point out, this is a search using the Lucene realtime search feature. We get the reader from indexwriter.getReader() for each search. On 2010/03/19 01:49 PM, Jamie wrote: Hi Guys I just wanted to congratulate the Lucene guys for a fine job on 3.0!! Since we switched our indexes

Lucene 3.0 Search Performance Stats

2010-03-19 Thread Jamie
Index Size: 24 GB (non-optimized) Search Speed: 0.06 - 0.09 seconds (with sort YYMMHHSS date) Index stored on 4 SAS HDD Hitachi RAID 10 16G RAM 2x Xeon 4 core 2.4GHz OS FreeBSD 7.2 Filesystem UFS2 gjournal I believe we are using all search performance recommendations now. Good job! Jamie

Re: OutOfMemory ParallelMultisearcher

2010-03-17 Thread Jamie
d what it all means. If I change my date string to a Numeric integer in the format MMddHHmm, what should the precisionStep value be? Jamie On 2010/03/17 01:20 PM, Ian Lea wrote: Hi Caching searchers at some level should help keep memory usage down - and will help performance too.

OutOfMemory ParallelMultisearcher

2010-03-16 Thread Jamie
ParallelMultisearcher tend to consume a large amount of memory by itself when used on large indices? If so, do you have any suggestions on how I might support the above scenario (i.e. when the indexes used change from one query to the next) Thanks in advance Jamie

Contrib Lucene Analyzers & Stemming

2010-02-10 Thread Jamie
using Lucene 2.9.1, stemming appeared to work and after having upgraded to Lucene 3.0, it does not. Thanks in advance Jamie

Unexpected Query Results

2010-02-03 Thread Jamie
ivedate:время receiveddate:время from:время to:время subject:время cc:время bcc:время deliveredto:время flag:время sensitivity:время sender:время recipient:время body:время attachments:время attachname:время memberof:время )) I am not sure why query 2 returns 0 hits. In my mind it should return 48 hits a

Re: Email Filter using Lucene 3.0

2010-01-29 Thread Jamie
Hi Uwe Thanks so much for your help. I now understand Token Filters much better and your suggestion worked! Here's the code for anyone else who is interested. import org.apache.commons.logging.*; import org.apache.lucene.analysis.TokenStream; import org.apache.lucene.analysis.TokenFilter; imp

Email Filter using Lucene 3.0

2010-01-29 Thread Jamie
Hi There In the absence of documentation, I am trying to convert an EmailFilter class to Lucene 3.0. It's not working! Obviously, my understanding of the new token filter mechanism is misguided. Can someone in the know help me out for a sec and let me know where I am going wrong. Thanks. impo

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2010-01-29 Thread Jamie
Hi There In the absence of documentation, I am trying to convert an EmailFilter class to Lucene 3.0. It's not working! Obviously, my understanding of the new token filter mechanism is misguided. Can someone in the know help me out for a sec and let me know where I am going wrong. Thanks. impo

Re: file open handles?

2010-01-27 Thread Jamie
Hi Jake We got to the bottom of it. Turned out to be a status page that was opening the reader to obtain docCount but not closing it. Thanks for your help! Jamie On 2010/01/27 10:48 AM, Jamie wrote: Hi Jake Ok. The number of file handles left open is increasing rapidly. For instance, 4200

Re: file open handles?

2010-01-27 Thread Jamie
application stops in its tracks, so this is definitely a critical issue that must be resolved. Jamie On 2010/01/27 10:24 AM, Jake Mannix wrote: On Wed, Jan 27, 2010 at 12:17 AM, Jamie wrote: Oh! Re-reading your initial post - you're just seeing lots of files which haven't quit

Re: file open handles?

2010-01-27 Thread Jamie
Hi Jake You were indexing but not searching? So you are never calling getReader() in the first place? Of course, the call exists, it's just that during testing we did not execute any searches at all. How have you been doing search in a realtime fashion with Lucene before 2.9's introduction

Re: file open handles?

2010-01-26 Thread Jamie
place. We have been using Lucene search on a real time basis for years and have not experienced any problems until now. Thanks for the tip on Zoie, but we cannot use any additional frameworks on top of Lucene as Lucene is now deeply integrated into our app. Regards, Jamie On 2010/01/27 08:42

Re: file open handles?

2010-01-26 Thread Jamie
ideas? Jamie On 2010/01/27 02:32 AM, Jason Rutherglen wrote: Jamie, How often are you calling getReader? Is it only these files? Jason On Tue, Jan 26, 2010 at 12:58 PM, Jamie wrote: Ok. I spoke too soon. The problem is not solved. I am still seeing these file handles lying around. Is

Re: file open handles?

2010-01-26 Thread Jamie
/var/index/vol201001/_5q4.cfs (deleted) java 17558 root 903r REG 8,1 1294 246663 /var/index/vol201001/_5q5.cfs (deleted) On 2010/01/26 10:09 PM, Jamie wrote: HI Jason Thanks a ton. Problem solved. No more stray file handles! Jamie On 2010/01/26 10:03 PM, Jason

Re: file open handles?

2010-01-26 Thread Jamie
HI Jason Thanks a ton. Problem solved. No more stray file handles! Jamie On 2010/01/26 10:03 PM, Jason Rutherglen wrote: You can call close on the reader obtained via writer.getReader. Well, actually, you'll need to. :) The underlying writer will not be affected though. On Tue, J

Re: file open handles?

2010-01-26 Thread Jamie
Hi Jason No. I wasn't sure whether I needed to or not. We have just switched over to using the writer.getReader() method and I was worried that if I closed the Reader, the Writer would be closed too. Is this misguided? Jamie On 2010/01/26 09:40 PM, Jason Rutherglen wrote: Jamie, Ar

Re: file open handles?

2010-01-26 Thread Jamie
simultaneously. Our indexing process keeps the index open at all times and merely commits changes to the index. Obviously, when the server is stopped, the index is closed. When searches take place, we use getReader from the IndexWriter. Should there be a close taking place? Jamie On 2010/01/26 08:54

file open handles?

2010-01-26 Thread Jamie
these file handles still open even though the files are deleted? We are worried about the system running out of file handles over a period of time since the number of open file handles appears to be increasing due to Lucene. Could somebody please illuminate. Thanks Jamie java 5121 root 1649
