G1 warming on lucene wiki

2018-09-27 Thread Jeff Courtade
Hello, I have been looking into tuning the garbage collector for solr. I found this entry on the lucene wiki that seems to be out of date. The bug referenced is reported as resolved now. Could someone validate whether it is safe to use G1 garbage collection with lucene? "Do not, under any circum

Unexpected scoring results

2017-07-18 Thread Jeff Wallace
been fixed and/or reduced in later versions (say 5.x or 6.x)? Thank you for any info. Jeff Wallace Software Development, FileNet IBM Corp. 1540 Scenic Ave. Costa Mesa, CA 92626 (714) 327-7163 direct - To unsubscribe, e-mail: java

How about lucene's delete performance ?

2010-10-13 Thread Jeff Zhang
Hi all, I only want to index the latest week's data; older data can be deleted. So I'd like to know about lucene's delete performance and whether it will have an impact on search performance when I do lots of delete operations in the meantime. Thanks -- Best Rega

Re: What is the best practice of using synonymy ?

2010-03-23 Thread Jeff Zhang
> > > > - > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > -- Best Regards Jeff Zhang

What is the best practice of using synonymy ?

2010-03-22 Thread Jeff Zhang
ow which one is better, any help is appreciated. -- Best Regards Jeff Zhang

Re: Scale Out

2010-02-08 Thread Jeff Zhang
, e-mail: java-user-h...@lucene.apache.org > > -- Best Regards Jeff Zhang

RE: Lower/Uppercase problem when searching in a not-analyzed field

2009-12-14 Thread Jeff Plater
h time you won't be able to use wildcard searching (unless you don't care about wildcard searching). -Jeff -Original Message- From: Michel Nadeau [mailto:aka...@gmail.com] Sent: Mon 12/14/2009 4:36 PM To: java-user@lucene.apache.org Subject: Lower/Uppercase problem when searchi

RE: Sort fields shouldn't be tokenized

2009-11-16 Thread Jeff Plater
Thanks - so if my sort field is a single term then I should be ok with using an analyzer (to lowercase it for example). -Jeff -Original Message- From: J.J. Larrea [mailto:j...@panix.com] Sent: Monday, November 16, 2009 11:19 AM To: java-user@lucene.apache.org Subject: Re: Sort fields

Sort fields shouldn't be tokenized

2009-11-16 Thread Jeff Plater
words and such) which can produce an invalid sort order? Thanks. -Jeff - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

RE: Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread Jeff Plater
Thanks for the suggestion - I double checked the case and it was OK. Turned out I needed to use the StandardAnalyzer instead of the WhitespaceAnalyzer. -Jeff -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Wednesday, November 11, 2009 6:52 PM To: java-user

RE: Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread Jeff Plater
Thanks - I tried it out and it seems to work for "Philadelphid~0.75 PA" but I can't get it working for "Phil* PA" yet. Perhaps it is an issue with my Analyzer (I am using WhitespaceAnalyzer)? Have you used it with wildcard before? -Jeff -Original Messag

Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread Jeff Plater
omplish this? Right now I am having to hit a look up table to translate the city before searching against the main index - not a fan of this option. Thanks. -Jeff Plater

Re: Distinct terms values? (like in Luke)

2009-05-10 Thread Jeff Turner
ligion" in documents published within a range of dates. Thanks Jeff On May 10, 2009, at 11:35 AM, Uwe Schindler wrote: You can get this list using IndexReader.terms(new Term(fieldname,"")). This returns an enumeration of all terms starting with the given one (the field name). Just
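The enumeration idea in the reply above can be sketched in plain Java without Lucene on the classpath. This is only a model under stated assumptions: FieldTerm and distinctValues are hypothetical names, and the sorted set stands in for a term dictionary ordered by field and then by text. Because the ordering is by field first, the loop has to stop as soon as the field name changes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class DistinctTermsSketch {
    // Hypothetical stand-in for an index term: ordered by field name, then by term text.
    public static final class FieldTerm implements Comparable<FieldTerm> {
        final String field;
        final String text;

        public FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(FieldTerm other) {
            int byField = field.compareTo(other.field);
            return byField != 0 ? byField : text.compareTo(other.text);
        }
    }

    // Seek to (field, "") and walk forward until the field name changes.
    public static List<String> distinctValues(TreeSet<FieldTerm> terms, String field) {
        List<String> values = new ArrayList<>();
        for (FieldTerm term : terms.tailSet(new FieldTerm(field, ""), true)) {
            if (!term.field.equals(field)) {
                break; // ran past the last term of the requested field
            }
            values.add(term.text);
        }
        return values;
    }

    public static void main(String[] args) {
        TreeSet<FieldTerm> terms = new TreeSet<>();
        terms.add(new FieldTerm("author", "smith"));
        terms.add(new FieldTerm("subject", "religion"));
        terms.add(new FieldTerm("subject", "politics"));
        terms.add(new FieldTerm("title", "lucene"));
        System.out.println(distinctValues(terms, "subject")); // prints [politics, religion]
    }
}
```

In a real index the same loop shape would run over the term enumeration returned by the reader; the stop condition on the field name is the part that is easy to forget.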

Feasibility question

2008-11-10 Thread Jeff Capone
document and I thought I would treat each field as a key word to minimize processing. Assuming you have clusters operating on independent datasets (so I guess it would scale linearly) and you want to process Terabytes of logs per day, is such a solution even feasible? Thank you, Jeff Capone

Query to ignore certain phrases

2008-08-11 Thread Jeff French
We're trying to perform a query where if our intended search term/phrase is part of a specific larger phrase, we want to ignore that particular match, but not the entire document (unless of course there are no other hits with our intended term/phrase). For example, a query like: "white house"

Re: Hit Count per Document

2007-12-20 Thread Jeff
If I am not mistaken, that is for a term.. Is it possible for a query? In the below example, I don't want to know how many times brown is in the document I want to know how many times "quick brown" is in the document. Thanks, Jeff On Dec 20, 2007 3:03 PM, Mark Miller <[EMAIL

Hit Count per Document

2007-12-20 Thread Jeff
slow brown fox jumped over the lazy dog If I searched for "quick brown", is there a way I could see that it was hit 4 times within the document? Thanks, Jeff

Question about Qsol parser and phrase searches

2007-09-09 Thread Jeff French
ere a way to default to no slop, preferably without changing all of our queries? Thanks for any pointers. Jeff -- View this message in context: http://www.nabble.com/Question-about-Qsol-parser-and-phrase-searches-tf4410480.html#a12582

Lucene and DRBD

2007-08-17 Thread Jeff Gutierrez
a.org/wiki/DRBD Thanks, Jeff - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Nested concept fields

2007-08-12 Thread Jeff French
do something like this (in search pseudocode): sent:(expired num[1 TO 5] "days ago") I don't see how to do this using either Lucene's QueryParser or the QsolParser. Is it possible to do it using the Query API (and the appropriate indexing changes)? Thanks for any pointers.

Re: Nested Fields

2007-08-09 Thread Jeff French
od to the buffer for each parent element. Then I removed the current element and added its content as a Field. I should add that I am also fairly new to Lucene, so just because I did it that way doesn't mean it's the best or even a good way. Jeff Spencer Tickner wrote: >

Re: Search terms on a single "instance" of field

2007-07-29 Thread Jeff French
rmB"~99) I did this playing around with table cells, and it seems to work so far. Jeff rossini wrote: > > Actually no, > >Because I'd like to retrieve terms that were computed on the same > instance of Field. Taking your example to illustrate better, I have 2 >

replace values in index

2007-07-12 Thread Jeff
eparator. Is there an easy way to add ',' as a token separator? Thanks, -Jeff

RE: ways to minimize index size?

2007-03-14 Thread Jeff
I found that reducing my index from 8G to 4G (through not stemming) gave me about a 10% performance improvement. How did you do this? I don't see this as an option. Jeff

RE: Q: Highlighter + Search symbols "*, ?, ~"

2006-11-21 Thread Storey, Jeff
Thanks for the quick reply. I'll be implementing this in the next couple of days. Appreciate it! Jeff -Original Message- From: Stephan Spat [mailto:[EMAIL PROTECTED] Sent: Monday, November 20, 2006 8:43 AM To: java-user@lucene.apache.org Subject: Re: Q: Highlighter + Search sy

RE: Partial Word Matches

2006-11-11 Thread Storey, Jeff
Erick, Very useful answers -- I'll be reading up more with the links you've provided. Thanks. Jeff -Original Message- From: Erick Erickson [mailto:[EMAIL PROTECTED] Sent: Saturday, November 11, 2006 5:51 PM To: java-user@lucene.apache.org Subject: Re: Partial Word Matches

RE: Partial Word Matches

2006-11-11 Thread Storey, Jeff
arch for the term "yellow~" I might get something like "bellow." Is there a way to list what Lucene found in the document that made it relevant? Thanks for all the help. Jeff -Original Message- From: Paul Borgermans [mailto:[EMAIL PROTECTED] Sent: Saturday, November 11,

RE: Partial Word Matches

2006-11-11 Thread Storey, Jeff
IndexSearcher to search the parsed query created in Step 3. That's it. Is this the proper way to be doing searching? Thanks. Jeff -Original Message- From: Paul Borgermans [mailto:[EMAIL PROTECTED] Sent: Saturday, November 11, 2006 3:06 PM To: java-user@lucene.apache.org Subject: Re: Partial

Partial Word Matches

2006-11-11 Thread Storey, Jeff
Hi. I'm using Lucene to do some searching (using the Searcher object and passing it a ParsedQuery). I search for a word such as "long" and it is returning partial matches, such as "belong" and "along." Is there a way to turn off this behavior and only match whole words? Thank you, Jeff

Re: Query question

2006-11-05 Thread jeff . richley
ueryParser to build your queries for you, use the KeywordAnalyzer > to > : > make sure no lowercasing or stemming takes place. > : > 2) OMIT_NORMs when indexing .. they only matter if you want the > lengths > : > of fields to affect the score, and you don't -- you only want t

Re: Query question

2006-11-04 Thread jeff . richley
", "/a/b/c", Field.Store.YES, Field.Index.UN_TOKENIZED); document.add(location); Field name = new Field("name", "Jeff Richley", Field.Store.YES,

Re: Query question

2006-11-04 Thread jeff . richley
help would be greatly appreciated. > > : 1.) I have data like name="Jeff" lastname="Richley" age="33" and I need > to > : be able to query by any combination such as name="Jeff" age="33". But > if > : I query with name="

Re: Query question

2006-11-02 Thread jeff . richley
Ah good question. The data that I am needing to query on is not a set definition of tables or columns like a database is. Let me give two examples: 1.) I have data like name="Jeff" lastname="Richley" age="33" and I need to be able to query by any combination such

Query question

2006-11-02 Thread jeff . richley
I want to be able to put sets of data in a very structured way and query Lucene for only 100% matches. Is there a way to do this? I seem to be getting back at best 0.30685282. I appreciate any help and insight. Jeff Richley, Vice President Southeast Virginia Java Users Group [EMAIL

Re: 30 milllion+ docs on a single server

2006-08-13 Thread Jeff Rodenburg
On 8/12/06, Mark Miller <[EMAIL PROTECTED]> wrote: The single server is important because I think it will take a lot of work to scale it to multiple servers. The index must allow for close to real-time updates and additions. It must also remain searchable at all times (other than during the

Re: 30 milllion+ docs on a single server

2006-08-12 Thread Jeff Rodenburg
Why is a single server so important? I can scale horizontally much cheaper than I scale vertically. On 8/11/06, Mark Miller <[EMAIL PROTECTED]> wrote: I've made a nice little archive application with lucene. I made it to handle our largest need: 2.5 million docs or so on a single server. Now

Re: Distributed Search

2006-07-27 Thread Jeff Rodenburg
Hi Mark - Having gone down this path for the past year, I echo comments from others that scalability/availability/failover is a lot of work. We migrated away from a custom system based on Lucene running on Windows to Solr running on Linux. It took us 6 months to get our system to a solid five-n

Architecture for indexing/searching mailing list archives

2006-07-24 Thread Jeff Schnitzer
but I would like to understand the bounds of the problem a bit better. Any advice? Thanks, Jeff Schnitzer SubEtha Mailing List Manager - http://subetha.tigris.org/ - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

RE: Nutch- Better than Lucene?

2006-07-07 Thread Wang, Jeff
Heh, you said it better than I. I was just about to reply with the witty "Nutch is Lucene, isn't it?" Jeff -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Friday, July 07, 2006 10:28 AM To: java-user@lucene.apache.org Subject: Re: Nutch- Bet

RE: Lock File

2006-06-29 Thread Wang, Jeff
I have a clustered environment, with a load-balancer in the front assigning connections. Is it better to have one machine in the cluster running a searcher as a webservice (to be accessed by the other machines in the cluster) or to have an IndexReader/Searcher for each machine in the cluster? Jeff

RE: search performance benchmarks

2006-06-26 Thread Wang, Jeff
3.6Ghz I think.) I frankly haven't tested out scalability yet. Jeff Emptoris, Inc. -Original Message- From: Vladimir Olenin [mailto:[EMAIL PROTECTED] Sent: Monday, June 26, 2006 7:56 AM To: java-user@lucene.apache.org Subject: search performance benchmarks Hi, I'm evaluat

Re: Analyzer question

2006-05-19 Thread Jeff Rodenburg
The Keyword analyzer does no stemming or input modification of any sort: think of it as WYSIWYG for index population. The Whitespace analyzer simply splits your input on whitespace (still no stemming), so the tokens are the individual words. I don't have the code in front of me, so I'm not sure
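The difference between the two analyzers described above can be illustrated with a small plain-Java sketch. It does not call Lucene; keywordTokens and whitespaceTokens are hypothetical helpers that only mimic the tokenization behavior: one unmodified token for the whole input versus one token per whitespace-delimited word, with no lowercasing or stemming in either case.

```java
import java.util.ArrayList;
import java.util.List;

public class AnalyzerSketch {
    // Keyword-style analysis: the entire input becomes a single token, unchanged.
    public static List<String> keywordTokens(String input) {
        List<String> tokens = new ArrayList<>();
        tokens.add(input);
        return tokens;
    }

    // Whitespace-style analysis: split on runs of whitespace; case and word forms are kept.
    public static List<String> whitespaceTokens(String input) {
        List<String> tokens = new ArrayList<>();
        for (String part : input.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "Quick Brown Foxes";
        System.out.println(keywordTokens(input));    // one token: the whole string
        System.out.println(whitespaceTokens(input)); // three tokens, case preserved
    }
}
```

One practical consequence: a field analyzed keyword-style only matches a query for the exact full value, while the whitespace-style field matches on any single word, still case-sensitively.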

Re: Backing up indexes, reliability and robustness

2006-05-12 Thread Jeff Rodenburg
Marc - We built our index maintenance operation to assume a breakdown would occur in process (because it happened several times.) We exist in an environment where "always on, always available" is a business requirement. We also do a lot of updates on a cyclical basis (every 10 minutes), so malf

Re: Why is BooleanQuery.maxClauseCount static?

2006-04-15 Thread Jeff Rodenburg
y can sometimes cause problems when both types of queries need to execute simultaneously. -- j On 4/15/06, Paul Elschot <[EMAIL PROTECTED]> wrote: > > On Saturday 15 April 2006 18:20, Jeff Rodenburg wrote: > > What was the thinking behind making the BooleanQuery maxClauseCount a >

Why is BooleanQuery.maxClauseCount static?

2006-04-15 Thread Jeff Rodenburg
that use a high number of clauses, but another set that needs a low number of clauses (different indexes searched, and efficiencies dictate the high/low clause range.) cheers, jeff

Re: Speed up Indexing

2006-03-23 Thread Jeff Rodenburg
I run Lucene.Net as well, and your indexing performance is dependent on more factors aside from whether you're using the Java or C# version. As a basic suggestion, learn what you can about minMergeDocs and mergeFactor as well as the compound file format. Try different combinations to understand w

Business stop words?

2006-03-16 Thread Jeff Rodenburg
Does anyone have a lead on "business" stop words? Things like "inc", "llc", "md", etc. I'd rather not reinvent this wheel. :-) cheers, jeff

Index validation utility

2006-03-11 Thread Jeff Rodenburg
data types, etc. I'm working on this mostly for myself, but if anyone is interested just send me an email off-list. cheers, -- jeff r.

Re: Question

2006-03-07 Thread Jeff Rodenburg
We've done this, and it's not that complex. (Sorry, client won't allow me to release the code.) It's AJAX on the front end, so that background call is simply executing a search against an index that consists of the aggregated search terms. We do wildcard queries to get the results we want. For u

Re: Search on many indexes at once

2006-03-03 Thread Jeff Rodenburg
Raul - You'll want to look at the MultiSearcher and ParallelMultiSearcher classes for this. On 3/3/06, Raul Raja Martinez <[EMAIL PROTECTED]> wrote: > > Is it possible to search many indexes in one query and get back the Hits > ordered by relevance? > > Can someone point me out to some document o

Re: Hacking proximity search: looking for feedback

2006-03-01 Thread Jeff Rodenburg
Very good note, I missed that. I need the development environment in front of me to remember all the different class names correctly. ;-) -- j On 3/1/06, Doug Cutting <[EMAIL PROTECTED]> wrote: > > Jeff Rodenburg wrote: > > Following on the Range Query approach, how is per

Re: Hacking proximity search: looking for feedback

2006-03-01 Thread Jeff Rodenburg
FunctionQueries to influence your scores based on distance from the > center of the box. > > : > : Great feedback, thanks for the notes. > : > : -- jeff > : > : On 2/28/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: > : > > : > > : > : Geo d

Re: Hacking proximity search: looking for feedback

2006-02-28 Thread Jeff Rodenburg
the notes. -- jeff On 2/28/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: > > > : Geo definition: > : Boxing around a center point. It's not critical to do a radius search > with > : a given circle. A boxed approach allows for taller or wider frames of > : reference

Re: Hacking proximity search: looking for feedback

2006-02-28 Thread Jeff Rodenburg
component of relevance? We have a need for distance sorting, but I'm trying to slay that beast at a later stage. -- jeff On 2/28/06, Bryzek.Michael <[EMAIL PROTECTED]> wrote: > > Jeff - > > This is an interesting approach. On our end, we have experimented with > two va

Re: Hacking proximity search: looking for feedback

2006-02-28 Thread Jeff Rodenburg
el [mailto:[EMAIL PROTECTED] > Sent: Tuesday, February 28, 2006 2:49 PM > To: java-user@lucene.apache.org > Subject: RE: Hacking proximity search: looking for feedback > > Jeff - > > This is an interesting approach. On our end, we have experimented with > two variants: >

Hacking proximity search: looking for feedback

2006-02-28 Thread Jeff Rodenburg
ted approximately 145 clauses within the final constructed query. In validation testing, this approach has proven to be: 1) Accurate. 2) Performant (thus far). At last, my question to everyone who cares to respond (and read this far): feedback? Thanks, -- jeff

RE: Inappropriate content detection

2006-02-06 Thread Jeff Thorne
The site will have million+ posts. I am not familiar with Bayesian algorithms. Is there an off the shelf API that can provide this type of capability. As for performance would Bayesian be the way to go over Lucene? Thanks for the help, Jeff -Original Message- From: gekkokid [mailto

Re: Inappropriate content detection

2006-02-05 Thread Jeff Rodenburg
You can generate a token stream for a block of text without having to index it. Take a look at the highlighter code, it does this very thing. On 2/5/06, Jeff Thorne <[EMAIL PROTECTED]> wrote: > > I am trying to figure out whether or not Lucene is an appropriate solution > for a p

Inappropriate content detection

2006-02-05 Thread Jeff Thorne
to tackle this problem with Lucene or another api if doing so makes more sense? Thanks, Jeff

Re: How do I send search query to Multiple search Indexes ?

2006-02-02 Thread Jeff Rodenburg
Vikas - Start with the RemoteSearchable class. Technology will be RMI. Hope this helps. On 2/2/06, Vikas Khengare <[EMAIL PROTECTED]> wrote: > > Hi Friends > > How do I send one search query to multiple search Indexes which are > on remote machines ? > > Which Technology will help me (A

Re: Help with indexing and query strategy

2006-01-30 Thread Jeff Rodenburg
Have you considered evaluating doc-score thresholds for limiting your results? Since the perfect answers to these situations lie in the constant tweaking and twiddling of analysis and tokenization, one way I've found to help is to evaluate result scores. In your "Ontario CA" example, limiting res

Re: deleting duplicate documents from my index

2006-01-29 Thread Jeff Rodenburg
One way to do this (depending on your system and index size) is to remove and add every url you find. This would ensure that every document in the index is unique. No need to worry about sorting and iteration and doc_ids and the like. It rebuilds your entire index, but if you have a duplication

Lucene and geo queries

2006-01-04 Thread Jeff Rodenburg
I'm very interested in incorporating smart geographic querying capabilities (distance calcs are just scratching the surface) into Lucene and came across this whitepaper: http://www.clef-campaign.org/2005/working_notes/workingnotes2005/leidner05.pdf Just curious, has anyone ventured down this path

RE: best strategy to deal with large index file

2005-12-16 Thread Jeff Liang
field that should retrieve a lot of records, it normally throws the exception. I will look at MultiSearcher. Do you think splitting the index file based on a date field is a good choice? I somehow feel it requires a lot of coding to create many indexes based on a date field. Thanks,

best strategy to deal with large index file

2005-12-16 Thread Jeff Liang
index file? I start jvm with 800MB. thanks, Jeff

Re: ApacheCon next week

2005-12-12 Thread Jeff Rodenburg
Well done, Grant. Very informative. Question on Term Vectors: with their inclusion in an index, have you noticed any degradation in performance, either from a search efficiency or maintenance point-of-view? Given the power of term vectors, if the perf impact is negligible, I'm curious to the re

Re: How to do refined search based on attributes and never return zero results

2005-12-07 Thread Jeff Rodenburg
Check out Chris Hostetter's methodology for doing this at cnet. http://mail-archives.apache.org/mod_mbox/lucene-java-user/200508.mbox/[EMAIL PROTECTED] This sounds like it matches your requirements. cheers, j On 12/7/05, Ching-Pei Hsing <[EMAIL PROTECTED]> wrote: > > Has anyone solved the foll

Re: Distributed sort

2005-12-04 Thread Jeff Rodenburg
thanks Erik On 12/3/05, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > > On Dec 3, 2005, at 1:26 PM, Jeff Rodenburg wrote: > > > In one of the Google Labs whitepapers ( > > http://labs.google.com/papers/mapreduce-osdi04.pdf), a programming > > construct > >

Distributed sort

2005-12-03 Thread Jeff Rodenburg
In one of the Google Labs whitepapers ( http://labs.google.com/papers/mapreduce-osdi04.pdf), a programming construct known as MapReduce is used in a variety of jobs/tasks within Google's operation. As an example of the application of MapReduce, the whitepaper refers to Distributed Sorting. Essent

Re: lucene and database searching, keeping score

2005-12-02 Thread Jeff Rodenburg
George - There are a number of SQL Server specific ways you can do this. Email me off-list as the solution is not relevant to Lucene. -- j On 12/2/05, George Abraham <[EMAIL PROTECTED]> wrote: > > All, > I have created a Lucene index from data in a SQL Server db. When I conduct > a > Lucene sea

Re: A couple of questions regarding load balancing and failover

2005-11-30 Thread Jeff Rodenburg
On 11/30/05, Daniel Pfeifer <[EMAIL PROTECTED]> wrote: > > > 1.) Does Lucene's MultiSearcher implement some kind of automatic failover > and/or load-balancing mechanism if both Searchables which I supply in > MultiSearcher's constructor go to two different servers but to the very same > index-files?

Re: High CPU utilization with sort

2005-11-20 Thread Jeff Rodenburg
(especially for numeric fields). > > If you haven't already, you should compare the query times of a > "warmed" searcher. Sorted queries will still take longer, but I > haven't measured how much longer. > > -Yonik > Now hiring -- http://forms.cnet.com/slink?

High CPU utilization with sort

2005-11-20 Thread Jeff Rodenburg
've seen performance in terms of requests/second drop by a factor of 10, compared to similar tests executing only search requests (no sorts). CPU appears to be our bottleneck, and I'm trying to determine if this is expected behavior or if we're outside the bounds of typical performance. Thanks, jeff

Re: Items in multiple category: distinct search?

2005-11-15 Thread Jeff Rodenburg
Hi John - It sounds like you're thinking of your index in terms of sql constructs -- multiple rows for the same record. We do this very same thing with categories; if you have a record that lives in multiple categories, just add additional category field/value pairs for your original record. It's
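The approach described above, one record carrying several category values instead of one row per category, can be modeled in a few lines of plain Java. This is a hedged sketch rather than Lucene code: MultiCategorySketch, addDocument and search are hypothetical names, and the map is a toy inverted index from category value to document ids.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class MultiCategorySketch {
    // Toy inverted index: category value -> ids of the documents that carry it.
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // A document is added once; each category becomes another value of the same field.
    public void addDocument(int docId, String... categories) {
        for (String category : categories) {
            postings.computeIfAbsent(category, key -> new TreeSet<>()).add(docId);
        }
    }

    // OR-search over categories; a document matching several categories is returned once.
    public SortedSet<Integer> search(String... categories) {
        SortedSet<Integer> hits = new TreeSet<>();
        for (String category : categories) {
            hits.addAll(postings.getOrDefault(category, Collections.emptySortedSet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        MultiCategorySketch index = new MultiCategorySketch();
        index.addDocument(1, "books", "gifts");
        index.addDocument(2, "books");
        System.out.println(index.search("books", "gifts")); // prints [1, 2]
    }
}
```

Because document 1 exists only once, there is nothing to de-duplicate at query time, which is the point of not modeling the index as SQL-style rows.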

Re: Help with Search Java Code set up

2005-10-26 Thread Jeff Rodenburg
Kevin - Maybe I'm misunderstanding, but how is this not a BooleanQuery with two clauses? - j On 10/26/05, Kevin L. Cobb <[EMAIL PROTECTED]> wrote: > > I've been using Lucene happily for a couple of years now. But, this new > search functionality I'm trying to add is somewhat different than what

Re: MaxFieldLength or MaxFields?

2005-10-26 Thread Jeff Rodenburg
thanks Erik On 10/26/05, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > > On 26 Oct 2005, at 02:50, Jeff Rodenburg wrote: > > I'm considering building out an index that will flatten a data > > structure, > > such that some Document "A" will have

MaxFieldLength or MaxFields?

2005-10-25 Thread Jeff Rodenburg
I'm considering building out an index that will flatten a data structure, such that some Document "A" will have Fields 1, 2 and 3. Fields 1 and 2 are indexed/tokenized fields. Field 3 is indexed, and will contain many discrete values (up to possibly 5000). Couple of questions: 1. Does the DEFAULT_MA

Re: Using analyzers with term queries

2005-10-25 Thread Jeff Rodenburg
I don't mean to take the thread off-topic, but is this the recommended approach for any of the Query objects, i.e. SpanQuery or PhraseQuery? On 10/25/05, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > > On 25 Oct 2005, at 07:00, Rob Young wrote: > > I am using TermQuery s (and FuzzyQuery s) on the s

Re: Classifier4J and Lucene

2005-10-23 Thread Jeff Rodenburg
Sounds like you might have to consider both, if the first one doesn't solve your issue. A company field sounds like it's a single entry, i.e. one that can't be "spammed up" with multiple terms, i.e. "Oracle Oracle Oracle". It also sounds as if you're searching multiple fields, and that some fields

Re: Improving sort performance

2005-10-22 Thread Jeff Rodenburg
s of the > query to 0. > > So, (MyQuery, sorted by MyFunkySort), becomes > ((+MyQuery^0 MyFunctionQuery), sorted by score) > > -Yonik > Now hiring -- http://forms.cnet.com/slink?231706 > > On 10/22/05, Jeff Rodenburg <[EMAIL PROTECTED]> wrote: > > > > This

Re: Improving sort performance

2005-10-22 Thread Jeff Rodenburg
type of score you are trying to do, but maybe > FunctionQuery would help. > http://issues.apache.org/jira/browse/LUCENE-446 > > -Yonik > Now hiring -- http://forms.cnet.com/slink?231706 > > On 10/22/05, Jeff Rodenburg <[EMAIL PROTECTED]> wrote: > > > > I have

Improving sort performance

2005-10-22 Thread Jeff Rodenburg
ate index seems wasteful in this scenario, given the relative number of results to the overall size of the index. What are my options here? Thanks jeff

Re: RemoteSearchable woes

2005-10-12 Thread Jeff Rodenburg
I'll take the no-response as a "no". :-) On 10/11/05, Jeff Rodenburg <[EMAIL PROTECTED]> wrote: > > Anyone running RemoteSearchable? I'm on v1.4.3 and am using it just fine, > until I need to: > > 1) use a custom sort, or > 2) use something that ext

RemoteSearchable woes

2005-10-12 Thread Jeff Rodenburg
like these and found a crafty way to solve it? Thoughts, comments, suggestions? - jeff r.

Hitcollectors and remotesearchables

2005-10-10 Thread Jeff Rodenburg
so need to pass in a *HitCollector* implementation that subclasses UnicastRemoteObject, so that the callbacks can return to the original VM. So, if you can, it's considerably simpler and more efficient to use TopDocs-based search when you're working remotely." Is this still consider

Custom sort with multiple fields?

2005-10-09 Thread Jeff Rodenburg
lit them out in a string[] similar to the LIA example? cheers, jeff r.

Re: RemoteSearchable and sorting

2005-10-08 Thread Jeff Rodenburg
p the exceptions appropriately. -- j On 10/5/05, Rasik Pandey <[EMAIL PROTECTED]> wrote: > > Hi Jeff, > > Sorting needs access to an IndexReader so it can do Term lookups, and > I don't think there is a remote impl of IndexReader probably because, > among other reasons

Re: RemoteSearchable and sorting

2005-10-05 Thread Jeff Rodenburg
Thanks Rasik. If this is the case, why is this exposed in the API? Should the overloaded search method on ParallelMultiSearcher that takes a Sort object be removed? I'm using the 1.4.3 codebase. -j On 10/5/05, Rasik Pandey <[EMAIL PROTECTED]> wrote: > > Hi Jeff, > > Sor

RemoteSearchable and sorting

2005-10-05 Thread Jeff Rodenburg
Are there known limitations or issues with sorting and RemoteSearchable? I'm encountering problems attempting to sort through a MultiSearcher (ParallelMultiSearcher, actually). I'm using an array of RemoteSearchable objects as the Searchable[] source. If I change the source indexes to be local Inde

Suggestions for analysis

2005-09-21 Thread Jeff Rodenburg
ple, a search for "Wedgewood WA" would ideally not match "Wedgewood GA". I'm starting with the StandardAnalyzer and thinking of possibly extending it to carry in some of the business rules meant to come into play for tie-breakers. Comments appreciated. Thanks, jeff r.

Re: Sort by relevance+distance

2005-09-19 Thread Jeff Rodenburg
This is interesting, one I had not considered. Mark - are there any code samples that implement this approach? Or maybe something similar in approach? thanks, jeff On 9/19/05, mark harwood <[EMAIL PROTECTED]> wrote: > > I think the HitCollector approach was fine but needed

Re: Sort by relevance+distance

2005-09-18 Thread Jeff Rodenburg
I like Erik's suggestion here as a starting point. I would guess you might find some direction in the Scorer class, but I haven't gone through this in detail. Conceptually a sliding weight based on proximity sounds correct... -- jeff On Sep 18, 2005, at 3:39 PM, James Huang wrote:

Re: Is Lucene right for my app?

2005-09-18 Thread Jeff Rodenburg
plenty of support on this mailing list, but you can educate yourself much more effectively with that book. The authors lurk on this list. It's the cheapest consulting ($40) you can get. Cheers, jeff On 9/18/05, Kevin Stembridge <[EMAIL PROTECTED]> wrote: > > > Would Lucene

Re: Sort by relevance+distance

2005-09-18 Thread Jeff Rodenburg
trimming the post further: On 9/18/05, James Huang <[EMAIL PROTECTED]> wrote: > > >The problem is quite generic, I believe. What I like to do is similar to > LIA-ch6, i.e. to find a "good Chinese Hunan-style restaurant near me." I > prefer Hunan-style; however, if a good Hunan-style one is 12 m

Re: Stopping Duplicates

2005-09-17 Thread Jeff Rodenburg
indexed. This is an operational question, so the *best* way depends on your overall operation, as both of these approaches have consequences on index maintenance operations. Hope this helps. -- jeff On 9/17/05, Ben Gill <[EMAIL PROTECTED]> wrote: > > Hi, > > I am storing

Re: Hits issue or custom filter issue?

2005-09-14 Thread Jeff Rodenburg
Good call, Chris. I followed the BitSet comparison route and found that the custom filter was working exactly as it should, but *I* wasn't passing it correct data. Rookie mistake. Doh! I hate it when that happens. -- j On 9/13/05, Jeff Rodenburg <[EMAIL PROTECTED]> wrote: >

Re: Hits issue or custom filter issue?

2005-09-13 Thread Jeff Rodenburg
uals() then there's your problem. Will do the step-through following this manner and post the results. -- j : Date: Tue, 13 Sep 2005 17:22:49 -0700 > : From: Jeff Rodenburg <[EMAIL PROTECTED]> > : Reply-To: java-user@lucene.apache.org, [EMAIL PROTECTED] > : To: Chris Hoste

Re: Hits issue or custom filter issue?

2005-09-13 Thread Jeff Rodenburg
Might be the same issue, haven't been able to determine during a step-through on the code exec. You're right, no need to add a new FilteredQuery to the statement, just a search on combinedQuery with a new myCustomFilter. Unfortunately, no joy; same response. -- j On 9/13/05, Chris Hostetter <[E

Hits issue or custom filter issue?

2005-09-13 Thread Jeff Rodenburg
nitial thought is the problem lies in the custom filter I've created. myCustomFilter extends Filter, and I'm following the BitSet comparative example as found in the LIA book. I've done nothing in myCustomFilter regarding caching. I doubt this is a bug; more likely it's something I've overlooked. thanks, jeff r.

Is George Aroush still around?

2005-09-13 Thread Jeff Rodenburg
Mayday, mayday Has anyone had recent contact with George Aroush? He's presently managing the C# port of Lucene. Thanks, Jeff Rodenburg
