For a two-word wildcard query like name:john s*, can I avoid expanding s* for the entire name field?

2010-02-04 Thread Melissa Hao
Hi, I am wondering about wildcard queries that are more than one word, such as: name:john s*. Note: all terms are required (the default boolean operator is AND). I know that for the query name:s* the s* is expanded over all s* terms in the name field. For the "john s*" case, is it possibl
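To see why the expansion covers the whole field: a prefix query like s* is answered by enumerating the field's sorted term dictionary, while a required term such as name:john is intersected afterwards at the document level, so it does not by itself narrow the term enumeration. A minimal illustrative sketch (not the Lucene API; the term dictionary here is hypothetical):

```java
import java.util.*;

// Sketch: prefix expansion over a sorted term dictionary. Every term in
// the range [s, t) is a candidate, regardless of which documents also
// contain the required term "john".
public class PrefixExpansion {
    public static void main(String[] args) {
        // Hypothetical term dictionary for the "name" field.
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
            "adam", "john", "sam", "sarah", "smith", "steve", "tom"));

        // Expand s*: enumerate all terms starting with "s".
        SortedSet<String> expanded = terms.subSet("s", "t");
        System.out.println(expanded); // [sam, sarah, smith, steve]
    }
}
```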

Re: Analyzer for stripping non alpha-numeric characters?

2010-02-04 Thread Jason Rutherglen
Answering my own question... PatternReplaceFilter doesn't output multiple tokens... Which means messing with capture state... On Thu, Feb 4, 2010 at 2:16 PM, Jason Rutherglen wrote: > Transferred partially to solr-user... > > Steven, thanks for the reply! > > I wonder if PatternReplaceFilter can

Re: Analyzer for stripping non alpha-numeric characters?

2010-02-04 Thread Jason Rutherglen
Transferred partially to solr-user... Steven, thanks for the reply! I wonder if PatternReplaceFilter can output multiple tokens? I'd like to progressively strip the non-alphanums, for example output: apple!&* apple!& apple! apple On Thu, Feb 4, 2010 at 12:18 PM, Steven A Rowe wrote: > Hi Jaso
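The progressive stripping Jason describes can be sketched outside of any Lucene TokenFilter; a real filter would need to buffer the token (capture state) and emit one variant per increment, which is exactly what PatternReplaceFilter's one-token-in, one-token-out contract doesn't give you. A minimal stand-in:

```java
import java.util.*;

// Sketch of the desired output: from "apple!&*" emit each variant with
// one more trailing non-alphanumeric character removed.
public class ProgressiveStrip {
    static List<String> variants(String token) {
        List<String> out = new ArrayList<>();
        out.add(token);
        while (!token.isEmpty()
               && !Character.isLetterOrDigit(token.charAt(token.length() - 1))) {
            token = token.substring(0, token.length() - 1);
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(variants("apple!&*"));
        // [apple!&*, apple!&, apple!, apple]
    }
}
```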

RE: Unexpected Query Results

2010-02-04 Thread Steven A Rowe
On 02/04/2010 at 3:24 PM, Chris Hostetter wrote: > : Since phrase query terms aren't analyzed, you're getting exact > : matches > > quoted phrases passed to the QueryParser are analyzed -- but they are > analyzed as complete strings, so Analyzers that treat whitespace > specially may produce differne

RE: Unexpected Query Results

2010-02-04 Thread Chris Hostetter
: Since phrase query terms aren't analyzed, you're getting exact matches quoted phrases passed to the QueryParser are analyzed -- but they are analyzed as complete strings, so Analyzers that treat whitespace specially may produce different Terms than if the individual "words" were analyzed indiv

RE: Analyzer for stripping non alpha-numeric characters?

2010-02-04 Thread Steven A Rowe
Hi Jason, Solr's PatternReplaceFilter(ts, "\\P{Alnum}+$", "", false) should work, chained after an appropriate tokenizer. Steve On 02/04/2010 at 12:18 PM, Jason Rutherglen wrote: > Is there an analyzer that easily strips non alpha-numeric from the end > of a token? > >
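The pattern in Steve's suggestion, \P{Alnum}+$, matches one or more non-alphanumeric characters anchored at the end of the token. Its effect can be checked with plain java.util.regex, independent of Solr's filter:

```java
// String.replaceAll uses the same regex syntax as the pattern passed to
// PatternReplaceFilter, so it shows what gets stripped from each token.
public class StripTrailing {
    public static void main(String[] args) {
        System.out.println("apple!&*".replaceAll("\\P{Alnum}+$", "")); // apple
        System.out.println("foo-bar".replaceAll("\\P{Alnum}+$", ""));  // foo-bar (the "-" is not trailing)
        System.out.println("x++".replaceAll("\\P{Alnum}+$", ""));      // x
    }
}
```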

Re: Limiting search result for web search engine

2010-02-04 Thread Ian Lea
Mike, Documents do not get passed to Collectors in order of highest score. It is the job of the collector to gather the top-scoring docs, as is typically required, and implemented by TopScoreDocCollector for the most commonly used search method calls (according to the javadocs - read the javadocs!

[ANNOUNCE] Katta 0.6 released

2010-02-04 Thread Johannes Zillmann
Release 0.6 of Katta is now available. Katta - Lucene (or Hadoop Mapfiles or any content which can be split into shards) in the cloud. http://katta.sourceforge.net The key changes of the 0.6 release among dozens of bug fixes: - upgrade lucene to 3.0 - upgrade zookeeper to 3.2.2 - upgrade hadoop

RE: Unexpected Query Results

2010-02-04 Thread Steven A Rowe
Hi Jamie, Since phrase query terms aren't analyzed, you're getting exact matches for terms "было" and "время", but when you search for them individually, they are analyzed, and it is the analyzed query terms that fail to match against the indexed terms. Sounds to me like your index-time and qu
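Steve's diagnosis — an index-time/query-time analysis mismatch — can be simulated without Lucene. In this toy sketch the index holds the raw terms, so an exact (unanalyzed) lookup matches, while a single-term query run through a hypothetical query-time analyzer produces a term that no longer exists in the index (English stand-ins are used in place of the Cyrillic terms):

```java
import java.util.*;

// Toy simulation, not Lucene code: the query-time analyzer transforms
// the term, but the index-time chain never applied that transform.
public class AnalysisMismatch {
    // Hypothetical query-time analyzer: lowercases and strips a final
    // vowel, standing in for a stemmer missing from the index-time chain.
    static String analyze(String term) {
        String t = term.toLowerCase(Locale.ROOT);
        return t.replaceAll("[aeiou]$", "");
    }

    public static void main(String[] args) {
        Set<String> index = new HashSet<>(Arrays.asList("time", "past"));

        System.out.println(index.contains("time"));          // true: exact term matches
        System.out.println(index.contains(analyze("time"))); // false: "tim" was never indexed
    }
}
```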

Re: Limiting search result for web search engine

2010-02-04 Thread mpolzin
Ian, Yes, this makes sense. My guess is that creating a custom collector and, in my overridden collect() method, looking up each document by docid to get the base URL is going to create a fairly significant performance hit. And from the sounds of your response there is no guarantee that the d

Analyzer for stripping non alpha-numeric characters?

2010-02-04 Thread Jason Rutherglen
Is there an analyzer that easily strips non alpha-numeric from the end of a token? - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Where to download Mark Miller's Qsol Parser?

2010-02-04 Thread Mark Miller
Chris Harris wrote: > The QSol query parser (brief overview here: > http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/) > used to be available at > > http://myhardshadow.com/qsol.php > > (there was documentation as well as a link to a SVN server) but it > looks like the myhard

Re: org.apache.lucene.util.English

2010-02-04 Thread Simon Willnauer
Suraj, TestSpellChecker and English are part of the Lucene test sources. If you download the source distribution or check out the 2.9.1 tag you will be able to set it up in your IDE to look at the tests, compile and run them. In 3.1 English will be part of the core jar afaik. simon On Thu, Feb 4

Re: org.apache.lucene.util.English

2010-02-04 Thread Suraj Parida
Simon, Do you mean org.apache.lucene.util.English is not part of 2.9.1? Actually I was trying to add a spell checker to my application's search, and referred to that class to learn how to use it. As it is difficult to find examples, I thought it was the best place to see sample code :) Re

Re: org.apache.lucene.util.English

2010-02-04 Thread Simon Willnauer
You are referring to a test case which is not included in any artifact. What are you trying to do with this class? simon On Thu, Feb 4, 2010 at 12:48 PM, Suraj Parida wrote: > > Hi, > > > lucene-2.9.1-src\lucene-2.9.1\contrib\spellchecker\src\test\org\apache\lucene\search\spell > >         has

org.apache.lucene.util.English

2010-02-04 Thread Suraj Parida
Hi, lucene-2.9.1-src\lucene-2.9.1\contrib\spellchecker\src\test\org\apache\lucene\search\spell has a file TestSpellChecker.java. Please tell me which jar file it uses; I can't find the jar. Regards Suraj -- View this message in context: http://old.nabble.com/org.apache.lucen

Re: Unexpected Query Results

2010-02-04 Thread Ian Lea
There is no pseudo-field for all search terms. Two common practices are to use MultiFieldQueryParser or to add a catch-all field; I tend to do the latter. At a glance I'd agree that the second query should also return 48 hits. Maybe a small self-contained test case or standalone program would be
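The catch-all field Ian mentions is built at index time by concatenating the searchable fields into one extra field, so a single default field covers them all. A minimal sketch with hypothetical field names:

```java
import java.util.*;

// Sketch of building a catch-all field: join the values of the other
// fields into one string and index it as an additional field.
public class CatchAll {
    static String catchAll(Map<String, String> doc) {
        return String.join(" ", doc.values());
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "Lucene in Action");
        doc.put("author", "Gospodnetic");
        System.out.println(catchAll(doc)); // Lucene in Action Gospodnetic
    }
}
```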

Getting an incorrect spatial search - lucene 2.9.1 and 3.0

2010-02-04 Thread Julian Atkinson
Hi everyone, I've been using lucene spatial for the last few months without noticing any particular issues with the results...until now. I'm posting 2 unit tests to demonstrate the issue - the first based on 2.9.1 and the other in 3.0 Could be I'm missing something obvious and would appreciate a

Re: Limiting search result for web search engine

2010-02-04 Thread Ian Lea
Writing a custom collector is pretty straightforward. There is an example in the javadocs for Collector. Use it via Searcher.search(query, collector) or search(query, filter, collector). The docid is passed to the collect() method and you can use that to get at the document and thus the URL, via
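The shape of the pattern Ian describes can be sketched with a simplified stand-in (the real org.apache.lucene.search.Collector API has more to it): the search loop hands each matching docid to collect(), and the collector resolves the docid to a base URL (a hypothetical lookup table here) and caps the hits kept per site — the limiting this thread is after.

```java
import java.util.*;

// Simplified stand-in for a per-site limiting collector; docid-to-URL
// resolution is faked with an array in place of a stored-field lookup.
public class SiteLimitCollector {
    static final String[] URL_BY_DOC = {
        "a.com", "a.com", "a.com", "b.com", "b.com"};

    static List<Integer> collectLimited(int[] hits, int perSite) {
        Map<String, Integer> counts = new HashMap<>();
        List<Integer> kept = new ArrayList<>();
        for (int docId : hits) {
            String site = URL_BY_DOC[docId]; // the per-doc lookup Mike worries about
            int n = counts.merge(site, 1, Integer::sum);
            if (n <= perSite) kept.add(docId);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(collectLimited(new int[] {0, 1, 2, 3, 4}, 2));
        // [0, 1, 3, 4]  (the third a.com hit, doc 2, is dropped)
    }
}
```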