date:20100203

Unexpected Query Results

2010-02-03 Thread Jamie

Hi I have some unexpected query results. When attempting two queries: 1) All fields, exact phrase query returns 48 hits (priority:"было время" attach:"было время" score:"было время" size:"было время" sentdate:"было время" archivedate:"было время" receiveddate:"было время" from:"было время" t

Retrieving field information for each hit when using "MultiFieldQueryParser"

2010-02-03 Thread prashant ullegaddi

Hi, I'm using MultiFieldQueryParser to search over different fields of documents in the index. Whenever I get a hit for a query, is it possible to know in which field the query match occurred? And is it possible to retrieve the field(s) for each hit? To make things clearer, suppose I have four fi

Re: Limiting search result for web search engine

2010-02-03 Thread mpolzin

I changed one line below: I realized I had missed the ! (NOT). It is corrected in the original reply. if ((hq.Size() < numHits || score >= minScore) && !collectedBaseURLArray.Contains(doc.BaseURL)) { mpolzin wrote: > > > if (score > 0.0f) > { > >
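
A minimal sketch of the roll-up idea from this thread (show at most one hit per base URL), written as plain Java rather than as a Lucene collector. The `Hit` class and the `rollUp` method are hypothetical names used only for illustration. Unlike the in-collector check quoted above, this version is a post-processing step that sorts by score first, so the best-scoring hit of each site is the one that survives.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BaseUrlRollup {
    /** Hypothetical hit record; this is not a Lucene type. */
    static final class Hit {
        final String baseUrl;
        final float score;

        Hit(String baseUrl, float score) {
            this.baseUrl = baseUrl;
            this.score = score;
        }
    }

    /** Returns at most numHits hits, best score first, one per base URL. */
    static List<Hit> rollUp(List<Hit> hits, int numHits) {
        // Sort a copy by descending score so the first hit seen per site is its best one.
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());

        Set<String> seenBaseUrls = new HashSet<>();
        List<Hit> rolledUp = new ArrayList<>();
        for (Hit hit : sorted) {
            if (rolledUp.size() >= numHits) {
                break;
            }
            // Set.add returns false when the base URL was already collected.
            if (seenBaseUrls.add(hit.baseUrl)) {
                rolledUp.add(hit);
            }
        }
        return rolledUp;
    }
}
```

Doing the roll-up inside a collector, as in the quoted code, avoids holding every hit in memory, but then the first hit collected for a site is kept even if a better one arrives later.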

Re: Limiting search result for web search engine

2010-02-03 Thread mpolzin

Hi, thanks for the suggestion. I am relatively new to Lucene, so I have a few more questions on this implementation. I looked at the source code for Lucene and found the TopDocCollector class. It appears this class derives from the HitCollector class, so I should be able to simply extend TopDocColl

Re: Limiting search result for web search engine

2010-02-03 Thread mpolzin

Hi, thanks for the suggestion. I am relatively new to Lucene, so I have a few more questions on this implementation. I looked at the source code for Lucene and found the TopDocCollector class. It appears this class derives from the HitCollector class, so I should be able to simply extend TopDocColl

Where to download Mark Miller's Qsol Parser?

2010-02-03 Thread Chris Harris

The QSol query parser (brief overview here: http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/) used to be available at http://myhardshadow.com/qsol.php (there was documentation as well as a link to an SVN server), but it looks like myhardshadow.com has been relinquished t

Match span of capitalized words

2010-02-03 Thread Max Lynch

Hi, I would like to do a search for "Microsoft Windows" as a span, but not match if words before or after "Microsoft Windows" are upper cased. For example, I want this to match: "another crash for Microsoft Windows today". But not this: "another crash for Microsoft Windows Server today". Is this possib

Re: Sort memory usage

2010-02-03 Thread Jake Mannix

On Wed, Feb 3, 2010 at 1:33 PM, tsuraan wrote: > > The FieldCache loads per segment, and the NRT reader is reloading only > > new segments from disk, so yes, it's "smarter" about this caching in this > > case. > > Ok, so the cache is tied to the index, and not to any particular > reader. The act

Re: Sort memory usage

2010-02-03 Thread tsuraan

> The FieldCache loads per segment, and the NRT reader is reloading only > new segments from disk, so yes, it's "smarter" about this caching in this > case. Ok, so the cache is tied to the index, and not to any particular reader. The actual FieldCacheImpl keeps a mapping from Reader to its terms,

Re: Sort memory usage

2010-02-03 Thread Jake Mannix

The FieldCache loads per segment, and the NRT reader is reloading only new segments from disk, so yes, it's "smarter" about this caching in this case. -jake On Wed, Feb 3, 2010 at 1:07 PM, tsuraan wrote: > Is the cache used by sorting on strings separated by reader, or is it > a global thing?
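
A rough sketch of why per-segment caching helps with frequent reopens, in plain Java with no Lucene dependency. Everything here is a stand-in: the class name, the segment names, and the map keyed by segment name are assumptions for illustration, not how FieldCacheImpl is actually written. It only shows that when entries are keyed per segment, reopening pays the load cost for new segments only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PerSegmentSortCache {
    // Hypothetical cache: one entry of sort terms per segment name.
    private final Map<String, String[]> termsBySegment = new HashMap<>();
    private int loads = 0;

    /** Returns the cached terms for a segment, loading them only on first use. */
    String[] termsFor(String segmentName) {
        return termsBySegment.computeIfAbsent(segmentName, name -> {
            loads++;
            // Stand-in for reading the segment's sort terms from disk.
            return new String[] { name + ":terms" };
        });
    }

    /** Simulates a reader over the given segments touching the cache for each one. */
    void warm(List<String> segments) {
        for (String segment : segments) {
            termsFor(segment);
        }
    }

    /** Number of times terms were actually loaded. */
    int loadCount() {
        return loads;
    }
}
```

Warming with segments `_0` and `_1`, then again with `_0`, `_1`, and `_2`, performs three loads in total rather than five, because the two unchanged segments are already cached.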

Sort memory usage

2010-02-03 Thread tsuraan

Is the cache used by sorting on strings separated by reader, or is it a global thing? I'm trying to use the near-realtime search, and I have a few indices with a million docs apiece. If I'm opening a new reader every minute, am I going to have every term in every sort field read into RAM for each

Re: Sort and Collector

2010-02-03 Thread tsuraan

> It's not really possible. > Lucene must iterate over all of the hits before it knows for sure that > it has the top sorted by any criteria (other than docid). > A Collector is called for every hit as it happens, and thus one can't > specify a sort order (sorting itself is actually implemented wit
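
To illustrate the point quoted above, here is a small plain-Java sketch of top-N selection over a stream of hits. It is not Lucene code: `TopNBySortKey` and the use of bare `int` sort keys are assumptions made for the example. Each hit is visited once, as in a collector callback, and a bounded heap keeps the N smallest keys seen so far. Because the best key may be the last one to arrive, the loop cannot stop early.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class TopNBySortKey {
    /** Returns the n smallest sort keys in ascending order, visiting every hit once. */
    static List<Integer> topN(int[] sortKeys, int n) {
        // Max-heap: the largest of the currently kept keys sits on top, ready for eviction.
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (int key : sortKeys) {      // one iteration per hit, in arrival (docid) order
            heap.offer(key);
            if (heap.size() > n) {
                heap.poll();            // drop the worst key so only n remain
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        Collections.sort(result);
        return result;
    }
}
```

The heap keeps memory bounded by N, but the work is still proportional to the total number of hits.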

Re: Sort and Collector

2010-02-03 Thread Yonik Seeley

On Wed, Feb 3, 2010 at 1:40 PM, tsuraan wrote: > Is there any way to run a search where I provide a Query, a Sort, and > a Collector? I have a case where it is sometimes, but rarely, > necessary to get all the results from a query, but usually I'm > satisfied with a smaller amount. That part I c

Sort and Collector

2010-02-03 Thread tsuraan

Is there any way to run a search where I provide a Query, a Sort, and a Collector? I have a case where it is sometimes, but rarely, necessary to get all the results from a query, but usually I'm satisfied with a smaller amount. That part I can do with just a query and a collector, but I'd like th

Re: Index corruption using Lucene 2.4.1 - thread safety issue?

2010-02-03 Thread Frank Geary

For the record - I haven't proven this yet - but here's my current theory of what is causing the problem: 1) We start with a new RAMDir IW[0] and do some deletes and adds. 2) We create at least one IndexReader based on that IW. The last of which we'll call IndexReader[A]. 3) Then we switch to usi

Re: Limiting search result for web search engine

2010-02-03 Thread Hayri

Mike Polzin wrote: I am working on building a web search engine and I would like to build a results page similar to what Google does. The functionality I am looking to include is what I refer to as "rolling up" sites, meaning that even if a particular site (defined by its base URL) has many relevant

RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?

2010-02-03 Thread Uwe Schindler

Just add the field a second time with Field.Store.YES and Field.Index.NO in original case. For searching ad using the Tokenizer approach as described before using the TokenStream. Internally this is handled exactly like this (if you enable both Field.Index.ANALYZED and Field.Store.YES). -

RE: confused by the lucene boolean query with wildcard result

2010-02-03 Thread java8964 java8964

Thanks for your help. I upgraded Lucene to 2.9.1 and the problem is gone. It looks like a boolean query bug in Lucene 2.9.0 that was fixed in 2.9.1. Thanks > From: ian@gmail.com > Date: Wed, 3 Feb 2010 10:02:27 + > Subject: Re: confused by the lucene boolean query with wildcard result

RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?

2010-02-03 Thread java8964 java8964

Thanks for your help. My concern now is that the field could be defined as stored. So when the user receives the field data, we still want to show the original data, in upper case in this case. First, I don't think I can use queryParser.SetLowercaseExpandedTerms(false), which will remove the wi

Re: Searching compressed text using CompressionTools

2010-02-03 Thread Ian Lea

Are you saying that by using compression your index size goes up by a factor of more than 1024? From about 10 kilobytes to 12 megabytes? Compressing small fields can cause the index to get bigger rather than smaller, but obviously not by that much. -- Ian. On Wed, Feb 3, 2010 at 11:01 AM, Suraj Pari
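
The remark that compressing small fields can make things bigger is easy to check with `java.util.zip.Deflater` from the JDK, which, as far as I know, is also what Lucene's CompressionTools uses underneath. The sketch below is a standalone illustration under that assumption; `SmallFieldCompression` and `deflatedLength` are made-up names and the inputs are arbitrary.

```java
import java.util.zip.Deflater;

class SmallFieldCompression {
    /** Returns the number of bytes produced by deflating the input at best compression. */
    static int deflatedLength(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[64];
            int total = 0;
            // Drain the compressor in chunks; only the byte count matters here.
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release native resources
        }
    }
}
```

For a three-byte value such as "abc", the zlib header and checksum alone outweigh the payload, so the deflated length is larger than the input, while a long repetitive value shrinks considerably. That overhead is a handful of bytes per value, which fits the point above that it cannot explain a jump from kilobytes to megabytes.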

Re: Searching compressed text using CompressionTools

2010-02-03 Thread Suraj Parida

Ian, Small correction made ... Thanks for solving my previous problems. Now I tested the compression with 100 docs and found: 1. Without Compression size of FS directory (on disk) = 10.8 KB 2. With Compression size of FS directory (on disk) = 12.0 MB and with 500 docs: 1. Without Compres

Re: Searching compressed text using CompressionTools

2010-02-03 Thread Suraj Parida

Ian, Thanks for solving my previous problems. Now I tested the compression with 100 docs and found: 1. With Compression size of FS directory (on disk) = 10.8 KB 2. Without Compression size of FS directory (on disk) = 12.0 MB and with 500 docs: 1. With Compression size of FS directory (on

RE: Getting DF & IDF

2010-02-03 Thread Asif Nawaz

In the HotelDatabase project for Lucene, the following code is written in the performSearch method of the SearchEngine class. Let queryString = "Located in the heart of paris" Analyzer analyzer = new StandardAnalyzer(); IndexSearcher is = new IndexSearcher("index"); QueryParser parser = new QueryParser("content

Lucene User Group Meetup in Amsterdam

2010-02-03 Thread Uri Boness

Hi All, On 17th February we'll host the first Dutch Lucene User Group Meetup. This meet-up will be split into two parts: - The first part will be dedicated to the user group itself. We'll have an introduction to the members and have an open discussion about the goals of the user group and th

RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?

2010-02-03 Thread Uwe Schindler

For specific fields using a special TokenStream chain, there is no need to write a separate analyzer. You can add fields to a document using a TokenStream as parameter: new Field(name, TokenStream). As the TokenStream, just create a chain from a Tokenizer and all filters, like: TokenStream ts = new Key

Re: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?

2010-02-03 Thread Ian Lea

I think you'll have to write your own. Or just downcase the text yourself first. -- Ian. On Tue, Feb 2, 2010 at 9:30 PM, java8964 java8964 wrote: > > Is there an analyzer in Lucene like the keyword analyzer, but one that also lowercases the data? Or do I have to write a custom analyzer myself? > >

Re: confused by the lucene boolean query with wildcard result

2010-02-03 Thread Ian Lea

You should probably be using your PerFieldAnalyzerWrapper in your calls to QueryParser but apart from that I can't see any obvious reason. General advice: use Luke to check what has been indexed and read http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3

Re: How further reward documents matching more query terms?

2010-02-03 Thread Ian Lea

If you read the javadocs and source for DefaultSimilarity you'll know as much about it as I do, and see what the default is. To customize it, write your own subclass as I said before. -- Ian. On Tue, Feb 2, 2010 at 7:56 PM, Phan The Dai wrote: > Dear Lan Lea, > Thanks much for your reply. > P
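
As a sketch of what such a customization might look like, the plain-Java functions below mirror the coordination factor, which to my knowledge DefaultSimilarity computes as overlap / maxOverlap in `coord(int overlap, int maxOverlap)`. `CoordSketch` and `steeperCoord` are hypothetical names, and squaring the ratio is just one arbitrary way to reward documents that match more query terms. In a real application the same arithmetic would go into an overridden `coord` in a DefaultSimilarity subclass.

```java
class CoordSketch {
    /** Fraction of query terms matched, as in the default coordination factor. */
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    /** Hypothetical variant: squaring penalizes partial matches more strongly. */
    static float steeperCoord(int overlap, int maxOverlap) {
        float ratio = defaultCoord(overlap, maxOverlap);
        return ratio * ratio;
    }
}
```

With two of four terms matched, the default factor is 0.5 and the squared variant is 0.25, while a full match stays at 1.0 in both, so documents matching all terms gain relative to partial matches.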

28 matches
