Hi
I have some unexpected query results.
When attempting two queries:
1) All fields, exact phrase query returns 48 hits
(priority:"было время" attach:"было время" score:"было время" size:"было время" sentdate:"было время" archivedate:"было время" receiveddate:"было время" from:"было время" t
Hi,
I'm using MultiFieldQueryParser to search over different fields of documents
in the
index. Whenever I get a hit for a query, is it possible to know in which
field the query
match occurred? And is it possible to retrieve the field(s) for each hit?
To make things clearer, suppose I have four fi
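Lucene does not report the matching field on a hit directly. Two common workarounds are calling IndexSearcher.explain() per hit and reading field names out of the explanation, or re-running the parsed query once per field and recording which fields produce the hit. A minimal pure-Java sketch of the per-field idea — the Map of field values and the containment check are stand-ins for a real index and query, not Lucene API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy illustration of the "query each field separately" workaround:
// given one hit document's field values, record which fields contain
// the phrase. Real code would run the parsed query per field instead.
public class FieldMatchSketch {
    public static List<String> matchingFields(Map<String, String> fields, String phrase) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            // Case-insensitive containment stands in for a real per-field query.
            if (e.getValue().toLowerCase().contains(phrase.toLowerCase())) {
                matched.add(e.getKey());
            }
        }
        return matched;
    }
}
```

This is O(fields) extra queries per hit, so it is usually only worth doing for the page of hits actually displayed.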
I changed one line below... I realized I missed the ! (NOT)... corrected in the
original reply.
if ((hq.Size() < numHits || score >= minScore) &&
!collectedBaseURLArray.Contains(doc.BaseURL))
{
mpolzin wrote:
>
>
> if (score > 0.0f)
> {
>
>
Hi thanks for the suggestion. I am relatively new to Lucene, so I have a few
more questions on this implementation. I looked at the source code for
Lucene and found the TopDocCollector class. It appears this class derives
from the HitCollector class, so I should be able to simply extend
TopDocColl
The QSol query parser (brief overview here:
http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/)
used to be available at
http://myhardshadow.com/qsol.php
(there was documentation as well as a link to an SVN server) but it
looks like myhardshadow.com has been relinquished t
Hi,
I would like to do a search for "Microsoft Windows" as a span, but not match
if words before or after "Microsoft Windows" are upper cased.
For example, I want this to match: another crash for Microsoft Windows today
But not this: another crash for Microsoft Windows Server today
Is this possib
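SpanQuery clauses compare analyzed tokens, and the original casing is usually gone by that point, so this is hard to express as a pure span query. One workaround, assuming the original text is available (for example as a stored field), is to post-filter candidate hits with a case-sensitive check on the neighboring words. A sketch — the phrase and the surrounding-word tests are illustrative, not Lucene API:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Post-filter: accept a hit only if some occurrence of "Microsoft Windows"
// is neither immediately preceded nor followed by another capitalized word.
public class CaseFilterSketch {
    private static final Pattern PHRASE = Pattern.compile("Microsoft Windows");

    public static boolean accepts(String text) {
        Matcher m = PHRASE.matcher(text);
        while (m.find()) {
            String before = text.substring(0, m.start());
            String after = text.substring(m.end());
            // True when the nearest word on that side starts with a capital.
            boolean upperBefore = before.matches(".*\\b[A-Z]\\w*\\s*$");
            boolean upperAfter = after.matches("^\\s*[A-Z]\\w*\\b.*");
            if (!upperBefore && !upperAfter) {
                return true; // this occurrence passes the case check
            }
        }
        return false;
    }
}
```

Run the normal (case-insensitive) phrase query first to get candidates cheaply, then apply this filter only to the stored text of the hits.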
On Wed, Feb 3, 2010 at 1:33 PM, tsuraan wrote:
> > The FieldCache loads per segment, and the NRT reader is reloading only
> > new segments from disk, so yes, it's "smarter" about this caching in this
> > case.
>
> Ok, so the cache is tied to the index, and not to any particular
> reader. The act
> The FieldCache loads per segment, and the NRT reader is reloading only
> new segments from disk, so yes, it's "smarter" about this caching in this
> case.
Ok, so the cache is tied to the index, and not to any particular
reader. The actual FieldCacheImpl keeps a mapping from Reader to its
terms,
The FieldCache loads per segment, and the NRT reader is reloading only
new segments from disk, so yes, it's "smarter" about this caching in this
case.
-jake
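The per-segment behaviour described above can be illustrated with a toy cache keyed by segment name: entries for unchanged segments are reused when a reader is reopened, so only new segments trigger a load from disk. This is just the caching idea, not Lucene's actual FieldCacheImpl:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy per-segment cache: values are loaded once per segment and reused
// when a "reopened reader" shares segments with the previous one.
public class SegmentCacheSketch {
    private final Map<String, String[]> cache = new HashMap<>();
    public int loads = 0; // counts actual loads from "disk"

    public String[] getValues(String segmentName) {
        return cache.computeIfAbsent(segmentName, seg -> {
            loads++;                      // only runs on a cache miss
            return new String[] { seg };  // stand-in for the field values
        });
    }

    // Simulate warming the sort cache for a reader = a list of segments.
    public void warm(List<String> segments) {
        for (String seg : segments) {
            getValues(seg);
        }
    }
}
```

With an NRT reopen every minute, only the small new segments pay the load cost; the large old segments keep their cached entries until they are merged away.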
On Wed, Feb 3, 2010 at 1:07 PM, tsuraan wrote:
> Is the cache used by sorting on strings separated by reader, or is it
> a global thing?
Is the cache used by sorting on strings separated by reader, or is it
a global thing? I'm trying to use the near-realtime search, and I
have a few indices with a million docs apiece. If I'm opening a new
reader every minute, am I going to have every term in every sort field
read into RAM for each
> It's not really possible.
> Lucene must iterate over all of the hits before it knows for sure that
> it has the top sorted by any criteria (other than docid).
> A Collector is called for every hit as it happens, and thus one can't
> specify a sort order (sorting itself is actually implemented wit
On Wed, Feb 3, 2010 at 1:40 PM, tsuraan wrote:
> Is there any way to run a search where I provide a Query, a Sort, and
> a Collector? I have a case where it is sometimes, but rarely,
> necessary to get all the results from a query, but usually I'm
> satisfied with a smaller amount. That part I c
Is there any way to run a search where I provide a Query, a Sort, and
a Collector? I have a case where it is sometimes, but rarely,
necessary to get all the results from a query, but usually I'm
satisfied with a smaller amount. That part I can do with just a query
and a collector, but I'd like th
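In Lucene 2.9 the usual answer is a TopFieldCollector created from the Sort and passed to IndexSearcher.search(Query, Collector) — worth checking against your version's javadocs. The underlying idea, keeping the top n under a sort order while still seeing every hit, is just a bounded priority queue; a self-contained sketch with integer doc ids standing in for hits:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// A collector that is called for every hit but retains only the
// top n according to an arbitrary comparator (the "Sort").
public class SortedCollectorSketch {
    private final int n;
    private final Comparator<Integer> sort;
    private final PriorityQueue<Integer> queue; // worst retained element on top
    public int totalHits = 0;

    public SortedCollectorSketch(int n, Comparator<Integer> sort) {
        this.n = n;
        this.sort = sort;
        this.queue = new PriorityQueue<>(n, sort.reversed());
    }

    public void collect(int doc) {
        totalHits++;           // every hit is seen, even beyond the top n
        queue.offer(doc);
        if (queue.size() > n) {
            queue.poll();      // drop the worst, keeping the top n
        }
    }

    public List<Integer> topDocs() {
        List<Integer> out = new ArrayList<>(queue);
        out.sort(sort);
        return out;
    }
}
```

Because every hit passes through collect(), you can also count or side-collect everything while still returning only a small sorted slice.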
For the record - I haven't proven this yet - but here's my current theory of
what is causing the problem:
1) We start with a new RAMDir IW[0] and do some deletes and adds.
2) We create at least one IndexReader based on that IW. The last of which
we'll call IndexReader[A].
3) Then we switch to usi
Mike Polzin wrote:
I am working on building a web search engine and I would like to build a results page similar to what Google does. The functionality I am looking to include is what I refer to as "rolling up" sites, meaning that even if a particular site (defined by its base URL) has many relevant
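The "rolling up" described above amounts to keeping only the best-scoring hit per base URL, much like the collectedBaseURLArray.Contains check in the earlier code. A minimal sketch, with hypothetical (baseUrl, score) pairs standing in for Lucene hits that are already ordered by descending score:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Keep only the highest-scoring hit per base URL ("rolling up" a site).
public class RollupSketch {
    public static List<String> rollup(List<String[]> hits) {
        // hits are {baseUrl, score} pairs in descending score order,
        // so the first hit seen for a URL is its best one.
        Map<String, String> best = new LinkedHashMap<>();
        for (String[] hit : hits) {
            best.putIfAbsent(hit[0], hit[1]);
        }
        return new ArrayList<>(best.keySet());
    }
}
```

A real implementation would keep the duplicate hits around too, so the UI can offer a "more results from this site" link the way Google does.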
Just add the field a second time with Field.Store.YES and Field.Index.NO in the
original case. For searching, add it using the Tokenizer approach described
before, using the TokenStream.
Internally this is handled exactly like this (if you enable both
Field.Index.ANALYZED and Field.Store.YES).
-
Thanks for your help.
I upgraded Lucene to 2.9.1 and the problem is gone. It looks like a boolean
query bug in Lucene 2.9.0 that was fixed in 2.9.1.
Thanks
> From: ian@gmail.com
> Date: Wed, 3 Feb 2010 10:02:27 +
> Subject: Re: confused by the lucene boolean query with wildcard result
Thanks for your help.
My concern now is that the field could be defined as stored. So when the user
receives the field data, we still want to show the original data, in upper case
in this case.
First, I don't think I can use queryParser.SetLowercaseExpandedTerms(false),
which will remove the wi
Are you saying that by using compression your index size goes up by a
factor of more than 1024? From ~10 kilobytes to 12 megabytes?
Compressing small fields can cause the index to get bigger rather than
smaller but obviously not by that much.
--
Ian.
On Wed, Feb 3, 2010 at 11:01 AM, Suraj Pari
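The overhead Ian mentions is easy to demonstrate with java.util.zip (DEFLATE, which, as far as I recall, is what Lucene's old compressed-field support used): compressing a few bytes yields more bytes than went in, because the header and checksum dominate tiny inputs. It cannot explain a 1000x blow-up, though, which suggests something else is going on. A sketch:

```java
import java.util.zip.Deflater;

// Show that DEFLATE adds fixed overhead that dominates tiny inputs,
// while large redundant inputs still shrink dramatically.
public class CompressOverheadSketch {
    public static int compressedSize(byte[] input) {
        Deflater d = new Deflater();
        d.setInput(input);
        d.finish();
        byte[] buf = new byte[1024];
        int total = 0;
        while (!d.finished()) {
            total += d.deflate(buf); // accumulate the compressed length
        }
        d.end();
        return total;
    }
}
```

So compressing many tiny stored fields can make an index slightly larger, but a KB-to-MB jump points at something like the same data being added repeatedly.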
Ian,
Small correction made ...
Thanks for solving my previous problems.
Now I tested the compression with 100 docs and found:
1. Without Compression size of FS directory (on disk)= 10.8 KB
2. With Compression size of FS directory (on disk) = 12.0 MB
and with 500 docs:
1. Without Compres
Ian,
Thanks for solving my previous problems.
Now I tested the compression with 100 docs and found:
1. With Compression size of FS directory (on disk)= 10.8 KB
2. Without Compression size of FS directory (on disk) = 12.0 MB
and with 500 docs:
1. With Compression size of FS directory (on
In the HotelDatabase project of Lucene, the following code is written in the
performSearch method of the SearchEngine class.
Let queryString = "Located in the heart of paris"
Analyzer analyzer = new StandardAnalyzer();
IndexSearcher is = new IndexSearcher("index");
QueryParser parser = new QueryParser("content
Hi All,
On 17th February we'll host the first Dutch Lucene User Group Meetup.
This meet-up will be split into two parts:
- The first part will be dedicated to the user group itself. We'll have
an introduction to the members and have an open discussion about the
goals of the user group and th
For specific fields using a special TokenStream chain, there is no need to
write a separate analyzer. You can add fields to a document using a TokenStream
as a parameter: new Field(name, TokenStream).
As the TokenStream, just create a chain from a Tokenizer and any Filters, like:
TokenStream ts = new Key
I think you'll have to write your own. Or just downcase the text
yourself first.
--
Ian.
On Tue, Feb 2, 2010 at 9:30 PM, java8964 java8964 wrote:
>
> Is there an analyzer like the keyword analyzer, but that will also lowercase
> the data in Lucene? Or do I have to write a custom analyzer myself?
>
>
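In Lucene this is normally a KeywordTokenizer wrapped in a LowerCaseFilter inside a small custom Analyzer — worth checking the exact class names against your version's javadocs. The net effect is simple enough to state in plain Java: the entire field value becomes one lowercased token:

```java
import java.util.Locale;

// Net effect of KeywordTokenizer + LowerCaseFilter: the whole field
// value is emitted as a single, lowercased token.
public class LowercaseKeywordSketch {
    public static String analyze(String fieldValue) {
        return fieldValue.toLowerCase(Locale.ROOT);
    }
}
```

Remember to apply the same lowercasing at query time (or use the same analyzer in the QueryParser), or exact-match lookups will silently miss.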
You should probably be using your PerFieldAnalyzerWrapper in your
calls to QueryParser but apart from that I can't see any obvious
reason. General advice: use Luke to check what has been indexed and
read
http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3
If you read the javadocs and source for DefaultSimilarity you'll know
as much about it as I do, and see what the default is. To customize
it, write your own subclass as I said before.
--
Ian.
On Tue, Feb 2, 2010 at 7:56 PM, Phan The Dai wrote:
> Dear Ian Lea,
> Thanks much for your reply.
> P