Hi Otis,
Thanks for the feedback.
2008/6/11 Otis Gospodnetic <[EMAIL PROTECTED]>:
> Hi Glen,
>
> Aha, good to see the benefit of multiple IndexReaders/Searchers so clearly.
> Makes me think we'll want to add a config setting for this in Solr... :)
Until then, you might want to use: Runtime.ava
Hi Glen,
Aha, good to see the benefit of multiple IndexReaders/Searchers so clearly.
Makes me think we'll want to add a config setting for this in Solr... :)
As for why 4 is the best choice, I think it's because of those 4 cores that
you've got. My guess is that you'll see slightly better per
I have extended my evaluation (previous evaluation:
http://zzzoot.blogspot.com/2008/06/simultaneous-threaded-query-lucene.html)
to include, as well as an increasing number of threads performing concurrent
queries, 1, 2, 4 and 8 IndexReaders.
The results can be found here:
http://zzzoot.blogspot.com/2008/0
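A minimal sketch of how several query threads could share a small, fixed set of readers, as in the evaluation above. It is not taken from the thread: the class name RoundRobinPool and the use of a generic member type in place of a real IndexReader or IndexSearcher are assumptions made so the example runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: hands out one of N pre-built members in round-robin order.
// In a search application each member would wrap one open reader/searcher.
final class RoundRobinPool<T> {
    private final List<T> members;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPool(int size, Supplier<T> factory) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<T> created = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            created.add(factory.get());
        }
        this.members = Collections.unmodifiableList(created);
    }

    // Thread-safe: the counter is atomic and floorMod keeps the index
    // non-negative even after the counter overflows.
    T acquire() {
        int index = Math.floorMod(next.getAndIncrement(), members.size());
        return members.get(index);
    }

    int size() {
        return members.size();
    }

    public static void main(String[] args) {
        AtomicInteger ids = new AtomicInteger();
        // 4 members, matching the best-performing configuration reported above.
        RoundRobinPool<String> pool =
                new RoundRobinPool<>(4, () -> "reader-" + ids.getAndIncrement());
        for (int request = 0; request < 8; request++) {
            System.out.println("request " + request + " uses " + pool.acquire());
        }
    }
}
```

The pool size is a plain constructor argument here, so trying 1, 2, 4 or 8 members only means changing one number.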
Dear all,
To improve the search, I will have to do keyword expansion. I am looking for
a library that would help me get a list of synonyms of a term with
similarity scores. Is there any library package that can handle this? It would be
great if it is in Python. I have searched the web and foun
yes, figured it out. thanks.
how about checking for uniqueness?
Best.
On Wed, Jun 11, 2008 at 5:39 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
>
> 11 jun 2008 kl. 16.04 skrev Cam Bazz:
>
>>
>> When you look at the fields of a document with Luke, there is a norm
>> column.
>> I have not been able
If you have many terms across the fields, you might want to invoke
IndexReader's setTermInfosIndexDivisor() method, which reduces the
in-memory term infos used to look up idf, at the cost of a (slightly)
slower search.
> From: [EMAIL PROTECTED]
> To: java-user@lucene.apache.org
> Subject: Re: Is it po
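The memory-versus-speed trade-off behind the index divisor mentioned above can be illustrated with a toy sparse index. This is only an illustration, not Lucene's implementation: the class SparseTermIndexSketch, its fields and the interval values are all hypothetical. Only every (interval x divisor)-th term is kept in memory, and a lookup scans forward from the nearest kept term.

```java
import java.util.Arrays;

// Toy model: sortedTerms stands in for the full term dictionary,
// indexTerms is the sparse subset that would be held in memory.
final class SparseTermIndexSketch {
    private final String[] sortedTerms;
    private final String[] indexTerms;
    private final int step;

    SparseTermIndexSketch(String[] sortedTerms, int indexInterval, int divisor) {
        if (indexInterval <= 0 || divisor <= 0) {
            throw new IllegalArgumentException("interval and divisor must be positive");
        }
        this.sortedTerms = sortedTerms.clone();
        this.step = indexInterval * divisor;
        int n = (sortedTerms.length + step - 1) / step;
        this.indexTerms = new String[n];
        for (int i = 0; i < n; i++) {
            indexTerms[i] = sortedTerms[i * step];
        }
    }

    int inMemoryEntries() {
        return indexTerms.length;
    }

    // Returns how many dictionary entries were scanned to find the term,
    // or -1 if the term is absent.
    int scanCost(String term) {
        int pos = Arrays.binarySearch(indexTerms, term);
        // When not found, binarySearch returns -(insertionPoint) - 1;
        // the last index term <= term sits at insertionPoint - 1.
        int block = pos >= 0 ? pos : -pos - 2;
        if (block < 0) {
            return -1; // term sorts before the first dictionary entry
        }
        int start = block * step;
        int end = Math.min(start + step, sortedTerms.length);
        for (int i = start; i < end; i++) {
            int cmp = sortedTerms[i].compareTo(term);
            if (cmp == 0) {
                return i - start + 1;
            }
            if (cmp > 0) {
                break;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] terms = new String[100];
        for (int i = 0; i < terms.length; i++) {
            terms[i] = String.format("t%03d", i);
        }
        for (int divisor : new int[] {1, 4}) {
            SparseTermIndexSketch index = new SparseTermIndexSketch(terms, 4, divisor);
            System.out.println("divisor " + divisor
                    + ": entries in memory = " + index.inMemoryEntries()
                    + ", entries scanned for t015 = " + index.scanCost("t015"));
        }
    }
}
```

A larger divisor shrinks the in-memory array and lengthens the scan per lookup, which is the "(slightly) slower search" side of the trade.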
11 jun 2008 kl. 16.04 skrev Cam Bazz:
When you look at the fields of a document with Luke, there is a norm
column.
I have not been able to figure out what that is.
The norm is the 8-bit discretization of length normalization and field
boost combined.
See IndexReader#norms, Similarity#leng
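To make "8-bit discretization" concrete, here is a small sketch that squeezes a float norm into one byte and back. It is modeled on my understanding of Lucene's default behaviour (length normalization as 1/sqrt(number of terms), multiplied by the field boost, then stored as a truncated small float); the class name NormEncodingSketch and its method names are hypothetical, and the bit constants should be treated as an assumption rather than the library's exact code.

```java
// Hypothetical stand-alone sketch; it does not call Lucene.
final class NormEncodingSketch {
    private static final int ZERO_POINT = (63 - 15) << 3;

    // Assumed default: shorter fields get a larger length norm.
    static float norm(int numTerms, float boost) {
        return boost * (float) (1.0 / Math.sqrt(numTerms));
    }

    // Keep only the top bits of the float, so precision is lost by truncation.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        if (small <= ZERO_POINT) {
            // Zero and negative values map to 0, tiny positive values to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO_POINT + 0x100) {
            return (byte) -1; // overflow: largest representable value
        }
        return (byte) (small - ZERO_POINT);
    }

    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        for (int numTerms : new int[] {1, 4, 10, 100}) {
            float exact = norm(numTerms, 1.0f);
            float stored = decode(encode(exact));
            System.out.println(numTerms + " terms: exact=" + exact + " stored=" + stored);
        }
    }
}
```

Because only a few significant bits survive, fields of similar length end up with the same stored norm, which is why the norm column in Luke shows a small set of repeating values.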
Thanks Erick. That is what I was assuming but couldn't confirm if it was
worth going down those paths to achieve what I was hoping. Your essay was
very informative about realistic expectations with the fieldselector.
I actually just got through reading the discussion on deprecating hits which
ess
<<>>
I infer from this that you're using a Hits object to get your IDs to insert
in
your temporary table. Here's the problem with Hits... It re-executes
the query every 100 (200?) hits. So you can think of it as
while (more hits) {
if ((count % 100) == 0) execute the search and throw away the
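The cost of that pattern can be estimated with a small simulation. It assumes exactly the behaviour described above, a fresh search every 100 hits that re-collects everything seen so far; the real Hits class may grow its window differently, and the class and method names below are hypothetical.

```java
// Hypothetical cost model; it does not call Lucene.
final class HitsWindowSketch {
    // Returns {number of searches executed, total documents collected}
    // when iterating over totalHits results with a re-query every `window` hits.
    static long[] simulate(int totalHits, int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        long searches = 0;
        long collected = 0;
        for (int count = 0; count < totalHits; count++) {
            if (count % window == 0) {
                // Re-run the search and collect everything up to the new window end,
                // including hits that were already seen and are thrown away.
                searches++;
                collected += Math.min((long) count + window, totalHits);
            }
        }
        return new long[] {searches, collected};
    }

    public static void main(String[] args) {
        for (int totalHits : new int[] {100, 1000, 10000}) {
            long[] cost = simulate(totalHits, 100);
            System.out.println(totalHits + " hits: searches=" + cost[0]
                    + ", documents collected=" + cost[1]
                    + " (a single pass would collect " + totalHits + ")");
        }
    }
}
```

Under this model the collected-document count grows roughly with the square of the number of hits, which is the argument for a single-pass collector when every ID is needed.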
Hello,
When you look at the fields of a document with Luke, there is a norm column.
I have not been able to figure out what that is.
The reason I am asking is that I am trying to build a uniqueness model. My
Index is structured as follows:
classID, textID, K, V
classID is a given class. textID
11 jun 2008 kl. 09.14 skrev Paul Elschot:
Op Wednesday 11 June 2008 01:41:38 schreef Karl Wettin:
Each of my filters represents a single boosting term query. But when
using the filter instead of the boosting term query I lose the score
(not sure this is true) and payload boost (if any), both ess
karl wettin-3 wrote:
>
>
> I might be missing something here -- can't you just add the age field
> to the index and include that in your query?
>
>
Thanks for the response Karl:
I just used the age field as an example, but in reality the structured data
is copious and complex relationshi
11 jun 2008 kl. 09.38 skrev Johannes Christen:
That might be a solution in this case, but I have the same kind of
problem in another case.
We index documents from an NTFS source. One field is the URI of the
document.
After a query has been processed, we perform an access check on the
hits
Yep, using a FieldSelector you can restrict the fields that will be loaded, you
can also specify how fields should be loaded (normal, lazy or load the field,
and then stop loading the document, i.e. skip other fields).
-Original Message-
From: Marcelo Schneider [mailto:[EMAIL PROTECTED]
Daan de Wit escreveu:
But I doubt this will solve your memory issue because nonstored fields are not
read when retrieving the document.
Thanks for the fast reply Daan! Just for clarification, if I had all the
code fields (filters) stored, would it make any difference?
-Original Mes
For the record, Hits.id(int i) returns the document number. Note,
though, that Hits is now deprecated, as pointed out by the link to
1290, so going the TopDocs route is probably better anyway.
-Grant
On Jun 11, 2008, at 7:43 AM, Daan de Wit wrote:
This is possible, you need to provide a F
But I doubt this will solve your memory issue because nonstored fields are not
read when retrieving the document.
-Original Message-
From: Daan de Wit [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 11, 2008 13:44
To: java-user@lucene.apache.org
Subject: RE: Is it possible to get only on
This is possible: you need to provide a FieldSelector to
IndexReader#document(docId, selector). This won't work with Hits though,
because Hits does not expose the document number, so you need to roll your own
solution using TopDocs or HitCollector. For information, see the discussion in
this is
Hi Shalin,
I am not familiar with Solr. I just know that it is a search server. Can you
please point me to some resources on how I can use Solr to solve the
situation?
Kalani
On Tue, Jun 10, 2008 at 5:03 PM, Shalin Shekhar Mangar <
[EMAIL PROTECTED]> wrote:
> Hi Kalani,
>
> Are you aware of Ap
I have an environment where we have indexed a DB with about 6 million entries
with Lucene, and each row has 25 columns. 20 cols have integer codes
used as filters (indexed/unstored), and the other 5 have (very) large
texts (also indexed/unstored). Currently the search I'm doing is like this:
Hits hi
On Jun 11, 2008, at 6:00 AM, Michael McCandless wrote:
Grant Ingersoll wrote:
Is more than one thread adding documents to the index?
I don't believe so, but I am trying to reproduce. I've only seen
it once, and don't have a lot of details, other than I noticed it
was on a specific fil
Thanks for your reply!
> Date: Wed, 11 Jun 2008 09:19:46 +0200
> From: [EMAIL PROTECTED]
> Subject: RE: The performance of lucene searching(web entironment) test
> To: java-user@lucene.apache.org
>
> On Wed, 2008-06-11 at 00:17 +0800, lutan wrote:
> > In my test case , I start loadrunner jsut test fo
Grant Ingersoll wrote:
Is more than one thread adding documents to the index?
I don't believe so, but I am trying to reproduce. I've only seen
it once, and don't have a lot of details, other than I noticed it
was on a specific file (.fdt) and was wondering if that was a
factor or not.
That might be a solution in this case, but I have the same kind of problem in
another case.
We index documents from an NTFS source. One field is the URI of the document.
After a query has been processed, we perform an access check on the hits to
ensure the user has access rights to open the docu
On Wed, 2008-06-11 at 00:17 +0800, lutan wrote:
> In my test case, I start LoadRunner and test for just 5 minutes, and the response
> grows slowly. The TPS (transactions per second) finally seems to stop at 10.
That's without reusing the searcher, right? In that case the increased
rate must be attribute
Op Wednesday 11 June 2008 01:41:38 schreef Karl Wettin:
> Each of my filters represents a single boosting term query. But when
> using the filter instead of the boosting term query I lose the score
> (not sure this is true) and payload boost (if any), both essential
> for the quality of my results.
26 matches