Askar,
why do you need to add +id:?
thanks,
dt,
www.ejinz.com
Search Engine News Forums
- Original Message -
From: "Askar Zaidi" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, July 25, 2007 12:39 AM
Subject: Re: Fine Tuning Lucene implementation
Hey Hira,
Thanks so much for the reply. Much appreciate it.
Please see the FAQ entry "How do I get code written for Lucene 1.4.x to work with Lucene 2.x?":
http://wiki.apache.org/lucene-java/LuceneFAQ#head-86d479476c63a2579e867b75d4faa9664ef6cf4d
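For the common case, Field.Text(String, String) maps onto the plain Field constructor in 2.x. A minimal sketch, assuming you want the same stored, tokenized, indexed behavior the old factory method gave you:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Field.Text("contents", text) in 1.4.x becomes:
Document doc = new Document();
doc.add(new Field("contents", text, Field.Store.YES, Field.Index.TOKENIZED));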
Andy
-Original Message-
From: Lindsey Hess [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 25, 2007 12:31 PM
To
Hey Hira,
Thanks so much for the reply. Much appreciate it.
Quote:
Would it be possible to just include a query clause?
- i.e., instead of just contents:, also add
+id:
How can I do that?
I see my query as:
+contents:harvard +contents:business +contents:review
where the search phrase w
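For illustration, one way to add a required id clause on top of the parsed contents query (a sketch only; the "id" field name and the value "42" are made up):

import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.*;

BooleanQuery bq = new BooleanQuery();
bq.add(new QueryParser("contents", analyzer).parse("harvard business review"),
       BooleanClause.Occur.MUST);
bq.add(new TermQuery(new Term("id", "42")), BooleanClause.Occur.MUST);
Hits hits = searcher.search(bq); // +contents:... +id:42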
I'm trying to get some relatively old Lucene code to compile (please see
below), and it appears that Field.Text has been deprecated. Can someone please
suggest what I should use in its place?
Thank you.
Lindsey
public static void main(String[] args) throws Exception {
I'm no expert on this (so please accept the comments in that context), but two things seem weird to me:
1. Iterating over each hit is an expensive proposition. I've often seen people recommend a HitCollector.
2. It seems that doBodySearch() is essentially saying, do this search and return the
Inline below
On Jul 24, 2007, at 8:14 PM, Askar Zaidi wrote:
Sure.
public float doBodySearch(Searcher searcher, String query, int id) {
    try {
        score = search(searcher, query, id);
    }
    catch (IOException io) {}
Are you sure you are using the same Searcher for every search? Don't
open a new one unless you have modified the index. You are iterating
over every hit with the Hits class. You don't ever want to do this. Use
a HitCollector if you want to iterate over more than a hundred or so
hits. You will f
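For what it's worth, a minimal sketch of the HitCollector route (Lucene 2.x; the top-score bookkeeping is only illustrative):

import org.apache.lucene.search.HitCollector;

final float[] topScore = new float[] { 0f };
searcher.search(query, new HitCollector() {
    public void collect(int doc, float score) {
        // called once per matching doc; no stored fields are loaded here,
        // which is what makes this much cheaper than looping over Hits
        if (score > topScore[0]) topScore[0] = score;
    }
});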
Sure.
public float doBodySearch(Searcher searcher, String query, int id) {
    try {
        score = search(searcher, query, id);
    }
    catch (IOException io) {}
    catch (ParseException pe) {}
Could you show us the relevant source from doBodySearch()?
-h
On Tue, 2007-07-24 at 19:58 -0400, Askar Zaidi wrote:
> I ran some tests and it seems that the slowness is from the Lucene calls when I
> do "doBodySearch": if I remove that call, Lucene gives me results in 5
> seconds; otherwise it takes
Shall I setMergeFactor = 2?
Slow indexing is not a bother.
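For reference, that knob lives on the IndexWriter (a sketch; the path and analyzer are made up, and 2 is the minimum value):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
writer.setMergeFactor(2); // fewest segments: slower indexing, but searches touch fewer files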
On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> I ran some tests and it seems that the slowness is from the Lucene calls when
> I do "doBodySearch": if I remove that call, Lucene gives me results in 5
> seconds; otherwise it takes ab
I ran some tests and it seems that the slowness is from the Lucene calls when I
do "doBodySearch": if I remove that call, Lucene gives me results in 5
seconds; otherwise it takes about 50 seconds.
But I need to do the body search, and that field contains lots of text. The field
is . How can I optimize that
Sorry, I mistyped. I don't mean the get methods, I mean the
doTagSearch, doTitleSearch, etc.
As for the stop watch, not really sure what to make of that... Try
System.currentTimeMillis()...
You can get just the fields you want when loading a Document by using
the FieldSelector API on
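A sketch of the FieldSelector route (assuming Lucene 2.1+, and that only "itemID" and "title" are actually needed):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.FieldSelector;
import org.apache.lucene.document.MapFieldSelector;

FieldSelector sel = new MapFieldSelector(new String[] { "itemID", "title" });
Document doc = reader.document(docNum, sel); // the huge body field is never read from disk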
Hi,
I'm building an application that needs to translate one query format into
another. For example, my application generates the following query from a UI:
((title="Gone With The Wind") (title="Brave New World"))
and internally I need to convert it into this format so that I can make a web
s
Can someone please tell me how to cache results in Lucene? I know the
classes, but I don't know how to go about it.
thanks,
Askar
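If the classes in question are the filter classes, a sketch of the usual pattern (in 2.x a QueryFilter can be built once and reused; as far as I recall it caches its bit set per IndexReader, so repeated searches don't recompute it):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.*;

Filter tagFilter = new QueryFilter(new TermQuery(new Term("tags", "java"))); // field/term made up
Hits hits = searcher.search(query, tagFilter); // reuse tagFilter across searches on the same reader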
On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> Thanks for the reply.
>
> I am timing the entire search process with a stop watch, a bit ghetto
> style. My get
Thanks for the reply.
I am timing the entire search process with a stop watch, a bit ghetto style.
My getXXX methods are:
Document doc = hits.doc(i);
String str = doc.get("item");
So you can see that I am retrieving the entire document in a search query.
Ideally, I'd like to just retrieve the F
Where are you getting your numbers from? That is, where are your
timers? Are you timing the rs.next() loop, or the individual calls
to Lucene? What do the getX methods look like? How big are your
queries? How big is your index?
Essentially, we need more info to really help you. Fr
I have 512MB RAM allocated to the JVM heap. If I double my system RAM from 768MB
to say 2GB or so, and give the JVM 1.5GB heap space, will I get quicker results?
Can I expect results which take 1 minute to be returned in 30 seconds with
more RAM? Should I also get a more powerful CPU? A real server cla
Hi, guys,
I found Analyzers for Japanese, Korean and Chinese, but not stemmers;
the Snowball stemmers only include European languages. Does stemming
not make sense for ideograph-based languages (i.e., no stemming is
needed for Japanese, Korean and Chinese)?
Also for spell checking, does the defau
Hey Guys,
From what I understand, FieldCache is used to store only the field required
for search. I am using a Document object and then calling doc.get("item"). One
of my fields is HUGE, so loading the whole Document will slow things down.
How can I use FieldCache? An example?
thanks,
AZ
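A minimal sketch, assuming "item" is indexed as a single un-tokenized term per document (FieldCache reads indexed terms, not stored fields, so the HUGE stored field is never touched):

import org.apache.lucene.search.FieldCache;

String[] items = FieldCache.DEFAULT.getStrings(reader, "item");
// items is indexed by Lucene doc number; the first call builds the
// cache for this reader, subsequent calls return the cached array
String item = items[docNum];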
Hey Guys,
I just finished up using Lucene in my application. I have data in a database,
so while indexing I extract this data from the database and pump it into
the index. Specifically, I have the following data in the index:
where itemID is just a number (primary key in the DB)
tags : te
Got it,
I don't have a clue if this corruption was caused by hardware failure,
but that is possible because we suffer from a lot of power failures from
time to time. But the thing is that I've been using Lucene for a long time
and I never got this kind of exception.
The thing is that I'd l
On 7/24/07, Rafael Rossini <[EMAIL PROTECTED]> wrote:
I did a little debug and found that in the TermScorer, the byte[] norms has
size = 1.119.933, which is the number of docs in my index, and there is a
docID = 1226511, that is, if the "doc" variable in the method is the docID.
I tried to access t
I did a little debug and found that in the TermScorer, the byte[] norms has
size = 1.119.933, which is the number of docs in my index, and there is a
docID = 1226511, that is, if the "doc" variable in the method is the docID.
I tried to access this document with reader.document() and got a *
java.io
I figured out the problem. The issue had nothing to do with Lucene 2.2. I
had accidentally reset the default mergeFactor to 1000. This was the reason
it was not merging the segments. With the default mergeFactor, the indexing
is working perfectly fine.
Thanks,
Harini
On 7/24/07, Michael McCandle
daniel rosher wrote:
Perhaps you can use a filter in the following way.
-Create a filter (via QueryFilter) that would contain all documents that
do not have null values for the field
Interesting: what does the QueryFilter look like? Isn't it just as hard
as finding out which docs have the null
I don't know the exact date of the build, but it is certainly before July 4,
and before the LUCENE-843 patch was committed. My index has 1.119.934 docs
in it and is about 8.2G.
I really don't know how to reproduce this; the only query that gives me this
error, so far, is "brasil"... and I don't know
Nobody can answer that question; you have to test in your particular
situation. Filters are very efficient to use once created, can be created
once and used often, etc.
Adding a special value to stand for an empty field is conceptually
simple, and queries are straightforward (see the sketch below).
Unless you can dem
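A sketch of that special-value idea (the "_null_" token and the "color" field are made up):

import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;

// index time: write a sentinel term when the value is missing
String v = (color == null) ? "_null_" : color;
doc.add(new Field("color", v, Field.Store.NO, Field.Index.UN_TOKENIZED));

// query time: this matches exactly the "empty" documents
TermQuery empties = new TermQuery(new Term("color", "_null_"));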
That looks spooky. It looks like either the norms array is not
large enough or that docID is too large. Do you know how many
docs you have in your index?
Is this easy to reproduce, maybe on a smaller index?
There was a very large change recently (LUCENE-843) to speed
up indexing and it's possi
You'll also find lots of discussion about indexing multiple
languages if you search the mail archive for phrases like "multiple
language".
I think one thing you're missing is that Lucene indexes data however
you tell it to. You have both total control over and total responsibility
for how things are
Hello all,
I'm using Solr in an app, but I'm getting an error that might be a Lucene
problem. When I perform a simple query like q=brasil I'm getting this
exception:
java.lang.ArrayIndexOutOfBoundsException: 1226511
at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
at org
On 7/24/07, daniel rosher <[EMAIL PROTECTED]> wrote:
Perhaps you can use a filter in the following way.
-Create a filter (via QueryFilter) that would contain all documents that
do not have null values for the field
-flip the bits of the filter so that it now contains documents that have
null valu
Would it be more efficient to create an additional inverted field where I
assign a value to that field only when the field I would like to search is
NULL?
daniel rosher wrote:
>
> Perhaps you can use a filter in the following way.
>
> -Create a filter (via QueryFilter) that would contain all d
On Jul 24, 2007, at 3:21 AM, Elie Choueiri wrote:
Hi
I'm new to searching and am trying to use Lucene to search English
& Arabic
documents. I've got a bunch of questions (hopefully you'll find some
interesting!) and am hoping someone's gone through some of them and
has some
answers fo
Perhaps you can use a filter in the following way.
-Create a filter (via QueryFilter) that would contain all documents that
do not have null values for the field
-flip the bits of the filter so that it now contains documents that have
null values for a field
-Use the filter in conjunction with subs
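Those steps as a Filter subclass might look roughly like this (a sketch against the Lucene 2.x BitSet-based Filter API; all names are illustrative):

import java.io.IOException;
import java.util.BitSet;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryFilter;

public class NullValueFilter extends Filter {
    private final Query hasValueQuery; // matches every doc that HAS a value for the field

    public NullValueFilter(Query hasValueQuery) {
        this.hasValueQuery = hasValueQuery;
    }

    public BitSet bits(IndexReader reader) throws IOException {
        BitSet bits = new QueryFilter(hasValueQuery).bits(reader);
        bits.flip(0, reader.maxDoc()); // invert: now only the null-valued docs remain set
        return bits;
    }
}

You would then pass it in as searcher.search(mainQuery, new NullValueFilter(hasValueQuery)).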
Hi
I'm new to searching and am trying to use Lucene to search English & Arabic
documents. I've got a bunch of questions (hopefully you'll find some
interesting!) and am hoping someone's gone through some of them and has some
answers for me!
First, do I have to worry about the Arabic Analyz