Yes. Every time a user updates a piece of information, you do the update in
the DB as well as the Index. If you are using Hibernate, they have an API
that does this mapping. I am not sure why you plan to store data in the
Index? Storing data is the DB's job, searching is the Index's job. I would
sug
1) I don't understand why the index would get corrupted. We store huge data
and meta-data using Lucene.
2) For this, I synced Lucene with the DB operations. If you use Hibernate,
there's an API for that. Or, you could just write your own factory methods to
add/delete/edit index documents when a DB o
I have indexed around 100 M of data with 512M allocated to the JVM heap. So that
gives you an idea. If every token is the same word in one file, shouldn't the
tokenizer recognize that?
Try using Luke. It helps solve a lot of issues.
-
AZ
On 9/1/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> I can't
That's exactly what I do. The moment something is added to the database, I
add it to the Lucene index of the user. Upon new account creation, open a
new Lucene index for this new user. Whenever something is uploaded, just add
it to the index.
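
A minimal sketch of that pattern, assuming nothing about the real schema: the class name PerUserIndexSync and its methods are hypothetical, and plain in-memory maps stand in for both the database and the per-user Lucene indexes. It only shows the shape of the idea, which is that every write goes to both places and every search is scoped to one user's index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration: in-memory stand-ins only. A real setup would
// write to the database and to a per-user Lucene index at the same points.
final class PerUserIndexSync {
    private final Map<String, List<String>> database = new HashMap<>();
    private final Map<String, List<String>> indexes = new HashMap<>();

    // New account: create the user's empty index alongside the DB record.
    void createAccount(String user) {
        database.put(user, new ArrayList<>());
        indexes.put(user, new ArrayList<>());
    }

    // Upload: store the content, then add it to that user's index right away.
    void upload(String user, String content) {
        if (!database.containsKey(user)) {
            throw new IllegalArgumentException("unknown user: " + user);
        }
        database.get(user).add(content);
        indexes.get(user).add(content.toLowerCase(Locale.ROOT));
    }

    // Search only the given user's index. The naive substring match is a
    // stand-in for a real index lookup; it returns the lowercased entries.
    List<String> search(String user, String word) {
        List<String> hits = new ArrayList<>();
        String needle = word.toLowerCase(Locale.ROOT);
        for (String entry : indexes.getOrDefault(user, new ArrayList<>())) {
            if (entry.contains(needle)) {
                hits.add(entry);
            }
        }
        return hits;
    }
}
```

With this layout a document is searchable as soon as upload returns, and one user's search never touches another user's data.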
-
Askar
On 8/22/07, Ard Schrijvers <[EMAIL PROTECTED]>
Hey Guys,
I am trying to do something similar: make the content searchable as soon as
it is added to the website. The way it can work in my scenario is that I
create an Index for every new user account created.
Then, whenever a new document is uploaded, its contents are added to the
user's I
Hey Guys,
Quick question:
I do this in my code for searching:
queryParser.setDefaultOperator(QueryParser.Operator.AND);
Lucene is OR by default so I change it to AND for my requirements. Now, I
have a requirement to do OR as well. I mean that while doing AND, I'd like to
include results from OR too.
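
One possible approach, sketched here with plain string building only: OR together a boosted all-terms clause and a plain any-term clause, so that documents matching every term should rank first and partial matches follow. The class name AndOrQuery and the boost value of 4 are assumptions for illustration, the terms are assumed to be plain words that need no escaping, and the resulting string would still have to be handed to QueryParser (trying it in Luke first is a good idea).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a query string in Lucene's classic query
// syntax. It does not call Lucene itself.
final class AndOrQuery {
    // Produces "(a AND b)^boost OR (a OR b)" for the given terms.
    static String build(List<String> terms, int boost) {
        String allTerms = String.join(" AND ", terms);
        String anyTerm = String.join(" OR ", terms);
        return "(" + allTerms + ")^" + boost + " OR (" + anyTerm + ")";
    }

    public static void main(String[] args) {
        // prints: (harvard AND business AND review)^4 OR (harvard OR business OR review)
        System.out.println(build(Arrays.asList("harvard", "business", "review"), 4));
    }
}
```

Because the operators are spelled out explicitly, the string does not rely on whichever default operator the parser has been given.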
Guys,
Here's someone who did this hack:
http://blog.mindbridge.com/?p=55
Cheers,
AZ
On 7/31/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> I'll have to use StringBuffer and get the Explanation in it as a String.
> Then parse StringBuffer to get the scores of each field, then
field? I came across
boosting of a term in the query, so that would mean
"apache^4 jakarta"
This means I am keener to find apache than jakarta. I want to boost
the score of a field; how can that be done?
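
A hedged sketch of one way to weight fields at query time: search the same term in several fields and attach a different boost to each field clause, using the same ^ syntax as above. The class name FieldBoostQuery is hypothetical, the field names tags and contents are only examples, and the code just builds the query string; how much a given boost actually moves the scores is something to check in Luke.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds "field:term^boost" clauses joined with OR.
final class FieldBoostQuery {
    static String build(String term, Map<String, Integer> fieldBoosts) {
        StringJoiner clauses = new StringJoiner(" OR ");
        for (Map.Entry<String, Integer> entry : fieldBoosts.entrySet()) {
            String clause = entry.getKey() + ":" + term;
            // A boost of 1 is the neutral value, so the suffix is omitted.
            if (entry.getValue() != 1) {
                clause += "^" + entry.getValue();
            }
            clauses.add(clause);
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> boosts = new LinkedHashMap<>();
        boosts.put("tags", 4);      // example field that should count more
        boosts.put("contents", 1);  // example field left at the neutral weight
        // prints: tags:apache^4 OR contents:apache
        System.out.println(build("apache", boosts));
    }
}
```

Raising the boosts on the other fields has the effect of lowering the relative weight of the unboosted one.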
On 7/31/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> Using
ied with the current scoring .
>
> All I can say is try it and find out. You might consider using Luke
> to try various boosts without having to mess with too much code.
>
> Erick
>
> On 7/31/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
> >
> > Boosting
other 3 fields? Will that help?
Would there be a way to bring down the score of the contents field?
thanks,
AZ
On 7/31/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> Wouldn't boosting handle this for you?
>
> On 7/31/07, Askar Zaidi <[EMAIL PROTECTED]> wrote
To be more specific:
I want to retrieve the scores of individual fields inside a document so that
I can manipulate the score of one field. This is the requirement of my
application. After the manipulation I can add these scores and then show the
total.
thanks,
AZ
On 7/31/07, Askar Zaidi
Hi,
Does anyone know how to retrieve the score of an individual field instead of
doing:
hits.score(i);
This will get me the entire score of the document. I'd like
to get the score of a single field by specifying the field name.
thanks,
AZ
On 7/31/07, Askar Zaidi <[EMAIL PROTECTED
Hey guys,
I was wondering if there is a way to retrieve the score of a field in a
document?
If my document looks like this:
{itemID},{field 1},{field 2}
I'd like to get the score of individual fields 1 and 2 rather than the score of
the entire document.
Is it possible?
thanks,
AZ
I did this yesterday. Manually appended an extra field to the query. It
works fine.
On 7/26/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
>
> On Jul 25, 2007, at 5:05 PM, Joe Attardi wrote:
> > As far as I can tell, I basically have two options:
> > (1) Manually prepend the field identifier to the
ll be active on this list from now on and try and answer
questions to which I was seeking answers.
later,
Askar
On 7/25/07, Doron Cohen <[EMAIL PROTECTED]> wrote:
>
> "Askar Zaidi" wrote:
>
> > ... Here's what I am trying to accomplish:
> >
> > 1. Iterate ov
k the one document I need from the Index and give me the
score. I don't have to iterate over Hits.
Any clues? I can't find any examples of query building.
thanks!
Askar
On 7/25/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> Yes, you can do that.
>
>
> On Jul 25
intln(query);
I get:
+contents:Harvard +contents:Business +contents:Review
Can I just add an itemID clause, like this?
+contents:Harvard +contents:Business +contents:Review +itemID:id
That query would just return one document.
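
A small sketch of that idea as string building, under a few assumptions: the class name SingleItemQuery is hypothetical, the field is called itemID as in the message, and itemID is assumed to be indexed as a single untokenized value so that an exact match is possible. Lucene's query syntax uses field:value, so the extra required clause takes the form +itemID:42.

```java
// Hypothetical helper: wraps an existing query string and adds a required
// clause on the ID field, so that at most one document can match.
final class SingleItemQuery {
    static String restrictTo(String query, String idField, String id) {
        // Grouping keeps the original clauses together as one required unit.
        return "+(" + query + ") +" + idField + ":" + id;
    }

    public static void main(String[] args) {
        String parsed = "+contents:Harvard +contents:Business +contents:Review";
        // prints: +(+contents:Harvard +contents:Business +contents:Review) +itemID:42
        System.out.println(restrictTo(parsed, "itemID", "42"));
    }
}
```

The same restriction could probably also be built programmatically by combining the parsed query and a TermQuery on the ID field inside a BooleanQuery, which avoids re-parsing a string.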
On 7/25/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> Instead of refact
ine the score for
> you
> }
>
> MemoryIndex info can be found at http://lucene.zones.apache.org:8080/
> hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/index/memory/
> package-summary.html
>
> -Grant
>
> On Jul 25, 2007, at 11:45 AM, Askar Zaidi wrote:
>
> >
> database? Or is it that you want to score the item based on some
> terms as well. If that is the case, there are other ways of doing
> this and we can discuss them.
>
> -Grant
>
> On Jul 25, 2007, at 10:10 AM, Askar Zaidi wrote:
>
> > Hey Guys,
> >
> > I n
a lot,
Askar
On 7/25/07, Dmitry <[EMAIL PROTECTED]> wrote:
>
> Askar,
> why do you need to add +id:?
> thanks,
> dt,
> www.ejinz.com
> search engine news forms
> ----- Original Message -
> From: "Askar Zaidi" <[EMAIL PROTECTED]>
> To: ; <
re about".
>
> What would break if you:
> 1. Included "creator" in the Lucene index (or, filtered out the Hits
> using a BitSet or something like it)
> 2. Executed 1 search
> 3. Collected the results of the first N Hits (where N is some
> reasonable limit, like
t hitCount = hits.length();
for(int i=0;i wrote:
>
> Could you show us the relevant source from doBodySearch()?
>
> -h
>
> On Tue, 2007-07-24 at 19:58 -0400, Askar Zaidi wrote:
> > I ran some tests and it seems that the slowness is from Lucene calls
> when I
> >
Shall I use setMergeFactor(2)?
Slow indexing is not a bother.
On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> I ran some tests and it seems that the slowness is from Lucene calls when
> I do "doBodySearch", if I remove that call, Lucene gives me results in 5
>
where the slowness is. Please try to isolate the Lucene calls from
> the DB calls and look at the timings for both.
>
> On Jul 24, 2007, at 5:28 PM, Askar Zaidi wrote:
>
> > Thanks for the reply.
> >
> > I am timing the entire search process with a stop watch, a bit
>
Can someone please tell me how to cache results in Lucene? I know the
classes, but I don't know how to go about it.
thanks,
Askar
On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> Thanks for the reply.
>
> I am timing the entire search process with a stop watch, a
kly, I'm surprised your slowdown is only linear.
>
> On Jul 24, 2007, at 4:31 PM, Askar Zaidi wrote:
>
> > I have 512MB RAM allocated to JVM Heap. If I double my system RAM
> > from 768MB
> > to say 2GB or so, and give JVM 1.5GB Heap space, will I get quicker
> >
class
machine ?
I have also done some of the optimizations that are mentioned on the Lucene
website.
thanks,
AZ
On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> Hey Guys,
>
> I just finished up using Lucene in my application. I have data in a
> database , so while indexin
Hey Guys,
From what I understand, FieldCache is used to store only the field required
for search. I am using a Document object and then using doc.get("item"). One
of my fields is HUGE, so using Document will slow things down.
How can I use FieldCache? Is there an example?
thanks,
AZ
Hey Guys,
I just finished up using Lucene in my application. I have data in a database,
so while indexing I extract this data from the database and pump it into
the index. Specifically, I have the following data in the index:
where itemID is just a number (primary key in the DB)
tags : te
}
This shows me the item value. Now I want to see the score related to this
item. How do I get that?
thanks,
AZ
On 7/19/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> QueryParser.setDefaultOperator
>
> On 7/19/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
> >
> >
> > > Are you sure that the hit wasn't on "w" or "kim"? The
> > > default for searching is OR...
> > >
> > > I recommend that you get a copy of Luke (google lucene luke)
> > > which allows you to examine your index as well
> I started using Lucene yesterday, so I am fairly new!
> >
> > thanks
> > AZ
> >
> > On 7/18/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
> > >
> > > Are you sure that the hit wasn't on "w" or "kim"? The
how
queries parse using various analyzers. It's an invaluable tool...
Best
Erick
On 7/18/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> Hey folks,
>
> I am a new Lucene user. I used the following after indexing:
>
> search(searcher, "W. Chan Kim");
>
>
Hey folks,
I am a new Lucene user. I used the following after indexing:
search(searcher, "W. Chan Kim");
Lucene showed me hits for documents containing the word "channel". Notice that
"Chan" is part of "Channel". How do I stop this?
I am keen to find the exact word.
I used the following, b