I used it to measure speed, but I was planning to use it in a file
search application, where you need wildcard searches like *.txt and so on.
The thing is that the file search application is not my primary job, so I
will tune it later.
This is just an example to give you an idea of how it can work.
reg
--- Tomcat Programmer <[EMAIL PROTECTED]> wrote:
>
> Hi Otis,
>
> Thanks for your answer on the integer issue. I was not
> sure if the index was actually limited, or if it was
> just the numDocs method call. I guess it really does
> not matter which it is; and for me, I don't think my
> index w
That sounds way too long, unless you have veeery slow disks, veeery
large Documents (long fields that you analyze, index, and store in
Lucene), or some such.
If you have very long fields you could try setting
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexWriter.html#
Omar Didi writes (4/20/2005 5:05 PM):
Hi guys,
If a field is indexed as UnStored, how can I get its value?
I tried document.get("UnStored_field"), but it returns null.
You didn't store it, so it's not there. If the field happens to be a
single Term, you might be able to find it in the index, expensiv
Hi,
I have similar issues in indexing time.
I am doing a SELECT from database and getting back
10,000 rows. I then start indexing each row and hence
would have 10,000 documents in my Lucene index. Each
doc has 27 fields.
I added some timing code to my indexing process. The
DB select call takes a
Hi guys,
If a field is indexed as UnStored, how can I get its value?
I tried document.get("UnStored_field"), but it returns null.
thanks
-Original Message-
From: Kevin L. Cobb [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 20, 2005 8:52 AM
To: java-user@lucene.apache.org
Subject: RE: Best way
My policy on this type of exception handling is to only bite off what
you can chew. If you catch an IOException, then you simply report to the
user that an unexpected error has occurred and the search engine is
unobtainable at the moment. Errors should be logged and developers
should look at the sp
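A minimal sketch of the policy described above, in plain Java with no Lucene dependency: catch the IOException at one boundary, log it for developers, and hand the user a generic message. The names SearchFacadeSketch, Searcher, Result and safeSearch are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: every name here is hypothetical, not part of Lucene.
final class SearchFacadeSketch {
    private static final Logger LOG =
            Logger.getLogger(SearchFacadeSketch.class.getName());

    static final String UNAVAILABLE =
            "An unexpected error has occurred; search is unavailable at the moment.";

    // Stand-in for whatever component actually runs the query.
    interface Searcher {
        List<String> search(String query) throws IOException;
    }

    // Either a list of hits, or a message that is safe to show to the user.
    static final class Result {
        final List<String> hits;
        final String userMessage; // null when the search succeeded

        Result(List<String> hits, String userMessage) {
            this.hits = hits;
            this.userMessage = userMessage;
        }
    }

    static Result safeSearch(Searcher searcher, String query) {
        try {
            return new Result(searcher.search(query), null);
        } catch (IOException e) {
            // Full detail goes to the log for developers, not to the user.
            LOG.log(Level.SEVERE, "Search failed for query: " + query, e);
            return new Result(Collections.<String>emptyList(), UNAVAILABLE);
        }
    }
}
```

The caller only has to check whether userMessage is null; diagnosing the underlying cause is left to whoever reads the log.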
I don't agree with this if the query is expected to contain the same
"text-encoding" as the content being analyzed.
So one example would be matching
f is continuous since it is the product of g and x |-> x^2
(in "email notation", we work with a semantic encoding)
This combination of text and
Daniel Naber wrote:
On Wednesday 20 April 2005 18:22, Paul Elschot wrote:
Has anyone tried an index based on n-grams?
Nutch has bigrams for phrases with frequently occurring words.
Also the spell checker in SVN uses n-grams I think.
Yes, but Nutch uses word n-grams, whereas the spell checker use
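The reply is cut off here. Assuming the contrast intended is between word n-grams and character n-grams (n-grams over the letters of a single word), the difference can be sketched in plain Java as follows. NGramSketch and its methods are hypothetical names, not code from Nutch or from the spell checker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: not taken from Nutch or the Lucene spell checker.
final class NGramSketch {

    // Character n-grams: every run of n consecutive characters of one word.
    static List<String> charNGrams(String word, int n) {
        if (n < 1) throw new IllegalArgumentException("n must be >= 1");
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Word n-grams: every run of n consecutive whitespace-separated words.
    static List<String> wordNGrams(String text, int n) {
        if (n < 1) throw new IllegalArgumentException("n must be >= 1");
        List<String> grams = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return grams;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i + n <= words.length; i++) {
            grams.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return grams;
    }
}
```

For example, charNGrams("lucene", 3) yields luc, uce, cen, ene, while wordNGrams("to be or not", 2) yields "to be", "be or", "or not".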
Daniel Naber wrote:
On Wednesday 20 April 2005 18:22, Paul Elschot wrote:
Has anyone tried an index based on n-grams?
Nutch has bigrams for phrases with frequently occurring words.
Also the spell checker in SVN uses n-grams I think.
SVN here:
http://svn.apache.org/repos/asf/lucene/java/trunk/co
On Wednesday 20 April 2005 18:22, Paul Elschot wrote:
> > Has anyone tried an index based on n-grams?
>
> Nutch has bigrams for phrases with frequently occurring words.
Also the spell checker in SVN uses n-grams I think.
Regards
Daniel
--
http://www.danielnaber.de
--
You are right..
From: Volodymyr Bychkoviak [mailto:[EMAIL PROTECTED]
Sent: Wed 20-4-2005 18:12
To: java-user@lucene.apache.org
Subject: Re: What is going on with subversion.
IMHO QueryParser.DEFAULT_OPERATOR_AND and
QueryParser.DEFAULT_OPERATOR_OR should be
Hi,
> Also this analyzer is not used in any application, I
> wrote it only to
> measure search speed.
So you don't use the method you described for your
wildcard search trick?
Thanks,
Aalap.
-
To unsubscribe, e-mail: [EMAIL PR
On Wednesday 20 April 2005 14:04, Barbara Krausz wrote:
> Hi,
> currently I'm writing my bachelor's thesis about Lucene. I searched for
> theoretical information for example about the IR-model Lucene uses, but
> I couldn't find anything so I had to figure it out on my own.
> I think Lucene uses the
IMHO QueryParser.DEFAULT_OPERATOR_AND and
QueryParser.DEFAULT_OPERATOR_OR should be used instead of QueryParser.AND
and QueryParser.OR
Peter Veentjer - Anchor Men wrote:
package com.jph.lucene.parsers;
/**
* Copyright 2004 The Apache Software Foundation
*
* Licensed under the Apache License, Versio
On Apr 20, 2005, at 9:47 AM, Peter Veentjer - Anchor Men wrote:
I guess and hope the original
MultifieldQueryParser in Lucene 2.0 will be of better design.
MFQP in Subversion is what you'll get unless someone supplies patches
to improve it.
Erik
--
The MultiFieldQueryParser is of terrible design. It looks like it
extends the QueryParser, but it doesn't. There are only a few static
methods that restrict the functionality of the QueryParser a lot. That
is why I have created this util class, that does exactly the same job
and has a few extra fea
thanks.
Peter Veentjer - Anchor Men wrote:
package com.jph.lucene.parsers;
/**
* Copyright 2004 The Apache Software Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the Licens
package com.jph.lucene.parsers;
/**
* Copyright 2004 The Apache Software Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/
The problem with this approach is that the Analyzer you will use for indexing
will be *very* different from the one used for searching.
The way I see it, the Document objects passed to Lucene should contain fields
that are as much text based as possible, comparable to what a user would type
whi
Sorry, I've already read about the servers moving.
Can somebody mail me the latest MultiFieldQueryParser.java and the
highlighting source code? I can't get them from Subversion and I
need them urgently.
Thanks in advance.
Regards,
Volodymyr Bychkoviak
Volodymyr Bychkoviak wrote:
I can't connect svn.ap
I can't connect to svn.apache.org. It seems that apache.org is down.
Hi,
currently I'm writing my bachelor's thesis about Lucene. I searched for
theoretical information for example about the IR-model Lucene uses, but
I couldn't find anything so I had to figure it out on my own.
I think Lucene uses the vector space model with a variation of the
cosine measure (cosine
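For reference, the plain textbook cosine measure between two term-weight vectors can be sketched as below. This is only the standard formula; it makes no claim about how Lucene actually scores documents, and CosineSketch is a hypothetical name.

```java
import java.util.Map;

// Illustrative only: textbook cosine similarity, not Lucene's scoring code.
final class CosineSketch {

    // Cosine of the angle between two sparse vectors keyed by term.
    // Returns 0.0 if either vector has zero length.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }
}
```

Two vectors with the same direction give 1, and vectors with no terms in common give 0.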
Aalap Parikh wrote:
Hi Volodymyr,
About the trick you described about wildcard search
replacement, you mentioned:
So I found the following workaround. I index this field
as a sequence of terms, each containing a single
digit from the needed value. (For example I have
“123214213” value
that n
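The quoted message is cut off, so the rest of the trick is not shown here. As a rough sketch of the idea as far as it is described: the value is split into one term per digit, and, as an assumption on my part, a "contains these digits" search then becomes a match on consecutive terms rather than a wildcard scan. The code is plain Java, not Lucene, and DigitTermsSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: a hypothetical helper, not Lucene code.
final class DigitTermsSketch {

    // Split a value into one term per character, e.g. "123" -> [1, 2, 3].
    static List<String> toDigitTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < value.length(); i++) {
            terms.add(String.valueOf(value.charAt(i)));
        }
        return terms;
    }

    // True if the digits of 'fragment' occur as consecutive terms of 'value'.
    // Assumption: this stands in for a phrase-style match over indexed terms.
    static boolean containsDigits(String value, String fragment) {
        return Collections.indexOfSubList(
                toDigitTerms(value), toDigitTerms(fragment)) >= 0;
    }
}
```

With the value from the message, containsDigits("123214213", "321") is true and containsDigits("123214213", "555") is false.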
> It looks to me that if I do get an IOException, I will then have to perform a
> number of additional checks to eliminate the other possible causes of
> IOExceptions (such as permissions issues), and by a process of elimination,
> determine a corrupt index.
Slightly off-topic:
That's exactly
Hi,
The best way to determine bottlenecks is profiling. (JProfiler is a very
good tool for that. It's a commercial product with a free evaluation.)
I was indexing 1.5 million documents in 45 minutes.
Before optimizing, it took much more time to index. Optimization was done
through 'select' query changin
On Wednesday 20 Apr 2005 08:27, Maik Schreiber wrote:
> > As the index is rather critical to my program, I just wanted to make it
> > really robust, and able to cope should a problem occur with the index
> > itself. Otherwise, the user will be left with a non-functioning program
> > with no explana
> As the index is rather critical to my program, I just wanted to make it
> really
> robust, and able to cope should a problem occur with the index itself.
> Otherwise, the user will be left with a non-functioning program with no
> explanation. That's my reasoning anyway.
You should perhaps go
On Tuesday 19 Apr 2005 22:37, Daniel Herlitz wrote:
> I would suggest you simply do not create unusable indexes. :-)
I agree! :) I am obviously very confident that my application is building
indexes correctly. I'm thinking of the rarer instances whereby user or system
error has caused a proble