Hi all,
First, I know there is an FsDirectory class in Nutch-0.9, so
we can access the index on HDFS. But after I tested it, I found that we can
only read the index, not append to or modify it. I think the reason is
the one mentioned in the HDFS file-append issues, am I right?
I see, ok. Thanks to both of you!
On Thu, Aug 21, 2008 at 4:51 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:
>
> Also, the inverted index *will* store positional information (in the *.prx
> files) even if term vectors are not stored.
>
> Mike
>
>
> Yonik Seeley wrote:
>
> On Thu, Aug 21, 20
It was, after all, an XML issue: the servlets creating the content that was
being indexed were not sending UTF-8, but the XML declaration stated the
encoding WAS UTF-8, so it really was not a Lucene issue after all. Thanks for
all the help.
On Thu, Aug 21, 2008 at 6:18 PM, Juan Pablo Morales
<[EMAIL PROTECT
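The failure mode described above can be reproduced without Lucene at all. A minimal plain-Java sketch (the sample string and class name are illustrative, not from the thread): an XML document whose declaration claims UTF-8 but whose bytes were actually written in ISO-8859-1 is rejected or garbled by the parser, while bytes that really are UTF-8 round-trip cleanly.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class XmlDeclMismatch {
    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><t>año</t>";

        // Declared UTF-8, actually sent as ISO-8859-1: the byte 0xF1 for
        // 'ñ' is not valid UTF-8 here, so the parse fails (or garbles).
        byte[] wrongBytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(wrongBytes));
            System.out.println("parsed (possibly garbled)");
        } catch (Exception e) {
            System.out.println("rejected: declared encoding did not match the bytes");
        }

        // Bytes that genuinely are UTF-8 parse cleanly and keep the text intact.
        byte[] rightBytes = xml.getBytes(StandardCharsets.UTF_8);
        String text = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(rightBytes))
                .getDocumentElement().getTextContent();
        System.out.println(text.equals("año")); // true
    }
}
```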
I was wondering if anyone could explain the following weird behavior that I'm
experiencing when boosting BooleanQuery instances:
When I create a TermQuery, add it as a SHOULD clause to a BooleanQuery, and
boost that BooleanQuery, the boost shows up when I run
IndexSearcher.explain(). However, when I add
Also, the inverted index *will* store positional information (in the
*.prx files) even if term vectors are not stored.
Mike
Yonik Seeley wrote:
On Thu, Aug 21, 2008 at 7:20 PM, David Lee <[EMAIL PROTECTED]>
wrote:
Clarification question:
If I don't store term vectors, then I:
-- won't h
On Thu, Aug 21, 2008 at 7:20 PM, David Lee <[EMAIL PROTECTED]> wrote:
> Clarification question:
>
> If I don't store term vectors, then I:
> -- won't have information on the position of matching terms
> -- won't have the term frequency vector
>
> -- but I should still have the frequency of terms
Clarification question:
If I don't store term vectors, then I:
-- won't have information on the position of matching terms
-- won't have the term frequency vector
-- but I should still have the frequency of terms per document in the .frq
file, right?
So what's the difference between the term f
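The distinction being asked about can be sketched with plain Java maps (not Lucene APIs; the structure and sample documents are illustrative): the .frq postings map a term to its per-document frequency, while a term vector maps a document to its per-term frequencies. Same numbers, opposite access paths.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PostingsVsTermVectors {
    public static void main(String[] args) {
        // Two toy documents, already tokenized.
        List<String[]> docs = List.of(
                new String[]{"apache", "lucene", "lucene"},
                new String[]{"lucene", "index"});

        // Inverted-index view: term -> (docId -> frequency), as in the .frq file.
        Map<String, Map<Integer, Integer>> postings = new HashMap<>();
        // Term-vector view: docId -> (term -> frequency), stored per document.
        Map<Integer, Map<String, Integer>> vectors = new HashMap<>();

        for (int docId = 0; docId < docs.size(); docId++) {
            for (String term : docs.get(docId)) {
                postings.computeIfAbsent(term, t -> new HashMap<>())
                        .merge(docId, 1, Integer::sum);
                vectors.computeIfAbsent(docId, d -> new HashMap<>())
                        .merge(term, 1, Integer::sum);
            }
        }

        // Same frequency, reached from opposite directions.
        System.out.println(postings.get("lucene").get(0)); // 2
        System.out.println(vectors.get(0).get("lucene"));  // 2
    }
}
```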
You are right, it does work. I'll look into my example to see where the
difference is.
On Thu, Aug 21, 2008 at 5:30 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> Here's a unit test:
> import junit.framework.TestCase;
> import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
> import org.ap
Here's a unit test:
import junit.framework.TestCase;
import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWr
Never mind; extremely goofy project configuration here and classpath issues
with much older versions. Ignore me!
-Original Message-
From: Jordon Saardchit [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 21, 2008 11:54 AM
To: java-user@lucene.apache.org
Subject: QueryParser Default Operator
T
This may have been answered before, but is there a reason why setting
the default operator on a QueryParser throws a
java.lang.NoSuchFieldError?
QueryParser parser = new QueryParser( "title", new TokenAnalyzerImpl()
);
parser.setDefaultOperator( QueryParser.AND_OPERATOR ); // This line
throws t
On Thu, Aug 21, 2008 at 12:47 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote:
> Hola Juan,
Hi Steve
>
>
> On 08/21/2008 at 1:16 PM, Juan Pablo Morales wrote:
> > I have an index in Spanish and I use Snowball to stem and
> > analyze and it works perfectly. However, I am running into
> > trouble stor
Hola Juan,
On 08/21/2008 at 1:16 PM, Juan Pablo Morales wrote:
> I have an index in Spanish and I use Snowball to stem and
> analyze and it works perfectly. However, I am running into
> trouble storing (not indexing, only storing) words that
> have special characters.
>
> That is, I store the spe
I have an index in Spanish and I use Snowball to stem and analyze and it
works perfectly. However, I am running into trouble storing (not indexing,
only storing) words that have special characters.
That is, I store the special character, but it comes back garbled when I
read it.
To provide an e
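The symptom described above is easy to reproduce in plain Java (the sample string and charset names are illustrative): bytes written in one charset and decoded as another come back garbled, while a consistent round trip preserves the text.

```java
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    public static void main(String[] args) {
        String original = "español";

        // Encode as ISO-8859-1 but decode as UTF-8: the accented
        // character is not valid UTF-8, so it comes back garbled.
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        String garbled = new String(latin1, StandardCharsets.UTF_8);
        System.out.println(garbled.equals(original)); // false

        // Round-tripping with a single charset preserves the text.
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String intact = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(original)); // true
    }
}
```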
On Wed, Aug 20, 2008 at 6:12 PM, Michael McCandless <[EMAIL PROTECTED]
> wrote:
>
> Aditi Goyal wrote:
>
> Thanks Mike. I found the problem.
>> The problem was that I was not converting the value of the fields to UTF-8,
>> and hence, while adding it to the doc, it was getting stored as None.
>> So, when
Just to add to that, as I said before, in my case I found it more useful not
to use UN_TOKENIZED. Instead, I used TOKENIZED with a custom analyzer that
combines the KeywordTokenizer (entire input as a single token) with the
LowerCaseFilter: this way I get the best of both worlds.
public class KeywordLow
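Sketching the effect in plain Java (no Lucene APIs; the class name and sample values are illustrative): KeywordTokenizer plus LowerCaseFilter amounts to treating the whole field value as one lowercased token, so matching becomes case-insensitive while the value is never split into words.

```java
import java.util.Locale;

public class KeywordLowercaseSketch {
    // The effect of KeywordTokenizer + LowerCaseFilter, approximated
    // without Lucene: the entire field value becomes one lowercased token.
    static String analyze(String fieldValue) {
        return fieldValue.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String stored = "Red Hat Linux";

        // An UN_TOKENIZED field keeps the raw value, so case must match exactly.
        System.out.println(stored.equals("red hat linux")); // false

        // The normalized single token matches regardless of query case,
        // and the value is still treated as one unit, not three words.
        System.out.println(analyze(stored).equals(analyze("RED HAT LINUX"))); // true
    }
}
```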