Hi,
Is it possible to integrate Nutch into MS Search Server via the OpenSearch API?
(MS Search Server supports OpenSearch:
http://www.microsoft.com/enterprisesearch/serverproducts/searchserver/features.aspx
)
I think it should be possible to pass the user query from the MS server to Nutch and
integrate Nutch
Hello Everyone,
I have a query regarding Spell Checker. I created the spell index using the
following code
SpellChecker spellChecker = new SpellChecker(spellDir);
spellChecker.indexDictionary(dictionary);
This works perfectly. But is there any way in which I can dynamically add
records to the sp
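For what it's worth, one way this is often handled, sketched on the assumption that a main index and the field/file names below exist: indexDictionary can simply be called again later on the same spell index, and as far as I recall words that are already present are skipped.

// Initial build from a word list (file name is just an example).
SpellChecker spellChecker = new SpellChecker(spellDir);
spellChecker.indexDictionary(new PlainTextDictionary(new File("base-words.txt")));
// Later, fold in terms from a freshly updated main index.
IndexReader reader = IndexReader.open(mainIndexDir);
spellChecker.indexDictionary(new LuceneDictionary(reader, "contents"));
reader.close();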
Hi guys,
A problem is confusing me. I would like to index some data from a
table in a database, but while I am creating the index on this table, the searching
job keeps going. How can I work this out?
By the way, the table holds around one hundred million rows.
--
Best Regards
Cooper Geng
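If the goal is to keep searching available while the index is being built, a rough sketch of the indexing side, assuming a JDBC source; the table, column names, and batch size are invented. Readers opened before or during the run keep searching the last committed state of the index, so search can continue while the writer runs.

// Stream rows from the database into the index in bounded batches.
IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), true);
writer.setMaxBufferedDocs(10000);   // flush a segment every 10,000 documents
ResultSet rs = stmt.executeQuery("SELECT id, title, body FROM articles");
while (rs.next()) {
    Document doc = new Document();
    doc.add(new Field("id", rs.getString("id"), Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("title", rs.getString("title"), Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("body", rs.getString("body"), Field.Store.NO, Field.Index.TOKENIZED));
    writer.addDocument(doc);
}
writer.optimize();
writer.close();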
Let me qualify my question:
Sort is not working for a field that I stored:
document.add(new Field(FIELD_RECEIVED,
                       DateTools.timeToString(System.currentTimeMillis(),
                                              DateTools.Resolution.SECOND),
                       Field.Store.NO, Field.Index.UN_TOKENIZED));
using
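For reference, a sketch of how a sort on that field would usually be requested at search time; the query variable and the reverse flag are assumptions for the example.

Sort sort = new Sort(new SortField(FIELD_RECEIVED, SortField.STRING, true));  // newest first
Hits hits = searcher.search(query, sort);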
: Trying config file at path /var/www/.lsearch.conf
: Trying config file at path /usr/local/search/ls2/lsearch.conf
: 0[main] INFO org.wikimedia.lsearch.util.UnicodeDecomposer - Loaded
unicode
: decomposer
: java.rmi.ConnectIOException: non-JRMP server at remote endpoint
: at sun.rmi.tran
I think you're right, and that's not the only place... the whole handling of
maxDocBytesToAnalyze in the main Highlighter class shares this issue. I
guess the idea is an ASCII holdover, one byte equals one char? I am sure
Mark H can clear it up, but don't forget the maxDocBytesToAnalyze part
as well.
I was looking at the SimpleFragmenter in contrib/Highlighter and was
wondering about the fragmentSize value. It says the value is the
number of bytes, but looking at the code it's using the String offset,
right? So it should be the number of characters, right?
I can fix it, just wanted to
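For context, a sketch of where both settings appear in typical Highlighter usage; the sizes are arbitrary, and the byte-versus-character question above applies to each of them.

QueryScorer scorer = new QueryScorer(query);
Highlighter highlighter = new Highlighter(scorer);
highlighter.setTextFragmenter(new SimpleFragmenter(100));   // in practice ~100 characters per fragment
highlighter.setMaxDocBytesToAnalyze(50 * 1024);             // same byte/char ambiguity applies here
String fragment = highlighter.getBestFragment(analyzer, "contents", text);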
Hi all,
I just uploaded Lucene 2.3 RC3 to:
http://people.apache.org/~buschmi/staging_area/lucene_2_3/
RC3 fixes a problem in the indexer that could cause it to hang after a
disk full exception occurred. (see
https://issues.apache.org/jira/browse/LUCENE-1130 for details).
Please switch to RC3 and
Does the Lucene spell checker have the ability to suggest splitting combined
words? For example, if I have the words "apple" and "computer" in my
index and I type "applecomputer", how can I make it suggest
"apple computer"?
--
View this message in context:
http://www.nabble.com/spell-che
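As far as I know there is nothing built in for this; one possible workaround, assuming the individual words are already in the spell index, is to test each split point with exist():

String input = "applecomputer";
for (int i = 1; i < input.length(); i++) {
    String left = input.substring(0, i);
    String right = input.substring(i);
    if (spellChecker.exist(left) && spellChecker.exist(right)) {
        System.out.println(left + " " + right);   // would print "apple computer"
    }
}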
No problem Erick. Thanks for clarifying it.
Alex
-----Original Message-----
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: Monday, January 14, 2008 12:35 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene sorting case-sensitive by default?
Sorry, I was confused about this for the long
Sorry, I was confused about this for the longest time (and it shows!). You
don't actually have to store two separate fields. Field.Store.YES stores
the input exactly as is, without passing it through anything. So you
really only have to store your field. I still think of it conceptually as
two
enti
Thanks a lot Erik for the great tip! I do need to display all the fields
and allow the users to sort by each field as they wish. My index is
currently about 200 MB.
Your suggestion about storing (but not indexing) the cased version, and
indexing (but not storing) the lower-case version is an excellent
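A sketch of that two-field arrangement, with invented field names:

// Display value: stored exactly as entered, not indexed.
doc.add(new Field("author", value, Field.Store.YES, Field.Index.NO));
// Sort value: lower-cased, indexed untokenized, not stored.
doc.add(new Field("author_sort", value.toLowerCase(), Field.Store.NO, Field.Index.UN_TOKENIZED));
// At search time, sort on the lower-cased field.
Hits hits = searcher.search(query, new Sort(new SortField("author_sort")));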
Several things:
1> do you need to display all the fields? Would just storing them
lower-case work? The only time I've needed to store fields case-
sensitive is when I'm showing them to the user. If the user is just
searching on them, I can store them any way I want and she'll never
know.
2> You m
OK, I think I'm getting a better handle here. I can't imagine how
it would work to combine indexes that use *different* analyzers
on the *same* field. Regardless of what Lucene did, you
simply could NOT explain this to a user. To take a simple example,
index part of your data for field1 with Keywor
Yeah, I think what you need is one Field where you store a list of propertytag
and value combinations, and also to be able to search the field
by value and identify that the value belongs to a particular propertytag.
Something like:
propertytag1, value
propertytag2, value
propertytag3, value etc.
To be fran
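One possible shape for that, assuming each property is flattened into a single "name=value" term in one multi-valued field (the names and values here are invented):

doc.add(new Field("properties", "frequency=quarterly", Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("properties", "region=emea", Field.Store.YES, Field.Index.UN_TOKENIZED));
// Searching for a specific propertytag/value combination:
Query q = new TermQuery(new Term("properties", "frequency=quarterly"));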
Thanks everyone for your replies! Guess I did not fully understand the
meaning of "natural order" in the Lucene Java doc.
To add another all-lower-case field for each sortable field in my index
is a little too much, since the app requires sorting on pretty much all
fields (over 100).
Toke, you me
> Then why would you want to combine them?
>
> I really think you need to explain what you're trying to accomplish
> rather then obsess on the details.
I have to create indexes in parallel because the amount of data is very
high.
Then I want to merge them into bigger indexes and move them to the s
Why am I afraid?
Building the index wouldn't be a problem. I guess.
Querying it would be more difficult.
Let's see.
Custom properties... defined by the user; there is no restriction on quantity
or values.
> > Custom property name: Frequency
> > Custom property value: Quarterly
> >
> > Cus
> You can answer an awful lot of this much faster than waiting for someone
> to reply by getting a copy of Luke and looking at the parse results using
> various analyzers.
Ah cool, you mean the "explain structure" button.
> Try KeywordAnalyzer for your query.
>
> Combine queries programmatica
Then why would you want to combine them?
I really think you need to explain what you're trying to accomplish
rather than obsess over the details.
Erick
On Jan 14, 2008 10:17 AM, <[EMAIL PROTECTED]> wrote:
> > I admit I've never used IndexMergeTool, I've always used
> > IndexWriter.addIndexes and
> I admit I've never used IndexMergeTool, I've always used
> IndexWriter.addIndexes and then execute
> IndexWriter.optimize().
>
> And I've seen no problems. That call takes no
> analyzer.
So you take the first index and add the remaining indexes via addIndexes?
What happens if the indexes were crea
You can answer an awful lot of this much faster than waiting for someone
to reply by getting a copy of Luke and looking at the parse results using
various analyzers.
And you can use query.toString() to see the parsed results as well.
Try KeywordAnalyzer for your query.
Combine queries programmatica
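A small sketch of the query.toString() idea; the field name and query string are just placeholders.

QueryParser standardParser = new QueryParser("contents", new StandardAnalyzer());
QueryParser keywordParser = new QueryParser("contents", new KeywordAnalyzer());
System.out.println(standardParser.parse("Foo-Bar").toString());   // analyzed: lower-cased and split
System.out.println(keywordParser.parse("Foo-Bar").toString());    // kept as a single term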
I admit I've never used IndexMergeTool, I've always used
IndexWriter.addIndexes and then execute
IndexWriter.optimize().
And I've seen no problems. That call takes no
analyzer.
Erick
On Jan 14, 2008 6:12 AM, <[EMAIL PROTECTED]> wrote:
> > See org.apache.lucene.misc.IndexMergeTool
>
> Thank you.
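A sketch of that addIndexes/optimize route, with placeholder directory variables; the analyzer passed to the constructor is required by the API but is not applied while the existing segments are merged.

IndexWriter writer = new IndexWriter(targetDir, new StandardAnalyzer(), true);
writer.addIndexes(new Directory[] { partDir1, partDir2, partDir3 });
writer.optimize();
writer.close();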
I am not sure why you are afraid of adding more fields to the document.
Having 20-30 fields in a document is not a bad thing to do. Do you have any
constraints that limit the number of fields in the document?
On Jan 14, 2008 7:59 AM, Roger Camargo <[EMAIL PROTECTED]> wrote:
> Thanks for ans
Thanks for answering.
It seems that there isn't any other way around it than having every combination of
dimension and level.
The example for the observations of the dimension would be as follows; maybe
it isn't such important information to store, but the type is.
Dimension name: RegionDimensi
> The caution to use the same analyzer at index and query time is,
> in my experience, simply good advice to follow until you are
> familiar enough with how Lucene uses analyzers to keep from
> getting really, really, really confused. Once you understand
> when analyzers are used and how they effec
> > How can I search for fields stored with Field.Index.UN_TOKENIZED?
>
> Use TermQuery.
>
> > Why do I need an analyzer for searching?
>
> Consider a full-text field that will be tokenized removing special
> characters and lowercased, and then a user querying for an uppercase
> word. The
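A minimal sketch of the TermQuery suggestion; the field and value are made up, and the term has to match the indexed value exactly, including case.

Query q = new TermQuery(new Term("docId", "DOC-0042"));
Hits hits = searcher.search(q);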
> OG: again, it depends. If the index you'd get by merging is
> of manageable size, then merge your indices.
OK, this is what I thought.
A single index should be faster than multiple indexes with a MultiSearcher,
right?
But what about the ParallelMultiSearcher? As I understand the docs it
searc
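For reference, a sketch of the multi-index setup being compared, with placeholder directories; ParallelMultiSearcher runs the sub-searches concurrently, while MultiSearcher runs them one after another.

Searchable[] parts = { new IndexSearcher(dir1), new IndexSearcher(dir2) };
Searcher multi = new ParallelMultiSearcher(parts);   // one thread per underlying index
Hits hits = multi.search(query);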
> See org.apache.lucene.misc.IndexMergeTool
Thank you.
But this uses a hardcoded analyzer and deprecated API calls.
How does the chosen analyzer affect the merge process?
Is everything reindexed with this new analyzer again? Does this make sense?
What if the source indexes had other analyzers us
On Fri, 2008-01-11 at 11:40 -0500, Alex Wang wrote:
> Looks like Lucene is separating upper case and lower case while sorting.
As Tom points out, default sorting uses natural order. It's worth noting
that this implies that default sorting does not produce usable results
as soon as you use non-ASCI
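One way around that, sketched with an invented field name and locale, is a Locale-aware SortField, which compares values with a Collator instead of raw String order:

Sort localeSort = new Sort(new SortField("title_sort", Locale.GERMAN));
Hits hits = searcher.search(query, localeSort);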