Gaurav,
If you go to http://lucene.apache.org/ you will see a Tika tab there. It's
OSS. LIUS is either a part of Tika or is about to become a part of it.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Gaurav Sharma <[EMAIL PROTECTED]>
>
Hi,
I think you mentioned 225GB of data somewhere.
You can open IndexReaders "on demand", but that's not a cheap operation, especially
not with so much data. You want to keep your IndexReaders open for a while.
Multiple requests/threads can share them.
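
The sharing pattern described above could be sketched roughly as follows. This is plain Java with no Lucene calls: ExpensiveReader is a hypothetical stand-in for an IndexReader, and its search method is only a placeholder. The point is that the costly open happens once and every request thread reuses the same instance.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive-to-open index reader (NOT a Lucene class).
class ExpensiveReader {
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();

    ExpensiveReader() {
        OPEN_COUNT.incrementAndGet(); // pretend this is the costly open
    }

    // Placeholder for a read-only search; returns the term length only.
    int search(String term) {
        return term.length();
    }
}

class SharedReaderDemo {
    // Lazy holder idiom: the reader is opened once, on first use, in a thread-safe way.
    private static class Holder {
        static final ExpensiveReader READER = new ExpensiveReader();
    }

    static ExpensiveReader reader() {
        return Holder.READER;
    }

    // Runs the given number of searches on a small thread pool and
    // returns how many times the reader has been opened in total.
    static int runRequests(int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> reader().search("lucene"));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ExpensiveReader.OPEN_COUNT.get();
    }

    public static void main(String[] args) {
        System.out.println("reader opened " + runRequests(100) + " time(s)");
    }
}
```

A real setup would also need some policy for reopening the reader when the index changes; that is left out of this sketch.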
Otis
--
Sematext -- http://sematext.com/ --
You should look into SecondString perhaps then, like Grant said.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Sangrish <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Friday, June 20, 2008 1:45:52 PM
> Subject: Re: Arbitrary
Correct, adding new syntax to the parser currently requires editing the
grammar.
Something else you might consider is that if you expect "BESTOF" type
queries to be the default behavior people want, you could just override
the getBooleanQuery method of the QueryParser and *always* generate a
Anshum wrote:
Hey Andrzej,
Could you tell me what research suggests this, and why it is this way?
My calculation says the average load on each server would go down as I would
know what server to query for an index term as opposed to querying all
servers for terms.
I'm looking for a solution
Is there any way I can find an example of a program using NGramSpeller.java?
--
View this message in context:
http://www.nabble.com/Example-using-NGramSpeller.java-tp18034945p18034945.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---
Yes, "MoreLikeThis" is more like what I want.
But there's one problem. Even here one has to run the query against an
indexed set of documents,
whereas I would like to create two queries through "MoreLikeThis" and get a
score of how similar they are to each other.
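
One index-free way to get such a score, sketched here as an assumption about what is wanted, is cosine similarity over term-frequency vectors. This is not MoreLikeThis and uses no Lucene classes; the TermCosine class, its tokenization rule (lowercase, split on anything that is not a letter or digit), and the raw term-frequency weighting are all illustrative choices.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
class TermCosine {
    // Lowercase the text, split on runs of non-alphanumeric characters,
    // and count how often each token occurs.
    static Map<String, Integer> termFreq(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String tok : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!tok.isEmpty()) {
                tf.merge(tok, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Cosine of the angle between the two term-frequency vectors:
    // 0.0 when no terms are shared, close to 1.0 for identical term distributions.
    static double cosine(String a, String b) {
        Map<String, Integer> ta = termFreq(a);
        Map<String, Integer> tb = termFreq(b);
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            int fa = e.getValue();
            normA += fa * fa;
            Integer fb = tb.get(e.getKey());
            if (fb != null) {
                dot += fa * fb;
            }
        }
        for (int fb : tb.values()) {
            normB += fb * fb;
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // an empty text has no direction to compare
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.println(cosine("lucene search index", "lucene index merge"));
    }
}
```

Raw term counts ignore how rare a term is across a collection, which is exactly the information an index would supply, so scores from this sketch are only a rough proxy.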
Siddharth
Otis Gospodnet
A couple of questions:
1> I assume by "not returning any docs" you mean that you
never get into your while loop. Is that true?
2> I'm a little suspicious of the field labeled "id" and whether
it's at all possible that this is getting confused with the
internal Lucene doc ID. This is a
Hello there! I'm trying to query for a specific document in an efficient way.
My index is structured in a way where I have an id field which is a unique
key for the whole index. When I'm updating/removing a document I was
searching for my id using a Searcher and a TermQuery. But reading the list
it se
U, have you tried reading any of the info on the home page? See:
http://lucene.apache.org/java/2_3_2/gettingstarted.html
I'd also recommend "Lucene in Action"
Best
Erick
On Fri, Jun 20, 2008 at 10:58 AM, jnance <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I am new to Lucene. I have several text
I'm using this to build a static index of documents and terms. A
snapshot requested for further client (third party) analysis.
regards,
Gerardo
Erick Erickson wrote:
What's the high-level goal here? The reason I ask is that
I'm not sure what *use* these scores are to you. Perhaps
someone will
Hi,
I am new to Lucene. I have several text files I would like to index and
search. How do I do this?
Thanks,
jnance
--
View this message in context:
http://www.nabble.com/Indexing-and-searching-txt-files-tp18031330p18031330.html
Sent from the Lucene - Java Users mailing list archive at Nabbl
I think you can use Solr to solve it.
You just merge your search results from 2 Solr instances (2 indexes).
It is very simple and you can distribute it.
On Wed, Jun 18, 2008 at 9:12 PM, Anshum <[EMAIL PROTECTED]> wrote:
> I have 2 indexes and I would like to move index for a few 'selected' and
> 'specifie
Hey Andrzej,
Could you tell me what research suggests this, and why it is this way?
My calculation says the average load on each server would go down as I would
know what server to query for an index term as opposed to querying all
servers for terms.
I'm looking for a solution wherein I could
Hey Otis,
Could you suggest a few good distributed (lucene) search solutions? (Open
Source)
Yes, I do want to split by terms as the math tells a story. :)
TF IDF would be handled separately. I'd just use a different cluster of
machines to store the index instead of having the search run on the sam
Otis Gospodnetic wrote:
Hi,
Not doable with Lucene as far as I know. I'm not even certain you
would want to split by term. What would that do to TF-IDF in your
distributed search? What's wrong with splitting at the doc level?
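
As a rough illustration of why doc-level splitting keeps scoring tractable: when each shard holds whole documents, a term's global document frequency is just the sum of the per-shard frequencies, so a global IDF can be recomputed from per-shard statistics. The sketch below assumes the classic Lucene-style formula 1 + ln(numDocs / (docFreq + 1)); the ShardIdf class and its method names are hypothetical and it calls no Lucene code.

```java
// Hypothetical helper, not part of Lucene.
class ShardIdf {
    // Assumed formula: classic Lucene-style idf, 1 + ln(numDocs / (docFreq + 1)).
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // With document-level sharding, global statistics are plain sums
    // of the per-shard document frequency and per-shard document count.
    static double globalIdf(long[] shardDocFreq, long[] shardNumDocs) {
        long docFreq = 0;
        long numDocs = 0;
        for (long df : shardDocFreq) {
            docFreq += df;
        }
        for (long n : shardNumDocs) {
            numDocs += n;
        }
        return idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        // A term found in 9 of 10 docs on shard A and in 0 of 10 docs on shard B.
        long[] df = {9, 0};
        long[] n = {10, 10};
        System.out.println("shard A local idf: " + idf(df[0], n[0]));
        System.out.println("shard B local idf: " + idf(df[1], n[1]));
        System.out.println("global idf:        " + globalIdf(df, n));
    }
}
```

The local values differ from the global one, so shards that score with local statistics alone would rank the same term differently; summing the statistics first avoids that.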
There are about half a dozen distributed (Lucene) search solutions
floa