Are you sure you did not forget to commit your changes? Maybe that's the
reason you see only 32768 documents. There is no such low limit: the number of
documents is limited by Integer.MAX_VALUE, and the limit on the number of terms is much higher...
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://ww
Is there some sort of default limit imposed on the Lucene indexes?
I try to index 50k or 60k documents but when I use Luke to go inside
the index and check the total # of entries indexed, it shows that
there are only 32768 entries.
It seems like some sort of limit ... what should I look at to adjus
I don't know if there is already an analyzer available for this, but you
could use GATE or UIMA for Named Entity Extraction against names and
expand the query to include the extra names that are used synonymously.
You could do this outside Lucene or inline using a custom Lucene
tokenizer that embed
It's great that the requirement is loose...
But I suppose users would ask for more later.
Well, I worked on DBSight, which covers more than just search. It also includes
scheduling indexing, reindexing, and even rendering.
In your case, you just need to specify a SQL and have index up and runnin
Hi,
I would like to build a search system where a search for "Dan" would also
search for "Daniel" and a search for "Will", "William" . Any ideas on how to go
about implementing that? I can think of writing a custom Analyzer that would
map these partial tokens to their full firstname or lastnam
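
A minimal sketch of the expansion asked about here, done outside Lucene in plain Java (9+). The class name NicknameExpander, the two-entry nickname table, and the field name "name" are all made up for illustration; a real table would come from a name dictionary, and the same mapping could instead live inside a custom analyzer. The sketch only rewrites one term into an OR group in the classic query parser syntax.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class NicknameExpander {
    // Hypothetical nickname table (assumption, not from any library).
    private static final Map<String, List<String>> NICKNAMES = Map.of(
        "dan", List.of("daniel"),
        "will", List.of("william"));

    /** Returns the lowercased term followed by any full names it maps to. */
    public static List<String> expand(String term) {
        String key = term.toLowerCase(Locale.ROOT);
        Set<String> out = new LinkedHashSet<>(); // keeps insertion order, drops duplicates
        out.add(key);
        out.addAll(NICKNAMES.getOrDefault(key, List.of()));
        return new ArrayList<>(out);
    }

    /** Builds field:term, or field:(a OR b) when the term has expansions. */
    public static String toQueryString(String field, String term) {
        List<String> terms = expand(term);
        if (terms.size() == 1) {
            return field + ":" + terms.get(0);
        }
        return field + ":(" + String.join(" OR ", terms) + ")";
    }

    public static void main(String[] args) {
        System.out.println(toQueryString("name", "Dan"));  // name:(dan OR daniel)
        System.out.println(toQueryString("name", "Ross")); // name:ross
    }
}
```

Expanding at query time keeps the index unchanged, so the nickname table can be edited without reindexing; expanding at index time would trade that flexibility for simpler queries.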
As I look at the api for SmartChineseAnalyzer it indicates it is for Simplified
Chinese. Has anyone attempted modifying it for Traditional Chinese? Or does
anyone know of any other "smart" analyzer that is geared towards traditional
Chinese?
Thanks,
Ross
Hi,
I am using MultiFieldQueryParser with a custom analyzer for parsing search text.
Now, when I say
MultiFieldQueryParser qp = new MultiFieldQueryParser(Version, new String[]
{"field1", "field2", "field3"}, customAnalyzer);
qp.setDefaultOperator(QueryParser.AND_OPERATOR);
Query query = qp.p
Simplest solution is to wrap your findFeatures.reader in a
SlowMultiReaderWrapper (as the exception suggests).
More performant solution is to change your code to visit the
sequential sub-readers of findFeatures.reader, directly. But if
performance isn't important here, just do the simple solution
https://issues.apache.org/jira/browse/LUCENE-2756.
--
Ian.
On Thu, Mar 24, 2011 at 2:13 PM, Devon H. O'Dell wrote:
> 2011/3/24 Uwe Schindler :
>> Don't use MultiSearcher. Instead create a MultiReader around the separate
>> IndexReaders for each index and pass that MultiReader to a conventional
Care to define huge? There often isn't a "best" solution but in this
case I think I'd vote for the index-per-year approach.
btw with recent versions of lucene you don't need to call optimize()
very often, if at all, although you might want to run it at the
beginning of each year against the previ
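
A rough sketch of how the index-per-year approach could pick which indexes to search, in plain Java with no Lucene calls. The class YearIndexRouter and the directory naming scheme "catalog-<year>" are assumptions made up for this example; the resulting paths would then be opened and searched with whatever reader setup is in use.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class YearIndexRouter {
    private final Path root;       // parent directory holding one index per year
    private final int firstYear;   // oldest year that has an index
    private final int currentYear; // newest year that has an index

    public YearIndexRouter(Path root, int firstYear, int currentYear) {
        this.root = root;
        this.firstYear = firstYear;
        this.currentYear = currentYear;
    }

    /** Index directories covering [fromYear, toYear], newest first, clipped to existing years. */
    public List<Path> indexesFor(int fromYear, int toYear) {
        int lo = Math.max(fromYear, firstYear);
        int hi = Math.min(toYear, currentYear);
        List<Path> out = new ArrayList<>();
        for (int year = hi; year >= lo; year--) {
            // Hypothetical naming scheme: <root>/catalog-<year>
            out.add(root.resolve("catalog-" + year));
        }
        return out;
    }

    /** The common case: only this year's catalog. */
    public List<Path> currentOnly() {
        return indexesFor(currentYear, currentYear);
    }

    public static void main(String[] args) {
        YearIndexRouter router = new YearIndexRouter(Paths.get("indexes"), 2008, 2011);
        System.out.println(router.currentOnly());
        System.out.println(router.indexesFor(2005, 2009));
    }
}
```

With this layout the usual search touches a single small index, and the rare historical search pays the cost of opening the older ones.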
2011/3/24 Uwe Schindler :
> Don't use MultiSearcher. Instead create a MultiReader around the separate
> IndexReaders for each index and pass that MultiReader to a conventional
> IndexSearcher as IndexReader. MultiSearcher is very buggy.
Could you elaborate on this point at all, Uwe? I'm using
Para
Don't use MultiSearcher. Instead create a MultiReader around the separate
IndexReaders for each index and pass that MultiReader to a conventional
IndexSearcher as IndexReader. MultiSearcher is very buggy.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u
Hi,
I need to search a Catalog.
Most users search *this* year's catalog, but on rare occasions they may ask
for old products (from previous years).
I'm trying to select between 2 options:
1) Keep one huge index for all years (where documents have a "year" field,
so I can filter out the current ye
Hi David,
Thanks for your advice; I'll keep it in mind.
Best regards,
François Jurain.
> Message of 22/03/11 at 17:40
> From: "David Causse"
> To: java-user@lucene.apache.org
> Cc:
> Subject: Re: Wanted: a directory of quick-and-(not too)dirty analyzers for
> multi-language RDF.
>
Hey folks,
just a short notice for those who haven't noticed: we have only a
limited number of Early-Bird tickets left and the Early-Bird period
ends on April 7th. If you want to get one of the 30 remaining tickets,
go and get one now here: http://berlinbuzzwords.de/content/tickets
While we are