To add more information:
I then want to search this field using part or all of the path,
using wildcards,
e.g. search category_path with /Top/My Prods*
Hi java-users
I need some help.
I am indexing categories into a single field category_path
If there is a phrase in the search, the highlighter highlights every word
separately, like this:
<b>I</b> <b>love</b> <b>Lucene</b>
Instead, what I want is this:
<b>I love Lucene</b>
Is there a way to ask Lucene to do this? I know we could ask CSS or jQuery to
do the task, but what's the point, right?
So, is
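The Highlighter's default SimpleHTMLFormatter wraps each matched term in its own tag, which is why every word is highlighted separately. One pragmatic approach, sketched below as a hypothetical post-processing helper (not a Lucene API), is to merge tag boundaries that are separated only by whitespace so a matched phrase reads as one highlighted run:

```java
// Hypothetical post-processing step: merge adjacent <b>..</b> tags emitted
// per term, so a phrase match reads as one highlighted run.
public class HighlightMerger {
    // Collapses "</b> <b>" boundaries (separated only by whitespace),
    // keeping the whitespace between the merged terms.
    public static String mergeAdjacent(String highlighted) {
        return highlighted.replaceAll("</b>(\\s+)<b>", "$1");
    }

    public static void main(String[] args) {
        String perTerm = "<b>I</b> <b>love</b> <b>Lucene</b>";
        System.out.println(mergeAdjacent(perTerm)); // <b>I love Lucene</b>
    }
}
```

A custom Formatter passed to the Highlighter could achieve the same effect without a second pass, but a regex over the final fragment is the least invasive change.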
Dear Mike:
Thanks for your help, and apologies for the delayed reply.
Yes, the exception still occurs in 3.1, and the index is now being rebuilt for
3.1.
In the index, only the term '\1' has a payload. If the search switches to other
terms, the exception isn't raised.
If you can send me a new recent co
We are using version 3.0.3. So you can confirm that closing the writer (and
the reader created from that writer) should be enough to release the file
handles?
If that is the case, that means our application has a bug somewhere that I
need to track down.
Thanks,
JB.
On 5 April 2011 19:48, Ian Lea
Hi Shambhu,
ShingleFilter will construct word n-grams:
http://lucene.apache.org/java/3_1_0/api/contrib-analyzers/org/apache/lucene/analysis/shingle/ShingleFilter.html
Steve
> -Original Message-
> From: sham singh [mailto:shamsing...@gmail.com]
> Sent: Tuesday, April 05, 2011 5:53 PM
> T
Hi All,
I have to do tokenization that is a combination of NGram and Standard
tokenization.
For example, if the content is: "the quick brown fox jumped over the lazy dog"
the requirement is to tokenize it into:
quick brown fox
brown fox jumped
fox jumped over
etc.
Please help me find the best analyzer.
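ShingleFilter (linked in the reply above) produces exactly this kind of word n-gram. As a plain-Java sketch of the token stream it would emit after whitespace tokenization (ignoring stop-word handling for simplicity; this is not the real filter):

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the word trigrams ("shingles") that Lucene's
// ShingleFilter would emit, ignoring stop words for simplicity.
public class ShingleSketch {
    public static List<String> shingles(String text, int size) {
        String[] words = text.split("\\s+");
        List<String> out = new ArrayList<String>();
        for (int i = 0; i + size <= words.length; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < size; j++) {
                if (j > 0) sb.append(' ');
                sb.append(words[i + j]);
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(shingles("the quick brown fox jumped over the lazy dog", 3));
    }
}
```

In the real analyzer chain you would wrap a StandardTokenizer with a ShingleFilter configured for the desired shingle size; add a stop filter first if "the quick brown" should be excluded.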
Hi java-users,
I need some help. I am indexing categories into a single field, category_path,
which may contain items such as:
/Top/Books,/Top/My Prods/Book Prods/Text Books,/Maths/Books/TextBooks
i.e. category paths delimited by ','.
I want to store this field, so the Analyser tokenizes the document
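The matching behaviour being asked for (part-of-path search with a trailing wildcard) is what a PrefixQuery or trailing-* WildcardQuery gives against paths indexed untokenized, e.g. with KeywordAnalyzer. A plain-Java sketch of that matching, with the comma-delimited field split into individual paths first (the split-on-',' is an assumption from the message above, not a Lucene feature):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the matching you would get from a PrefixQuery (or trailing-*
// WildcardQuery) against category_path values indexed untokenized,
// after splitting the comma-delimited field into individual paths.
public class CategoryPathMatch {
    public static List<String> matchPrefix(String field, String prefix) {
        List<String> hits = new ArrayList<String>();
        for (String path : field.split(",")) {
            if (path.startsWith(prefix)) hits.add(path);
        }
        return hits;
    }

    public static void main(String[] args) {
        String categoryPath =
            "/Top/Books,/Top/My Prods/Book Prods/Text Books,/Maths/Books/TextBooks";
        System.out.println(matchPrefix(categoryPath, "/Top/My Prods"));
    }
}
```

In Lucene itself, indexing each path as a separate untokenized field value (rather than one comma-joined string) would let the query layer do this matching directly.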
Try 1) reducing the RAM buffer of your IndexWriter
(IndexWriter.setRAMBufferSizeMB), 2) using a term divisor when opening
your reader (pass 2 or 3 or 4 as termInfosIndexDivisor when opening
IndexReader), and 3) disabling norms or not indexing as many fields as
possible.
70Mb is not that much RAM t
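For suggestion 1), a back-of-the-envelope sizing helper can pick a RAM buffer that leaves headroom in a small heap. The 1/8 fraction below is an illustrative assumption, not a Lucene recommendation:

```java
// Back-of-the-envelope sizing: pick an IndexWriter RAM buffer that leaves
// headroom in a small heap. The 1/8 fraction is an assumption for
// illustration, not a Lucene recommendation.
public class RamBufferSizing {
    public static int ramBufferMb(long maxHeapBytes) {
        int heapMb = (int) (maxHeapBytes / (1024 * 1024));
        return Math.max(1, heapMb / 8);
    }

    public static void main(String[] args) {
        // e.g. a 70MB heap -> pass 8 to IndexWriter.setRAMBufferSizeMB(...)
        System.out.println(ramBufferMb(70L * 1024 * 1024));
    }
}
```

In a real application you would read the heap limit via Runtime.getRuntime().maxMemory() and pass the result to IndexWriter.setRAMBufferSizeMB.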
Hi,
I'm indexing a time stamp for every document, using a NumericField.
Was wondering if there's a correct way to find the first document newer than
a specific date (say 3 weeks ago).
I know I can perform a range search for a range starting 3 weeks ago and
ending now, but was wondering if there i
Hi,
I am using Lucene 2.9.4 with FSDirectory.
My index has 80 thousand documents (each document has 12 fields).
My JVM has 70MB of RAM (limited by my hosting).
I am getting various OutOfMemoryErrors.
I ran jmap and I got:
num #instances #bytes Class description
-
This (HashDocSet, and any other impls that handle the sparse case
well) could be useful to have in Lucene's core.
For example, for certain MultiTermQuerys we have this
CONSTANT_SCORE_AUTO_REWRITE, which has iffy smelling heuristics to try
to determine the best cutover point from
ConstantScoreQuer
On Tue, Apr 5, 2011 at 2:24 AM, Antony Bowesman wrote:
> Seems like SortedVIntList can be used to store the info, but it has no
> methods to build the list in the first place, requiring an array or bitset
> in the constructor.
It has a constructor that takes DocIdSetIterator - so you can pass an
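The memory win behind SortedVIntList is the encoding itself: gaps between sorted doc ids, each written as a variable-length int. A plain-Java sketch of that storage scheme (not the Lucene class, just the idea):

```java
import java.io.ByteArrayOutputStream;

// Sketch of what SortedVIntList stores: gaps between sorted doc ids, each
// gap written as a variable-length int (7 bits per byte, high bit set while
// more bytes follow). Small gaps -> ~1 byte per doc, hence the memory win
// for sparse doc sets.
public class VIntGaps {
    public static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int doc : sortedDocIds) {
            int gap = doc - prev;
            prev = doc;
            while ((gap & ~0x7F) != 0) {   // more than 7 bits remain
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Five nearby docs fit in five bytes, vs. 20 bytes as plain ints.
        System.out.println(encode(new int[] {3, 5, 9, 12, 13}).length);
    }
}
```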
I think Solr has a HashDocSet implementation?
On Tue, Apr 5, 2011 at 3:19 AM, Michael McCandless
wrote:
> Can we simply factor out (poach!) those useful-sounding classes from
> Nutch into Lucene?
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Tue, Apr 5, 2011 at 2:24 AM, Antony Bowesman
> w
On Tue, Apr 5, 2011 at 10:06 AM, Shai Erera wrote:
> Can we use TermEnum to skip to the first term 'after 3 weeks'? If so, we can
> pull the first doc that appears in the TermDocs of that Term (if it's a
> valid term).
Yep. Try this to get the term you want to use to seek:
BytesRef term
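The seek works because numeric fields are indexed as prefix-coded terms that sort in value order, so "first doc newer than T" becomes "seek the term enum to the first term >= encode(T)". A self-contained sketch, using zero-padded hex as a stand-in for Lucene's NumericUtils prefix coding (both preserve numeric order lexicographically for non-negative values) and a binary search as a stand-in for the term enum's seek:

```java
import java.util.Arrays;

// Sketch of the seek-to-first-term idea. Zero-padded hex stands in for
// Lucene's NumericUtils prefix coding; Arrays.binarySearch stands in for
// the term enum's seek.
public class SeekFirstAfter {
    public static String encode(long timestamp) {
        return String.format("%016x", timestamp);
    }

    // Index of the first sorted term >= the encoded cutoff, or -1 if none.
    public static int seekCeil(String[] sortedTerms, long cutoff) {
        int i = Arrays.binarySearch(sortedTerms, encode(cutoff));
        if (i < 0) i = -i - 1;                    // insertion point
        return i < sortedTerms.length ? i : -1;
    }

    public static void main(String[] args) {
        String[] terms = { encode(100L), encode(2000L), encode(30000L) };
        System.out.println(seekCeil(terms, 1500L)); // lands on encode(2000)
    }
}
```

From the term found this way, the first doc in its postings (TermDocs) is the answer, as the message above describes.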
Hi
We have a date field which is indexed as NumericField and we'd like to
get the first docid that is since 3 weeks ago. Currently we're doing
something like this:
{code}
Query q = NumericRangeQuery.newLongRange("date", timeBefore3Weeks,
System.currentTimeMillis(), true, true);
Scorer s = q.weigh
You don't say exactly how you are dealing with the concurrent access
(one shared Reader/Searcher? Each user with own Reader/Searcher?
Something else?) but the underlying problem is that the reader has
been closed while something else is still using it. This can easily
happen in a multi-threaded se
Hi Tanuj,
Can you be more specific?
What file did you download? (Lucene 3.1 has three downloadable packages:
-src.tar.gz, .tar.gz, and .zip.)
What did you expect to find that is not there? (Some examples would help.)
Steve
> -Original Message-
> From: Tanuj Jain [mailto:tanujjain.
Hi,
I have downloaded Lucene 3.1 and want to use it in my program.
I found a lot of files that differ from or are missing in Lucene 3.0. Is there
any way I could get those files as a whole rather than searching for each file
and downloading it?
Hi
My application is clustered on JBoss application servers and the Lucene
directory is shared.
Five users concurrently access the same Lucene directory to search documents.
At that time I got the exception below:
org.apache.lucene.store.AlreadyClosedException: this IndexReader is
closed
is there a way
Can we simply factor out (poach!) those useful-sounding classes from
Nutch into Lucene?
Mike
http://blog.mikemccandless.com
On Tue, Apr 5, 2011 at 2:24 AM, Antony Bowesman wrote:
> I'm converting a Lucene 2.3.2 to 2.4.1 (with a view to going to 2.9.4).
>
> Many of our indexes are 5M+ Documents,
Which version of lucene are you using? Something changed in the 3.x
release, and maybe 2.9.x, in the way that old file handles are closed.
Previously it wasn't always necessary to explicitly close everything,
now it is.
Your usage sounds fine to me. When I've hit too many open files when
using
Yeah, that mergeFactor is way too high and will cause
too-many-open-files (if the index has enough segments).
Also, you should setRamBufferSizeMB instead of maxBufferedDocs, for
faster index throughput.
Calling optimize from two threads doesn't help it run faster when
using ConcurrentMergeSchedul
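The arithmetic behind "way too high": with a non-compound index each segment is roughly ten files, and a large mergeFactor lets many segments accumulate before a merge fires, so open handles scale with both. The numbers below are illustrative assumptions, not exact Lucene accounting:

```java
// Rough arithmetic behind the too-many-open-files warning. filesPerSegment
// of ~10 assumes a non-compound index; both numbers are illustrative.
public class OpenFileEstimate {
    public static int estimate(int segments, int filesPerSegment) {
        return segments * filesPerSegment;
    }

    public static void main(String[] args) {
        // Default mergeFactor 10 vs. something like 1000 segments piling up:
        System.out.println(estimate(10, 10));   // ~100 handles
        System.out.println(estimate(1000, 10)); // ~10000 handles
    }
}
```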
Can you use PerFieldAnalyzerWrapper? That would be the normal way to
approach this, specifying a different, synonym aware, analyzer for the
relevant field(s).
--
Ian.
On Mon, Apr 4, 2011 at 11:31 PM, Christopher Condit wrote:
> I need to add synonyms to an index depending on the field being in
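The dispatch PerFieldAnalyzerWrapper performs is simple: a default analyzer plus per-field overrides, so only the relevant field(s) get the synonym-aware analyzer. A plain-Java sketch of that behaviour, with analyzer instances stood in by strings (not the Lucene class itself):

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of PerFieldAnalyzerWrapper's dispatch: a default
// analyzer plus per-field overrides. Analyzers are stand-in strings here.
public class PerFieldDispatch {
    private final String defaultAnalyzer;
    private final Map<String, String> overrides = new HashMap<String, String>();

    public PerFieldDispatch(String defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    public void addAnalyzer(String field, String analyzer) {
        overrides.put(field, analyzer);
    }

    public String analyzerFor(String field) {
        String a = overrides.get(field);
        return a != null ? a : defaultAnalyzer;
    }

    public static void main(String[] args) {
        PerFieldDispatch wrapper = new PerFieldDispatch("StandardAnalyzer");
        wrapper.addAnalyzer("body", "SynonymAnalyzer");
        System.out.println(wrapper.analyzerFor("body"));  // SynonymAnalyzer
        System.out.println(wrapper.analyzerFor("title")); // StandardAnalyzer
    }
}
```

The real PerFieldAnalyzerWrapper is constructed the same way: a default Analyzer, then addAnalyzer(fieldName, analyzer) for each override.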
Hi all,
I have been looking for information about this and found a few things here
and there but nothing very clear on when files are opened and closed by
Lucene.
We have an application that uses Lucene quite heavily in the following
fashion: there are multiple indexes in use at all times. For ea