Hi all,
Indexing Hebrew texts for later retrieval is not a trivial task. Of all
languages, Hebrew seems to be the toughest to handle. Although several
solutions exist, they do not necessarily provide the best results in
terms of relevancy. Either way, there is no freely available solution
allowi
> I need to index HTML documents and one of the requirements is to highlight
> documents while maintaining all of the original formatting. The documents
> are relatively simple HTML, meaning no JavaScript code that changes elements
> at runtime or too fancy CSS styling.
>
> I think it should
Marketing blurb below. My personal hype here... I'm going to be
showcasing a straightforward document search engine, from files
through indexing to a usable user interface, in no time. Come check
it out. There'll be similarities to my EuroCon presentation[1],
though this will be an ent
Hi,
I need to index HTML documents and one of the requirements is to highlight
documents while maintaining all of the original formatting. The documents
are relatively simple HTML, meaning no JavaScript code that changes elements
at runtime or too fancy CSS styling.
I think it should be possible t
I'm looking for some clarification on the use of field cache in a real time
index situation.
We are using Lucene in a real time fashion, but we update our reader via
IndexReader.reopen() rather than using IndexWriter.getReader(). After
opening a new reader the old reader is closed.
In the
Yes it does,
The key bit is the part with the termAttribute...
thanks a lot,
cheers,
Aad
On Mon, Jun 7, 2010 at 4:53 PM, Simon Willnauer
wrote:
> Hey there,
> in Lucene 3.0 / 2.9 the Token class has been removed / replaced with an
> Attribute-based API. A TokenStream operates on Attributes it ha
Hey there,
in Lucene 3.0 / 2.9 the Token class has been removed / replaced with an
Attribute-based API. A TokenStream operates on Attributes it has
declared which are eventually accessed by the IndexWriter to create
the inverted index. There are Attributes like TermAttribute,
PositionIncrementAttribu
Hi All,
We are mixing Lucene with a commercial service giving us all kinds of
synonyms. We add these synonyms to the index and we can search with
them. The problem we have is 'highlighting' the original word when a
synonym is found.
We were thinking along the following approach.
1. Get a term
2.
Hi All,
Years ago we implemented a Lucene solution which we are updating
today, and I am a bit lost on the following.
In Lucene 1.x and 2.x it was possible to add a token in a Filter
simply by returning an extra Token when next() was called. What I
cannot find is an equivalent possibility for
Your understanding is correct. There is no way to just add a new
field to an existing document, or to update one field in an existing
document.
--
Ian.
On Mon, Jun 7, 2010 at 1:46 PM, andynuss wrote:
>
> Hi,
>
> I want to add a rank field to my index with numbers 1 thru 10, and apply a
> boos
Hi,
I want to add a rank field to my index with numbers 1 thru 10, and apply a
boost appropriate for each of the values. One of the other indexed fields
is huge, about 40,000 chars. My understanding is that if I change the new
"rank" field from 1 to 2, the huge field is reindexed. Is there any
>> That's pretty much exactly what I suspected was happening. I've had the same
>> problem myself on another occasion... out of interest is there any way to
>> force the file closed without flushing?
>
>No, IndexOutput has no such method. We could consider adding one...
That sounds useful in ge
On Mon, Jun 7, 2010 at 6:18 AM, Regan Heath
wrote:
>
> That's pretty much exactly what I suspected was happening. I've had the same
> problem myself on another occasion... out of interest is there any way to
> force the file closed without flushing?
No, IndexOutput has no such method. We could
That's pretty much exactly what I suspected was happening. I've had the same
problem myself on another occasion... out of interest is there any way to
force the file closed without flushing? From memory I tried everything I
could think of at the time but couldn't manage it. Best I could do was
This is a bug in how Lucene handles IOException while closing files.
Look at SegmentMerger's sources, for 2.3.2:
https://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_3_2/src/java/org/apache/lucene/index/SegmentMerger.java
Look at the finally clause in mergeTerms:
} finally {
If you don't want to use the ImDisk software, a small flash drive will do
just as well...
Regan Heath wrote:
>
> Windows XP.
>
> The problem occurs on the local file system, but to replicate it more
> easily I am using http://www.ltr-data.se/opencode.html#ImDisk to mount a
> virtual 10mb dis