My index contains a multivalued field like the one below, and I use WhitespaceAnalyzer:
DOC 1 : ITEMNAME: item 2 name
ITEMNAME: movie tickets
ITEMNAME: item 1 name
So when I search for (+ITEMNAME:item +ITEMNAME:movie), it should not match any
document,
since there is no field which has both item
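For illustration only: in Lucene, all values of a multivalued field are indexed under the same field name, so two required term clauses are evaluated against the document as a whole, not against each value separately. The plain-Java sketch below (no Lucene dependency; the class and method names are hypothetical, not Lucene API) contrasts that document-level behaviour with the per-value behaviour the question asks for, using the three ITEMNAME values of DOC 1.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: models matching on a multivalued field.
class MultiValuedMatch {

    // Split on whitespace only, roughly what a whitespace analyzer does (no lowercasing).
    static Set<String> tokens(String value) {
        return new HashSet<>(Arrays.asList(value.trim().split("\\s+")));
    }

    // Document-level AND: each required term may come from a different value.
    // This models how (+ITEMNAME:item +ITEMNAME:movie) is evaluated.
    static boolean matchesAcrossValues(List<String> values, String... required) {
        Set<String> all = new HashSet<>();
        for (String v : values) {
            all.addAll(tokens(v));
        }
        return all.containsAll(Arrays.asList(required));
    }

    // Per-value AND: all required terms must occur inside one single value.
    // This is the behaviour the question expects.
    static boolean matchesWithinOneValue(List<String> values, String... required) {
        for (String v : values) {
            if (tokens(v).containsAll(Arrays.asList(required))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList("item 2 name", "movie tickets", "item 1 name");
        System.out.println(matchesAcrossValues(doc1, "item", "movie"));   // true
        System.out.println(matchesWithinOneValue(doc1, "item", "movie")); // false
    }
}
```

In Lucene itself, one commonly suggested way to approximate the per-value check is to have the analyzer return a large position increment gap between values (Analyzer.getPositionIncrementGap) and then use a phrase or span query whose slop is smaller than that gap. Whether that fits this index is an assumption.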
Hi,
Continuing my question - I now suspect a bug in Lucene 3.0.3, because I ran the
test with Lucene 3.0.0 and it worked okay (no junk files)... could anyone
please confirm?
--- On Mon, 1/10/11, sol myr wrote:
From: sol myr
Subject: Newbie question: optimized files?
To: java-user@lucene.apac
Your problem is more with Tika. Please post in the Tika user group.
If you only want to deal with HTML, it is better to use an HTML parser.
http://www.findbestopensource.com/search/?query=%22html+parser%22
On Tue, Jan 11, 2011 at 7:24 AM, amg qas wrote:
> I have been trying to parse & index different portion
I have been trying to parse & index different portions of an HTML page using
Tika & Lucene. For example, I would like to index text within , ,
, tags
of an HTML page separately and provide a different boost to each of them. I
am using Tika for HTML parsing and creating a Document object with the
appropr
On Mon, Jan 10, 2011 at 7:44 PM, Ryan Aylward wrote:
> We do leverage synonyms but they are not appropriate for this case. We use
> synonyms for words that are truly synonymous for the entire index such as
> "inc" and "incorporated". Those words are always interchangeable. However,
> many of the
> We do leverage synonyms but they are not appropriate for
> this case. We use synonyms for words that are truly
> synonymous for the entire index such as "inc" and
> "incorporated". Those words are always interchangeable.
> However, many of the employer alternate names are only valid
> for a singl
Hi,
I'm new to Lucene (using 3.0.3), and just started to check out the behavior of
the 'optimize()' method (which is quite important for our application).
Could it be that 'optimize' cancels out the 'compoundFile' mode? Or am I doing
something wrong?
Here's my test: I create an indexWriter wi
Thanks for the response.
We do leverage synonyms but they are not appropriate for this case. We use
synonyms for words that are truly synonymous for the entire index such as "inc"
and "incorporated". Those words are always interchangeable. However, many of
the employer alternate names are only
Thanks Raf.
On Sun, Jan 9, 2011 at 1:20 AM, Raf wrote:
> On Sat, Jan 8, 2011 at 7:24 PM, Raavan wrote:
>
> > Also, just for my understanding, is SortedVIntList able to perform some
> > operations such as AND/OR without decompression ?
> >
>
> No, not natively:
>
> http://lucene.apache.org/java/
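To make the "not natively" answer concrete, here is a minimal plain-Java sketch of the general idea behind a delta-plus-VInt encoded sorted list, and of why an AND has to decode values as it walks both lists. It is not Lucene's SortedVIntList code; the class, the cursor, and the method names are hypothetical, and it assumes non-negative ints in ascending order.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene source: sorted ints stored as VInt-encoded deltas.
class VIntDeltaSketch {

    // Encode ascending, non-negative ints as deltas; each delta uses 7 data bits
    // per byte, with the high bit set when more bytes follow.
    static byte[] encode(int[] sorted) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int v : sorted) {
            int delta = v - prev;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = v;
        }
        return out.toByteArray();
    }

    // Sequential decoder: the only way to learn a value is to decode every
    // delta before it and add them up.
    static final class Cursor {
        private final byte[] bytes;
        private int pos = 0;
        int value = 0;

        Cursor(byte[] bytes) {
            this.bytes = bytes;
        }

        boolean next() {
            if (pos >= bytes.length) {
                return false;
            }
            int shift = 0;
            int delta = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            value += delta;
            return true;
        }
    }

    // AND of two encoded lists: a merge-style walk that decodes both sides.
    static List<Integer> intersect(byte[] a, byte[] b) {
        List<Integer> result = new ArrayList<>();
        Cursor ca = new Cursor(a);
        Cursor cb = new Cursor(b);
        boolean hasA = ca.next();
        boolean hasB = cb.next();
        while (hasA && hasB) {
            if (ca.value == cb.value) {
                result.add(ca.value);
                hasA = ca.next();
                hasB = cb.next();
            } else if (ca.value < cb.value) {
                hasA = ca.next();
            } else {
                hasB = cb.next();
            }
        }
        return result;
    }
}
```

The bytes of two such lists cannot be compared directly, because the same value is encoded differently depending on the value that precedes it. That is why the intersection above works on decoded values.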