Searching Multivalued fields

2011-01-10 Thread Sailesh
My index contains a multivalued field like the following, and I use WhitespaceAnalyzer. DOC 1 : ITEMNAME: item 2 name ITEMNAME: movie tickets ITEMNAME: item 1 name. So when I search for (+ITEMNAME:item +ITEMNAME:movie), it should not match any document, since there is no field which has both item
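For readers following this thread: Lucene indexes every value of a multivalued field under the same field name, so a plain term query is evaluated against the document as a whole, not against one value at a time. The sketch below is a plain-Java simulation of that difference and does not use the Lucene API; the class and method names (`MultiValuedFieldSketch`, `matchesAcrossValues`, `matchesWithinOneValue`) are made up for illustration, and whitespace splitting stands in for WhitespaceAnalyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MultiValuedFieldSketch {

    // Split on whitespace only; a stand-in for WhitespaceAnalyzer (assumption).
    static List<String> tokenize(String value) {
        return Arrays.asList(value.trim().split("\\s+"));
    }

    // Document-level matching: tokens from all values of the field are pooled,
    // so required terms may come from different values.
    static boolean matchesAcrossValues(List<String> values, String... required) {
        List<String> allTokens = new ArrayList<>();
        for (String value : values) {
            allTokens.addAll(tokenize(value));
        }
        return allTokens.containsAll(Arrays.asList(required));
    }

    // Per-value matching: every required term must occur in the same value.
    // This is the behavior the question asks for.
    static boolean matchesWithinOneValue(List<String> values, String... required) {
        for (String value : values) {
            if (tokenize(value).containsAll(Arrays.asList(required))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The ITEMNAME values of DOC 1 from the message above.
        List<String> itemName = Arrays.asList("item 2 name", "movie tickets", "item 1 name");

        System.out.println(matchesAcrossValues(itemName, "item", "movie"));   // true
        System.out.println(matchesWithinOneValue(itemName, "item", "movie")); // false
        System.out.println(matchesWithinOneValue(itemName, "item", "name"));  // true
    }
}
```

In this simulation the pooled check matches DOC 1 for `item` and `movie`, while the per-value check does not, which is the gap between the observed and the expected result described in the question.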

Re: Newbie question: optimized files?

2011-01-10 Thread sol myr
Hi, Continuing my question - I now suspect a bug in Lucene 3.0.3, because I ran the test with Lucene 3.0.0 and it worked okay (no junk files)... could anyone please confirm? --- On Mon, 1/10/11, sol myr wrote: From: sol myr Subject: Newbie question: optimized files? To: java-user@lucene.apac

Re: How to parse & index different portions of an HTML page using Tika & Lucene ?

2011-01-10 Thread findbestopensource
Your problem is more with Tika. Please post to the Tika user group. If you only need to deal with HTML, it is better to use an HTML parser. http://www.findbestopensource.com/search/?query=%22html+parser%22 On Tue, Jan 11, 2011 at 7:24 AM, amg qas wrote: > I have been trying to parse & index different portion

How to parse & index different portions of an HTML page using Tika & Lucene ?

2011-01-10 Thread amg qas
I have been trying to parse & index different portions of an HTML page using Tika & Lucene. For example, I would like to index text within , , , tags of an HTML page separately and provide a different boost to each of them. I am using Tika for HTML parsing and creating a Document object with the appropr

Re: Creating an index with multiple values for a single field

2011-01-10 Thread Doron Cohen
On Mon, Jan 10, 2011 at 7:44 PM, Ryan Aylward wrote: > We do leverage synonyms but they are not appropriate for this case. We use > synonyms for words that are truly synonymous for the entire index such as > "inc" and "incorporated". Those words are always interchangeable. However, > many of the

RE: Creating an index with multiple values for a single field

2011-01-10 Thread Ahmet Arslan
> We do leverage synonyms but they are not appropriate for > this case. We use synonyms for words that are truly > synonymous for the entire index such as "inc" and > "incorporated". Those words are always interchangeable. > However, many of the employer alternate names are only valid > for a singl

Newbie question: optimized files?

2011-01-10 Thread sol myr
Hi, I'm new to Lucene (using 3.0.3), and just started to check out the behavior of the 'optimize()' method (which is quite important for our application). Could it be that 'optimize' cancels out the 'compoundFile' mode? Or am I doing something wrong? Here's my test: I create an indexWriter wi

RE: Creating an index with multiple values for a single field

2011-01-10 Thread Ryan Aylward
Thanks for the response. We do leverage synonyms but they are not appropriate for this case. We use synonyms for words that are truly synonymous for the entire index such as "inc" and "incorporated". Those words are always interchangeable. However, many of the employer alternate names are only

Re: is OpenBitSet / SortedVIntList compressed bit map index?

2011-01-10 Thread Raavan
Thanks Raf. On Sun, Jan 9, 2011 at 1:20 AM, Raf wrote: > On Sat, Jan 8, 2011 at 7:24 PM, Raavan wrote: > > > Also, just for my understanding, is SortedVIntList able to perform some > > operations such as AND/OR without decompression ? > > > > No, not natively: > > http://lucene.apache.org/java/
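To illustrate the "not natively" answer quoted above: a sorted integer list stored as variable-length deltas has to be decoded from the start to recover its values, so an AND over two such lists ends up walking the decoded integers. The following is a minimal sketch of that general encoding idea in plain Java. It is not Lucene's SortedVIntList implementation, and the names `DeltaVIntListSketch`, `encode`, `decode`, and `and` are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class DeltaVIntListSketch {

    // Encode ascending non-negative ints as variable-length deltas:
    // 7 payload bits per byte, high bit set when more bytes follow.
    static byte[] encode(int[] sorted) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int value : sorted) {
            int delta = value - prev;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = value;
        }
        return out.toByteArray();
    }

    // Decoding is sequential: each value depends on all deltas before it.
    static List<Integer> decode(byte[] bytes) {
        List<Integer> result = new ArrayList<>();
        int pos = 0;
        int prev = 0;
        while (pos < bytes.length) {
            int shift = 0;
            int delta = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += delta;
            result.add(prev);
        }
        return result;
    }

    // AND of two encoded lists: decode both, then merge-intersect.
    static List<Integer> and(byte[] a, byte[] b) {
        List<Integer> x = decode(a);
        List<Integer> y = decode(b);
        List<Integer> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < x.size() && j < y.size()) {
            int cmp = Integer.compare(x.get(i), y.get(j));
            if (cmp == 0) {
                out.add(x.get(i));
                i++;
                j++;
            } else if (cmp < 0) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] a = encode(new int[] {1, 5, 300, 1000});
        byte[] b = encode(new int[] {5, 7, 300, 2000});
        System.out.println(and(a, b)); // [5, 300]
    }
}
```

A bit set, by contrast, can be combined word by word with bitwise operators, which is the trade-off the thread is discussing.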