Re: Highlighting html pages

2012-10-23 Thread Michael Sokolov
If you use HTMLStripCharFilter, it extracts the text only, leaving tags out, and remembering the word positions so that highlighting works properly. Should do exactly what you want out of the box... On 10/23/2012 8:00 PM, Scott Smith wrote: I need to take an html page that I retrieve from m
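
To make the idea in this reply concrete, here is a minimal standalone sketch of offset-preserving highlighting. It does not use Lucene and is not how HTMLStripCharFilter is implemented: it is a simplified stand-in that blanks out tags with spaces so that match offsets in the tag-free text line up with the original HTML. The class name `HtmlHighlightSketch`, the methods `maskTags` and `highlight`, and the `<em>` wrapper are assumptions made for illustration. Entities, comments, and script blocks are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlHighlightSketch {

    /** Replaces every character inside an HTML tag with a space, so the
     *  result has the same length as the input and offsets stay aligned. */
    public static String maskTags(String html) {
        StringBuilder masked = new StringBuilder(html.length());
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (!inTag && c == '<') {
                inTag = true;
                masked.append(' ');
            } else if (inTag) {
                masked.append(' ');
                if (c == '>') {
                    inTag = false;
                }
            } else {
                masked.append(c);
            }
        }
        return masked.toString();
    }

    /** Wraps whole-word, case-insensitive matches of term in <em> tags,
     *  ignoring any text that appears inside tags or attributes. */
    public static String highlight(String html, String term) {
        String masked = maskTags(html);
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(masked);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Offsets found in the masked text are valid in the original HTML.
            out.append(html, last, m.start());
            out.append("<em>").append(html, m.start(), m.end()).append("</em>");
            last = m.end();
        }
        out.append(html, last, html.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("<p class=\"title\">The title page</p>", "title"));
    }
}
```

The word "title" inside the class attribute is left alone because it sits inside a tag; only the occurrence in the visible text is wrapped.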

Highlighting html pages

2012-10-23 Thread Scott Smith
I need to take an html page that I retrieve from my lucene search and highlight all of the terms that are part of the search. I need to skip over any html tags since I don't want any words in tags which happen to match the search to be highlighted. Note that I don't want sections of the docum

RE: Scoring based on document

2012-10-23 Thread Siraj Haider
Thanks for the suggestion, but in that scenario, I would lose the ability to search on individual fields, i.e. I would not be able to search on title field only, and would end up with results where the searched term might be in the description field. regards -Siraj

Re: Solr/Lucene + Oracle Database seamless integration

2012-10-23 Thread Maximiliano Keen
The main advantages are full-text search capabilities and performance. Oracle Text is a pre-Google technology. It doesn't support features like facets, "did you mean", "more like this", spatial search, etc. Besides, Oracle Text has serious problems when scaling to big volumes of information. You

Re: Solr/Lucene + Oracle Database seamless integration

2012-10-23 Thread Petite Abeille
On Oct 23, 2012, at 10:35 PM, Maximiliano Keen wrote: > Scotas combines and synchronizes the high-performance, full-featured > Solr/Lucene text search engine with the industry-leading Oracle Database's > performance, scalability, security, and reliability. How does this compare/contrast to O

Solr/Lucene + Oracle Database seamless integration

2012-10-23 Thread Maximiliano Keen
*Scotas* brings a remarkable advancement to Enterprise Text Search. Scotas combines and synchronizes the high-performance, full-featured Solr/Lucene text search engine with the industry-leading Oracle Database's performance, scalability, security, and reliability. Main features are: - Full Inte

Re: Scoring based on document

2012-10-23 Thread selvakumar netaji
Hi All, Just wanted to check whether this approach would fail for this case. Have a copy field for each document that holds the concatenated values of all the fields in that document; searching on the copy field would then produce the result. The resulting docs would be based on the

RE: Scoring based on document

2012-10-23 Thread Siraj Haider
So, just to confirm, using Lucene 4.0, we would be able to issue a search on one or more fields, get the results sorted by a custom field, and also get the score of each document based on the frequency of the terms searched in all the indexed fields of that do

Re: Scoring based on document

2012-10-23 Thread Simon Willnauer
hey there, in Lucene 4 you can override the termStatistics / CollectionStatistics used for scoring in the IndexSearcher. You can take multiple fields into account here in order to use them for scoring. Here is the javadoc link: http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/IndexSe
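
As a rough illustration of what "taking multiple fields into account" could mean for the statistics that feed scoring, the standalone sketch below pools a term's document frequency across several fields before computing an inverse document frequency. It does not call Lucene and is not the IndexSearcher API: the class `PooledFieldStats`, its methods, the summing-and-capping rule, and the TF-IDF style formula are all assumptions chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PooledFieldStats {
    private final int numDocs;
    // field -> (term -> number of documents containing the term in that field)
    private final Map<String, Map<String, Integer>> docFreqByField = new HashMap<>();

    public PooledFieldStats(int numDocs) {
        this.numDocs = numDocs;
    }

    /** Records a (hypothetical) per-field document frequency for a term. */
    public void setDocFreq(String field, String term, int docFreq) {
        docFreqByField.computeIfAbsent(field, f -> new HashMap<>()).put(term, docFreq);
    }

    /** Sums the term's document frequency over the given fields, capped at
     *  numDocs because a document may contain the term in several fields. */
    public int pooledDocFreq(String term, List<String> fields) {
        int sum = 0;
        for (String field : fields) {
            Map<String, Integer> perTerm = docFreqByField.getOrDefault(
                    field, Collections.<String, Integer>emptyMap());
            sum += perTerm.getOrDefault(term, 0);
        }
        return Math.min(sum, numDocs);
    }

    /** A TF-IDF style inverse document frequency over the pooled count. */
    public double idf(String term, List<String> fields) {
        return 1.0 + Math.log(numDocs / (double) (pooledDocFreq(term, fields) + 1));
    }
}
```

With pooled statistics, a term's weight is the same whichever field it matched in, which is one way to keep per-field searching while scoring on the whole document.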