It's a bit of a hack, but we do this:
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="([A-Za-z])\+\+" replacement="$1plusplus" />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="([A-Za-z])\#" replacement="$1sharp" />
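To see what those two patterns do to the text before the tokenizer ever sees it, here is a minimal sketch in plain Java. It only mimics the replacements with java.util.regex; it is not the Solr char filter itself, and the class and method names (CharFilterSketch, rewrite) are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper: imitates the two PatternReplaceCharFilterFactory
// rules above using java.util.regex only (no Lucene/Solr classes involved).
class CharFilterSketch {
    // Same regexes as in the charFilter config: a letter followed by "++" or "#".
    private static final Pattern PLUS_PLUS = Pattern.compile("([A-Za-z])\\+\\+");
    private static final Pattern SHARP = Pattern.compile("([A-Za-z])#");

    // Apply both rewrites in the order they are declared in the config.
    static String rewrite(String text) {
        String out = PLUS_PLUS.matcher(text).replaceAll("$1plusplus");
        return SHARP.matcher(out).replaceAll("$1sharp");
    }

    public static void main(String[] args) {
        // prints "I can't search Cplusplus or Csharp"
        System.out.println(rewrite("I can't search C++ or C#"));
    }
}
```

Since char filters run before the tokenizer, the rewritten text ("Cplusplus", "Csharp") is what gets split into tokens, so nothing is lost on the punctuation. The same filters presumably need to be in both the index-time and query-time analysis chains, otherwise a query for "C++" will not match the indexed "cplusplus".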
On 1/28/2015 2:00 AM, Shivashankar Maddanimath wrote:
Hi,
I am using the Lucene standard tokenizer and UAX29URLEmailTokenizer. These analysers
exclude some characters like "+" (I can't search for C++). Is there any way to
configure the analysers to keep specific characters while tokenising?
Regards,
Shiv
-----Original Message-----
From: "Luis A Lastras" <lastr...@us.ibm.com>
Sent: 25-01-2015 08:05 AM
To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
Subject: Absolute term position in scoring
Is it possible to incorporate into Lucene's scoring function the position of a
matching term (say, as measured from the top of the document)? The scenario is:
if the documents in a set tend to talk about the most important stuff at the
beginning of the document, then we would like to give preference to documents
that mention a term close to the top.
Thanks,
Luis
Luis A Lastras, Ph.D.
Research Staff Member & Manager, Concept Analytics, IBM Watson
Member of the IBM Academy of Technology
IBM Master Inventor
email: lastr...@us.ibm.com | Tel: 914-945-3613 | Cell: 914-382-1879
address: 1101 Kitchawan Rd, Office 28-132, Yorktown Heights, NY, 10598