It's already done.
You can read this book for more information:
http://www-nlp.stanford.edu/IR-book/
On Mon, Dec 14, 2009 at 1:37 PM, DHIVYA M wrote:
> Hi all,
>
> I am using Lucene 2.3.1.
> Can anyone suggest how to implement ranking in Lucene? If it's available,
> how is it done?
>
> Thanks in ad
Hi all,
I am using Lucene 2.3.1.
Can anyone suggest how to implement ranking in Lucene? If it's available, how
is it done?
Thanks in advance,
Dhivya
Hi, guys,
1. How do I deal with c++c++ or c++abc using MappingCharFilter?
If I use a NormalizeMap("c++","cplusplus"), the analyzed result is
cpluspluscplusplus or cplusplusabc, which is not what I want.
If I use a NormalizeMap("c++","cplusplus$"), the offsets are not correct.
2. I use PatternRe
Does CJK support phrase slop? (I'm assuming no)
The offset is incorrect for PatternReplaceCharFilter, so the highlighting
result is wrong.
How can I fix it?
On Mon, Dec 14, 2009 at 11:43 AM, Weiwei Wang wrote:
> I downloaded all the Solr source, and I found that PatternReplaceCharFilter
> is very useful for my project.
>
> Thanks
>
>
> On Mon, Dec 14, 2009 at
I downloaded all the Solr source, and I found that PatternReplaceCharFilter
is very useful for my project.
Thanks
On Mon, Dec 14, 2009 at 11:14 AM, Weiwei Wang wrote:
> I need the source file, not the patch file. Where can I download it?
>
>
> On Mon, Dec 14, 2009 at 1:15 AM, Koji Sekiguchi wrote:
>
>> Koji
I need the source file, not the patch file. Where can I download it?
On Mon, Dec 14, 2009 at 1:15 AM, Koji Sekiguchi wrote:
> Koji Sekiguchi wrote:
>
>> Paul Taylor wrote:
>>
>>> I want my search to treat 'No. 1' and 'No.1' the same, because in our
>>> context it's one token. I want 'No. 1' to beco
Koji Sekiguchi wrote:
Paul Taylor wrote:
I want my search to treat 'No. 1' and 'No.1' the same, because in our
context it's one token. I want 'No. 1' to become 'No.1'. I need to do
this before tokenizing because the tokenizer would split one value
into two terms and the other into just one term. I al
Paul Taylor wrote:
I want my search to treat 'No. 1' and 'No.1' the same, because in our
context it's one token. I want 'No. 1' to become 'No.1'. I need to do
this before tokenizing because the tokenizer would split one value
into two terms and the other into just one term. I already use a
NormalizeM
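The normalization described above (rewrite 'No. 1' to 'No.1' before the tokenizer sees the text) can be sketched in plain Java with a regular expression. This is only an illustration of the idea, not Lucene or Solr code: the class name AbbrevNormalizer and the pattern are assumptions made for this example, and because the rewrite changes the length of the text, a real char filter would also have to correct token offsets.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: removes the whitespace between
// "No." and a following digit so both spellings yield the same text.
class AbbrevNormalizer {
    // "No." at a word boundary, then whitespace, then (lookahead) a digit.
    private static final Pattern NO_SPACE_DIGIT =
            Pattern.compile("\\bNo\\.\\s+(?=\\d)");

    static String normalize(String text) {
        // Only the whitespace is dropped; the digit itself is left in place.
        return NO_SPACE_DIGIT.matcher(text).replaceAll("No.");
    }

    public static void main(String[] args) {
        // Prints: Symphony No.1 and No.2
        System.out.println(normalize("Symphony No. 1 and No.2"));
    }
}
```

The same normalization must be applied at index time and at query time, otherwise the two forms will still not match.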
Another problem
How do I deal with c++c++ or c++abc using MappingCharFilter?
If I use a NormalizeMap("c++","cplusplus"), the analyzed result is
cpluspluscplusplus or cplusplusabc, which is not what I want.
If I use a NormalizeMap("c++","cplusplus$"), the offsets are not correct.
On Sun, Dec 13
Thank you very much, Uwe, I found the problem.
2009/12/13 Uwe Schindler
> MappingCharFilter definitely preserves the offsets from the original
> reader.
> You can verify that for your case with Lucene’s test case
> TestMappingCharFilter in the source distribution @
> /src/test/org/apache/lucene/an
MappingCharFilter definitely preserves the offsets from the original reader.
You can verify that for your case with Lucene’s test case
TestMappingCharFilter in the source distribution @
/src/test/org/apache/lucene/analysis/TestMappingCharFilter.java:
public void test2to4() throws Exception {
CharS
LowercaseCharFilter is necessary because MappingCharFilter requires a
NormalizeCharMap. We lowercase the stream so that we only need to provide
lowercase mappings in the NormalizeCharMap, e.g. we provide the single mapping
(c++-->cplusplus) instead of both (c++-->cplusplus) and (C++-->cplusplus).
C++ is only an exampl
I think your problem is the LowercaseCharFilter, which does not pass
correctOffset() to the underlying CharFilter. Does it work better without
your LowercaseCharFilter (which is redundant because there is already a
LowerCaseFilter in the Tokenizer chain)?
As you are only looking for "c++", just also a
Got it, thanks, Mike.
On Sun, Dec 13, 2009 at 7:28 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> IndexWriter is transactional: if you do the deletes & adds during a
> single IndexWriter session (ie, no commit in between), then simply
> call IndexWriter.rollback to get back to your s
IndexWriter is transactional: if you do the deletes & adds during a
single IndexWriter session (ie, no commit in between), then simply
call IndexWriter.rollback to get back to your starting index, if
anything goes wrong. If nothing goes wrong, call IndexWriter.commit.
Outside readers, even newly
thanks, Uwe.
Maybe I was not very clear. My situation is like this:
Analyzer:
NormalizeCharMap RECOVERY_MAP = new NormalizeCharMap();
RECOVERY_MAP.add("c++","cplusplus$");
CharFilter filter = new LowercaseCharFilter(reader);
filter = new RosaMappingCharFilter(RECOVERY_MAP,filter);
MappingCharFilter preserves the offsets in the stream *before* filtering. So
if you store the original string (without c++ replaced) in a stored field,
you can highlight using the given offsets. The highlighter must use the
same analyzer again, or you can use FastVectorHighlighter.
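To make the offset point concrete, here is a minimal plain-Java sketch of the bookkeeping involved: the text is rewritten, and each position after a replacement records how far it has drifted from the original, so a token offset in the filtered text can be mapped back. This is not Lucene's implementation; the class OffsetMappingSketch and its methods are hypothetical names for this illustration, it handles a single non-empty mapping only, and it does not try to be precise for offsets that fall inside the replaced text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Lucene code: replaces one string with
// another and remembers how to map filtered offsets back to the original.
class OffsetMappingSketch {
    private final StringBuilder output = new StringBuilder();
    // From filtered offset outOffsets[i] onward, add diffs[i] to get the
    // offset in the original text.
    private final List<Integer> outOffsets = new ArrayList<>();
    private final List<Integer> diffs = new ArrayList<>();

    // Assumes 'from' is non-empty.
    OffsetMappingSketch(String input, String from, String to) {
        int in = 0;
        int cumulativeDiff = 0;
        while (in < input.length()) {
            if (input.startsWith(from, in)) {
                output.append(to);
                in += from.length();
                cumulativeDiff += from.length() - to.length();
                outOffsets.add(output.length());
                diffs.add(cumulativeDiff);
            } else {
                output.append(input.charAt(in++));
            }
        }
    }

    String filtered() {
        return output.toString();
    }

    // Maps an offset in the filtered text back to the original text.
    int correctOffset(int outOffset) {
        int diff = 0;
        for (int i = 0; i < outOffsets.size() && outOffsets.get(i) <= outOffset; i++) {
            diff = diffs.get(i);
        }
        return outOffset + diff;
    }

    public static void main(String[] args) {
        OffsetMappingSketch m = new OffsetMappingSketch("c++ abc", "c++", "cplusplus");
        System.out.println(m.filtered()); // cplusplus abc
        // "abc" sits at [10,13) in the filtered text and at [4,7) in the original.
        System.out.println(m.correctOffset(10) + "," + m.correctOffset(13)); // 4,7
    }
}
```

With corrected offsets like these, a highlighter can mark the right span in the stored original string even though the analyzed text is longer.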
-
Uwe Schindler
H.-H.
Problem solved. Now another problem has come up.
I want to use Highlighter in my system, but the token offsets are incorrect
after the MappingCharFilter is applied.
Koji, do you know how to fix the offset problem?
On Sun, Dec 13, 2009 at 11:12 AM, Weiwei Wang wrote:
> I use Luke to check the result and