I'm fiddling with custom analyzers to analyze email addresses, storing the full
email address as well as its component parts. It's based on Solr's analyzer framework,
so I have a StandardTokenizerFactory followed by an EmailFilterFactory. It produces
Analyzing "<[EMAIL PROTECTED]>"
1: [EMAIL PROTEC
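For context, a filter like the EmailFilterFactory described above could be built
around a TokenFilter that re-emits the component parts of an address at the same
position as the whole token. A rough sketch (the EmailFilter class is hypothetical;
only the stock Lucene 2.x TokenFilter API is assumed):

import java.io.IOException;
import java.util.LinkedList;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

public class EmailFilter extends TokenFilter {
    private final LinkedList<Token> parts = new LinkedList<Token>();

    public EmailFilter(TokenStream in) {
        super(in);
    }

    public Token next() throws IOException {
        if (!parts.isEmpty()) return parts.removeFirst();  // drain pending parts
        Token token = input.next();
        if (token == null) return null;
        String text = token.termText();
        int at = text.indexOf('@');
        if (at > 0) {
            // queue the local part and the domain, stacked at the same position
            Token local = new Token(text.substring(0, at),
                    token.startOffset(), token.endOffset());
            local.setPositionIncrement(0);
            parts.add(local);
            Token domain = new Token(text.substring(at + 1),
                    token.startOffset(), token.endOffset());
            domain.setPositionIncrement(0);
            parts.add(domain);
        }
        return token;  // the full address stays as the first token
    }
}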
Thanks, that makes sense. Just another question about the MemoryIndex. In your
example you said I can do memoryIndex.getReader().terms(); but in fact there
is no public access to the reader from MemoryIndex...
If this is not possible, I will list the doc's terms while I'm indexing.
Mélanie
-O
>>I just want to make sure there is no API either
No, but your code looks like it should do the job. That code can be
improved by something like [pseudo code]:
query.extractTerms(terms);
if(query instanceof PhraseQuery)
{
//find and index rarest term only using an existing index
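The pseudo code cuts off there; a fleshed-out version might look like the
following (reader and query are assumed to be in scope; only stock calls like
IndexReader.docFreq are used):

Set<Term> terms = new HashSet<Term>();
query.extractTerms(terms);
if (query instanceof PhraseQuery) {
    // find the rarest term using document frequencies from an existing index
    Term rarest = null;
    int minFreq = Integer.MAX_VALUE;
    for (Term t : terms) {
        int freq = reader.docFreq(t);  // corpus-wide document frequency
        if (freq < minFreq) {
            minFreq = freq;
            rarest = t;
        }
    }
    // ... index only the rarest term
}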
Mark,
When I extract the terms from my query, can I not add them directly? Do I
have to do something like:
Set<Term> terms = new HashSet<Term>();
query.extractTerms(terms);
Document doc = new Document();
for (Term term : terms) {
    doc.add(new Field(term.field(), term.text(),
            Field.Store.NO, Field.Index.TOKENIZED));
}
Roger Keays wrote:
Hi there,
I'm trying to delete a single document by using its uuid field:
uuid = new Term("uuid", item.getUuid().toString());
writer.deleteDocuments(uuid);
writer.close();
However, it appears that this operation is deleting *every* document,
whether the uuid mat
Hi Roger,
The method usage seems correct to me. Are you saying that a search with
TermQuery(Term("uuid","76")) returns only one of many existing documents,
but deleteDocuments(Term("uuid","76")) deletes all docs, even docs not
returned by the search for this term? Could you send here a small progr
Hi there,
I'm trying to delete a single document by using its uuid field:
uuid = new Term("uuid", item.getUuid().toString());
writer.deleteDocuments(uuid);
writer.close();
However, it appears that this operation is deleting *every* document,
whether the uuid matches or not. The uui
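One thing worth checking in a setup like this: delete-by-term matches exactly
one indexed term, so the uuid must be indexed as a single untokenized token. A
minimal sketch of that setup (field and variable names are illustrative):

// index the uuid as one exact, untokenized term
Document doc = new Document();
doc.add(new Field("uuid", uuid, Field.Store.YES, Field.Index.UN_TOKENIZED));
writer.addDocument(doc);

// later: delete only the document(s) carrying exactly that term
writer.deleteDocuments(new Term("uuid", uuid));
writer.close();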
I haven't had to do anything. All the replies I do just magically get to the
correct list. Not helpful, I know, but I'm lazy ..
Erick
On 3/27/07, Lukas Vlcek <[EMAIL PROTECTED]> wrote:
Eric,
How do you manage Reply-to: field in your gmail? I always have to change
Reply-to field in Setting (
Eric,
How do you manage the Reply-to: field in your gmail? I always have to change
the Reply-to field in Settings (which requires more than three clicks!) and since
this is a manual (and tedious) process it can introduce mistakes
(misaddressed replies). The problem is that I am signed up to more
mail-l
Assuming you don't mean UI design - how about a small auxiliary sponsor
index containing sponsor data - doc per sponsor, sponsor text and sponsor
url as stored fields, sponsor doc statically boosted by sponsor's
$importance$, and highlighting of user query words in the excerpt from
suggested sponso
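A sketch of the suggested sponsor index (field names and the importance scale
are made up for illustration; Field and setBoost are stock Lucene 2.x):

// one doc per sponsor, statically boosted by the sponsor's $importance$
Document sponsorDoc = new Document();
sponsorDoc.add(new Field("sponsorText", sponsorText,
        Field.Store.YES, Field.Index.TOKENIZED));
sponsorDoc.add(new Field("sponsorUrl", sponsorUrl,
        Field.Store.YES, Field.Index.NO));
sponsorDoc.setBoost(importance);  // static boost applied at index time
sponsorWriter.addDocument(sponsorDoc);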
Howdy,
Does anyone have any design considerations for implementing
a contextual text-link advertising system using Lucene?
The emphasis would be strictly on monetizing search results with
light, non-intrusive behavior (query terms match sponsored results).
Thanks,
Peter W.
--
See below...
On 3/27/07, daveburns <[EMAIL PROTECTED]> wrote:
Hi,
afraid I'm a noobie at Lucene, but I read Otis/Erik's book and was hoping
someone can answer a quick question on the AliasAnalyzer (Chap 4). I want to
build a search for names (Companies/surname, firstname etc) but need to
match th
Hi Tim,
From the StandardAnalyzer code, the TokenStream looks like:
/** Constructs a {@link StandardTokenizer} filtered by a {@link StandardFilter},
    a {@link LowerCaseFilter} and a {@link StopFilter}. */
public TokenStream tokenStream(String fi
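The quoted method is cut off; its body is, roughly, the following chain
(STOP_WORDS is StandardAnalyzer's built-in stop word list):

public TokenStream tokenStream(String fieldName, Reader reader) {
    TokenStream result = new StandardTokenizer(reader);
    result = new StandardFilter(result);   // strips possessives, dots in acronyms
    result = new LowerCaseFilter(result);  // lowercases all tokens
    result = new StopFilter(result, StandardAnalyzer.STOP_WORDS);
    return result;
}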
On 3/27/07, sandeep chawla <[EMAIL PROTECTED]> wrote:
Well, in any case,
is there an implementation of the Porter2 stemming algorithm in Java?
I don't want to make a SnowballFilter based on the Snowball English stemmer.
You mean you don't want to use the snowball lucene-contrib package? Why not?
-Y
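For reference, the contrib package mentioned above is normally used like this
(Porter2 is the stemmer Snowball calls "English"):

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.snowball.SnowballAnalyzer;

Analyzer analyzer = new SnowballAnalyzer("English");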
Thanks for the quick reply.
I'm using the synonym engine from LIA for both parsing queries and building
the index. Do you have the code for a synonym engine that would work for all
matches?
I'm using ver 2.1 of lucene core.
Thanks again
Dave
Actually, I don't much like my proposed way of implementing this.
I want to play with the score to implement logic similar to what I mentioned
in my solution.
But how?
I would really appreciate any suggestions. :)
Jelda
> -----Original Message-----
> From: Ramana Jelda [mailto:[EMAIL PROTECTED]
> Sent: T
In a synonym engine,
suppose the synonyms of word x are syn(x).
If y is in syn(x), it doesn't always follow that x is in syn(y)
(you might not get any synonyms of y; it depends on the data in the synonym
engine).
So your synonym engine might be providing aliases of bob as robert,
rob, bobby...
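A toy illustration of that asymmetry, with a hypothetical one-way map:

// "robert" maps to nicknames, but "bob" has no entry of its own
Map<String, String[]> syn = new HashMap<String, String[]>();
syn.put("robert", new String[] { "bob", "bobby", "rob" });

String[] forRobert = syn.get("robert");  // {"bob", "bobby", "rob"}
String[] forBob = syn.get("bob");        // null: the relation is not symmetric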
Gmail has been good to me for this list...
Erick
On 3/27/07, karl wettin <[EMAIL PROTECTED]> wrote:
On 27 Mar 2007, at 08:28, Mohammad Norouzi wrote:
> Karl,
> Maybe I am out of date!
> do you mean with Nabble I can access this mailing list?
Yes.
--
karl
>
> On 3/27/07, karl wettin <[EMAIL P
Hi,
afraid I'm a noobie at Lucene, but I read Otis/Erik's book and was hoping
someone can answer a quick question on the AliasAnalyzer (Chap 4). I want to
build a search for names (Companies/surname, firstname etc) but need to
match things like
Robert = bob, bobby, rob etc (or margaret = peggy etc).
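In the spirit of the book's chapter, an alias filter typically injects the
aliases at the same token position so either name matches. A sketch against
the Lucene 2.x TokenFilter API (the AliasFilter class and its alias map are
illustrative, not the book's exact code):

import java.io.IOException;
import java.util.LinkedList;
import java.util.Map;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

public class AliasFilter extends TokenFilter {
    private final Map<String, String[]> aliases;
    private final LinkedList<Token> pending = new LinkedList<Token>();

    public AliasFilter(TokenStream in, Map<String, String[]> aliases) {
        super(in);
        this.aliases = aliases;
    }

    public Token next() throws IOException {
        if (!pending.isEmpty()) return pending.removeFirst();
        Token token = input.next();
        if (token == null) return null;
        String[] syns = aliases.get(token.termText());
        if (syns != null) {
            for (int i = 0; i < syns.length; i++) {
                Token alias = new Token(syns[i],
                        token.startOffset(), token.endOffset());
                alias.setPositionIncrement(0);  // stack alias on the original
                pending.add(alias);
            }
        }
        return token;
    }
}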
Sorry, I can't comprehend: why should we use two separate indexes? Can't we
merge them into one index file?
On 3/27/07, Steven Rowe <[EMAIL PROTECTED]> wrote:
Mohammad Norouzi wrote:
> Steven,
> what does this mean:
> "Each index added must have the same number of documents, but
> typically each contains
Mohammad Norouzi wrote:
> Steven,
> what does this mean:
> "Each index added must have the same number of documents, but
> typically each contains different fields. Each document contains the
> union of the fields of all documents with the same document number.
> When searching, matches for a query ter
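The quoted description is ParallelReader's javadoc; typical usage looks like
this (the directory variables are assumed):

// two indexes, same docs in the same order, different fields in each
ParallelReader parallel = new ParallelReader();
parallel.add(IndexReader.open(indexDirA));  // e.g. the stable fields
parallel.add(IndexReader.open(indexDirB));  // e.g. fields rebuilt often
IndexSearcher searcher = new IndexSearcher(parallel);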
Well, in any case,
is there an implementation of the Porter2 stemming algorithm in Java?
I don't want to make a SnowballFilter based on the Snowball English stemmer.
On 27/03/07, thomas arni <[EMAIL PROTECTED]> wrote:
Write your own analyzer, which calls the appropriate Filter in the
method "tokenStre
Write your own analyzer, which calls the appropriate Filter in the
method "tokenStream".
In the method "tokenStream" you can define how the input should be
analyzed and parsed.
Your analyzer must extend the abstract class Analyzer. The easiest way
is to create a new class (Analyzer), which
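Concretely, for the stemming question in this thread, such an analyzer could
look like the following sketch (this chain of filters is one common choice,
not the only one):

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class StemmingAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new StandardTokenizer(reader);
        stream = new StandardFilter(stream);
        stream = new LowerCaseFilter(stream);  // stemmers expect lowercase input
        return new PorterStemFilter(stream);   // swap in any stemming filter here
    }
}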
Hi,
Lucene provides a PorterStemFilter which uses PorterStemmer.
Is there any way I can use a PorterStemFilter (by extending it or
something) which uses the Porter2 stemming algorithm, not the original Porter
algorithm?
I know this is possible using the snowball filter, but for some reason I
d
Thanks for all your help.
Here is the best solution I can see, and I am planning to implement it.
Suppose 20 unique customers && 90,000 results found && offset results 0-20
to be returned.
I can think of only the following solution..
//Hope the pseudo code is self-explanatory..
Public
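The pseudo code is cut off above; one way the per-customer collapsing could be
sketched with stock Lucene 2.x (HitCollector and FieldCache; the "customer"
field name and the reader/searcher variables are assumed) is:

// keep only the best-scoring hit per customer
final String[] customers = FieldCache.DEFAULT.getStrings(reader, "customer");
final Map<String, Float> best = new HashMap<String, Float>();
searcher.search(query, new HitCollector() {
    public void collect(int doc, float score) {
        String customer = customers[doc];
        Float prev = best.get(customer);
        if (prev == null || score > prev.floatValue()) {
            best.put(customer, new Float(score));
        }
    }
});
// best now holds one (customer, score) entry per customer; page 0-20 from it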
Steven,
what does this mean:
"Each index added must have the same number of documents, but typically each
contains different fields. Each document contains the union of the fields of
all documents with the same document number. When searching, matches for a
query term are from the first index added th