> Should we check for stop words before stemming or after?
The current implementation supports both variants. Look at the dictionary interface
definition in morph.c:
typedef struct
{
    char localename[NAMEDATALEN];
    /* init dictionary */
    void *(*init) (void);
On Fri, 6 Sep 2002, Christopher Kings-Lynne wrote:
> > Should we check for stop words before stemming or after?
>
> I think you should.
>
> > In the first case we have to collect all forms of stop-words,
> > which is doable
> > but difficult to maintain; in the latter we'll have the current problem.
>
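To make the two orderings concrete, here is a rough sketch in C. It is not the tsearch code: toy_stem, is_stop_word, keep_check_before, keep_check_after and the three-word stop list are all invented for this illustration, and toy_stem only imitates the plural and "-ing" handling of a Porter-style stemmer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stop list; "her" is the entry that matters here. */
static const char *const stop_words[] = { "her", "is", "the" };

static bool is_stop_word(const char *w)
{
    for (size_t i = 0; i < sizeof stop_words / sizeof stop_words[0]; i++)
        if (strcmp(w, stop_words[i]) == 0)
            return true;
    return false;
}

/* True if any of the first n characters of s is a vowel. */
static bool has_vowel(const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (s[i] != '\0' && strchr("aeiou", s[i]) != NULL)
            return true;
    return false;
}

/* Toy stand-in for a stemmer (NOT the real Porter algorithm).
 * outsz must be at least 1. */
static void toy_stem(const char *word, char *out, size_t outsz)
{
    size_t n = strlen(word);

    if (n >= outsz)
        n = outsz - 1;              /* truncate overly long input */
    memcpy(out, word, n);
    out[n] = '\0';

    /* plural: drop one trailing "s" (but leave "ss" alone) */
    if (n > 2 && out[n - 1] == 's' && out[n - 2] != 's')
        out[--n] = '\0';

    /* "-ing": drop it when what remains still contains a vowel */
    if (n > 3 && strcmp(out + n - 3, "ing") == 0 && has_vowel(out, n - 3)) {
        n -= 3;
        out[n] = '\0';
        /* undouble a final double consonant, except l, s and z */
        if (n >= 2 && out[n - 1] == out[n - 2] &&
            strchr("aeioulsz", out[n - 1]) == NULL)
            out[--n] = '\0';
    }
}

/* Variant 1: test the surface form, then stem.
 * Returns false if the word is dropped as a stop word. */
bool keep_check_before(const char *word, char *lexeme, size_t sz)
{
    if (is_stop_word(word))
        return false;
    toy_stem(word, lexeme, sz);
    return true;
}

/* Variant 2: stem first, then test the stem against the stop list. */
bool keep_check_after(const char *word, char *lexeme, size_t sz)
{
    toy_stem(word, lexeme, sz);
    return !is_stop_word(lexeme);
}
```

With this stop list, keep_check_before("herring", ...) keeps the word (the stored lexeme is "her"), while keep_check_after("herring", ...) drops it, because the stem "her" is itself on the stop list. Checking before stemming avoids that, at the price of listing every inflected form of each stop word.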
On Fri, 6 Sep 2002, Christopher Kings-Lynne wrote:
> > Looking at the list of stopwords you sent me, Oleg, there are only about 1
> > out of the list of 120 stopwords that need to have all word forms
> > added. I
> > also don't think it'll be a maintenance problem. The reason I
> > think this i
probably we could enhance our parser to
handle such words too.
Anyway, most problems are just a question of time, which we don't have :-(
>
> Chris
>
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED]]On Behalf Of Christopher
> > Ki
. wasn't, isn't, it's, etc.?
Chris
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Christopher
> Kings-Lynne
> Sent: Friday, 6 September 2002 12:20 PM
> To: Christopher Kings-Lynne; Oleg Bartunov
> Cc: Hackers; [EMAI
> Looking at the list of stopwords you sent me, Oleg, there are only about 1
> out of the list of 120 stopwords that need to have all word forms
> added. I
> also don't think it'll be a maintenance problem. The reason I
> think this is
> because stopwords in general don't have different word f
> Should we check for stop words before stemming or after?
I think you should.
> In the first case we have to collect all forms of stop-words,
> which is doable
> but difficult to maintain; in the latter we'll have the current problem.
Looking at the list of stopwords you sent me, Oleg, there are onl
On Thu, 5 Sep 2002, Martin Porter wrote:
>
> Oleg,
>
> The Porter stemmer stems herring and herrings to her, which is a bit
> unfortunate. A quick fix is to put 'herring/herrings' in the exception list
> in the english (porter2) stemmer, but I'll look at this case over the next
> few days and se
Oleg,
The Porter stemmer stems herring and herrings to her, which is a bit
unfortunate. A quick fix is to put 'herring/herrings' in the exception list
in the english (porter2) stemmer, but I'll look at this case over the next
few days and see if I can come up with something a bit better.
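For illustration only, the exception-list idea can be sketched in C as a lookup that runs before the stemmer. This is not the Snowball implementation: stem_with_exceptions, standin_stem and the exceptions table are hypothetical names, standin_stem merely reproduces the herring -> her result described above, and mapping both forms to "herring" is an assumption, since the message names the words but not the stem they should get.

```c
#include <stddef.h>
#include <string.h>

/* Signature assumed for a pluggable stemmer in this sketch. */
typedef void (*stem_fn)(const char *word, char *out, size_t outsz);

/* Copy src into out, truncating if needed; always NUL-terminates. */
static void copy_word(char *out, size_t outsz, const char *src)
{
    size_t n = strlen(src);

    if (outsz == 0)
        return;
    if (n >= outsz)
        n = outsz - 1;
    memcpy(out, src, n);
    out[n] = '\0';
}

/* Stand-in for a real stemmer: it only reproduces the reported
 * herring/herrings -> her result and copies every other word as is. */
void standin_stem(const char *word, char *out, size_t outsz)
{
    if (strcmp(word, "herring") == 0 || strcmp(word, "herrings") == 0)
        copy_word(out, outsz, "her");
    else
        copy_word(out, outsz, word);
}

struct stem_exception {
    const char *word;   /* surface form to intercept */
    const char *stem;   /* stem to return instead of running the stemmer */
};

/* Assumed entries: the target stem "herring" is a guess for this example. */
static const struct stem_exception exceptions[] = {
    { "herring",  "herring" },
    { "herrings", "herring" },
};

/* Consult the exception list first; fall back to the stemmer otherwise. */
void stem_with_exceptions(const char *word, char *out, size_t outsz,
                          stem_fn stem)
{
    for (size_t i = 0; i < sizeof exceptions / sizeof exceptions[0]; i++) {
        if (strcmp(word, exceptions[i].word) == 0) {
            copy_word(out, outsz, exceptions[i].stem);
            return;
        }
    }
    stem(word, out, outsz);
}
```

Calling stem_with_exceptions("herrings", buf, sizeof buf, standin_stem) leaves "herring" in buf, so the result no longer collides with the stop word "her". Words that are not in the table go through the stemmer unchanged.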
Inter
On Thu, 5 Sep 2002, Christopher Kings-Lynne wrote:
> Hmmm...thinking about it, maybe 'herring' is being reduced to 'her' after
> the stemming process and hence is thought to be a stopword? This is a bug,
> but how should it be fixed?
>
It's a difficult question how to use stop words. We'll see wh
Hmmm...thinking about it, maybe 'herring' is being reduced to 'her' after
the stemming process and hence is thought to be a stopword? This is a bug,
but how should it be fixed?
Although, tests don't support that:
usa=# select food_id, brand,description,ftiidx from food_foods where ftiidx
## 'hi