Yes, this makes sense to me. I think I'll just keep all words, including
stop words, and if performance ever becomes an issue, I'll look at
bigrams again. But I think there's a good chance that I'll never see
significant impact either way.
Thanks guys!
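
A minimal sketch of the stop-word bigram idea discussed in this thread, for illustration only. It is plain Java and does not use the Nutch or Lucene APIs. The class name StopWordBigrams, the method expand, the "-" separator, and the sample stop-word list are all assumptions. It also simplifies by pairing a stop word only with the word that follows it, while keeping every original word (stop words included) in the output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class StopWordBigrams {
    // Hypothetical helper, not part of Nutch or Lucene.
    // Emits every word unchanged and, whenever a stop word is followed by
    // another word, also emits the pair joined with '-'.
    static List<String> expand(List<String> words, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            out.add(word); // the original token is kept, stop word or not
            boolean hasNext = i + 1 < words.size();
            if (hasNext && stopWords.contains(word)) {
                out.add(word + "-" + words.get(i + 1)); // stop word + next word
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = Set.of("the", "of"); // sample list, assumed
        List<String> words = List.of("the", "wizard", "of", "oz");
        // prints [the, the-wizard, wizard, of, of-oz, oz]
        System.out.println(expand(words, stopWords));
    }
}
```

Because the single words stay in the output, a query for a stop word on its own can still match, and the extra bigram tokens only help phrase-style queries that contain a stop word.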
Grant Ingersoll wrote:
Yep, still good reasons like I said, but becoming less important as
the hardware, etc. gets faster and cheaper, IMO, especially in the
context of more advanced search capabilities.
On Mar 3, 2008, at 10:49 AM, Mathieu Lecarme wrote:
Not sure, you might want to ask on Nutch. From a strict language
standpoint, the notion of a stopword in my mind is a bit dubious. If
the word really has no meaning, then why does the language have it to
begin with? In a search context, it has been treated as of minimal
use in the early da
On Mar 3, 2008, at 5:40 AM, John Byrne wrote:
Hi,
I need to use stop-word bigrams, like the Nutch analyzer, as described
in LIA 4.8 (Nutch Analysis). What I don't understand is, why does it
keep the original stop word intact? I can see great advantage to being
able to search for a combination of stop word + real word, but I don't
see th