On Fri, Jan 8, 2010 at 16:27, Jamie wrote:
> Hi Ian / Will
>
> Thanks. Surely the Porter Stemmer should not stem proper nouns, i.e. it
> could check the capitalization of the first letter of a word and whether or
> not the word is the start of a sentence. If so, it could choose not to apply any
> ste
On Fri, Jan 8, 2010 at 15:01, Jamie wrote:
> Hi There
>
> We are trying to search for the exact word "Lowe's" across a large set of
> indexed data. Our results include everything with "low" in it. Thus, we are
> receiving a much larger data set that we expected. The data is indexing
> using the an
On Tue, Oct 27, 2009 at 21:21, Jake Mannix wrote:
> On Tue, Oct 27, 2009 at 6:12 PM, Erick Erickson
> wrote:
>
>> Could you go into your use case a bit more? Because I'm confused.
>> Why don't you want your text tokenized? You say you want to search it,
>> which means you have to analyze it.
>
>
ng like this:
Document doc = new Document();
doc.add(new Field("h1", "hello\0world"));
doc.add(new Field("alltext", "hello\0world\0goodnight\0moon"));
I think that makes sense. Comments?
Will
>
> HTH
> Erick
>
>
> On Tue, Oct 27, 2009 at
Hello list,
I have some semi-structured text that has some markup elements, and
I want to put those elements into a separate field so I can search by
them. For example (using HTML syntax):
8< document
Section title
Body content
>8
I can find that the things inside s are "Sect