There is a small mismatch between your problem formulation and Lucene.
Lucene doesn't count words; it counts terms, and those terms are produced
by the Analyzer that you define at indexing time (for the IndexWriter).
The analyzer tokenizes the input (which does not necessarily mean splitting
on the white space between words) into a sequence of strings, which are
then stored internally as arrays of bytes.

This means that Lucene is not designed to perform a word count a priori,
unless you define an analyzer that aggregates your stats (this would happen
during the analysis "step", i.e. while the input string is being scanned)
and a phrase such as "give up" is defined in a terms list.

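To illustrate the difference between words and terms, here is a minimal
sketch in plain Java. It does not use the Lucene API at all; it only mimics
what an analysis chain does (tokenize, lowercase, normalize) and then counts
the resulting terms. The class TermCountSketch, its methods, the toy rule
that strips a final "d" from "...ed" tokens, and the second sample sentence
are all assumptions made up for this example. A real Lucene analyzer would
use a proper tokenizer and stemmer instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TermCountSketch {

    // Toy stand-in for a stemming filter (an assumption, not Lucene's rule):
    // lowercases the token and turns "...ed" into "...e" ("loved" -> "love").
    static String normalize(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        if (t.length() > 3 && t.endsWith("ed")) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }

    // Toy stand-in for an analyzer: splits on anything that is not a letter
    // or digit, drops empty pieces, and normalizes each token into a term.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!piece.isEmpty()) {
                terms.add(normalize(piece));
            }
        }
        return terms;
    }

    // Counts how often the analyzed query (one term, or several consecutive
    // terms for a phrase such as "give up") occurs in the analyzed text.
    static int count(String text, String query) {
        List<String> terms = analyze(text);
        List<String> wanted = analyze(query);
        if (wanted.isEmpty()) {
            return 0;
        }
        int hits = 0;
        for (int i = 0; i + wanted.size() <= terms.size(); i++) {
            if (terms.subList(i, i + wanted.size()).equals(wanted)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sentence from the original question: "loved" and "love" both
        // normalize to the term "love".
        System.out.println(
            count("I loved cats, but now I really love dogs", "love")); // 2
        // Made-up sentence to show phrase counting.
        System.out.println(
            count("I will not give up, never give up", "give up")); // 2
    }
}
```

The point of the sketch is that the result depends entirely on how the text
is turned into terms: with a plain whitespace split and no normalization,
"loved" would not be counted as "love" and the first result would be 1.
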
You could use the wc command if your problem is just a simple word count.
Its manual might help you:
https://www.gnu.org/software/coreutils/manual/html_node/wc-invocation.html


On Fri, Mar 28, 2014 at 11:35 AM, Hollow Quincy <hollow.qui...@gmail.com> wrote:

> Hello,
>
> I would like to use Apache *Lucene 4*.x and count words in the string, for
> example:
> "I loved cats, but now I really love dogs" - count "love" word in the
> String - result should be 2.
> I would like to count how many times there was: "give up" in the String as
> well.
>
> I spend a lot of time to resolve that problem, but I really cannot find any
> solution.
> Do you know how to resolve this problem ?
>
> Thanks for help
>
