Hi,
Does anyone know how to tokenize a string in Python so that it returns
the tokens together with their byte offsets? I also need a sentence
splitter that returns the sentences with their byte offsets, and
finally n-grams returned with their byte offsets.
Input:
This is a string.
Output:
This 0
is 5
a 8
string. 10
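
One possible approach, as a minimal sketch using only the standard
library's re module: re.finditer yields match objects whose start()
gives the offset of each token. The function names below are made up
for illustration, the sentence rule is deliberately naive (it breaks on
abbreviations and decimals), and the offsets are character offsets.
They equal byte offsets only for pure-ASCII text, so to_byte_offset
converts them under the assumption that the text is encoded as UTF-8.

```python
import re

def tokenize_with_offsets(text):
    """Return (token, start) pairs for whitespace-delimited tokens."""
    return [(m.group(), m.start()) for m in re.finditer(r"\S+", text)]

def sentences_with_offsets(text):
    """Return (sentence, start) pairs.

    Naive rule (an assumption): a sentence runs up to and including
    the next run of '.', '!' or '?', or to the end of the text.
    """
    pattern = r"[^\s.!?][^.!?]*[.!?]*"
    return [(m.group(), m.start()) for m in re.finditer(pattern, text)]

def ngrams_with_offsets(text, n):
    """Return (ngram, start) pairs, each n-gram sliced from the original text."""
    tokens = tokenize_with_offsets(text)
    result = []
    for i in range(len(tokens) - n + 1):
        start = tokens[i][1]
        last_token, last_start = tokens[i + n - 1]
        end = last_start + len(last_token)
        result.append((text[start:end], start))
    return result

def to_byte_offset(text, char_offset, encoding="utf-8"):
    """Convert a character offset into a byte offset for the given encoding."""
    return len(text[:char_offset].encode(encoding))

for token, offset in tokenize_with_offsets("This is a string."):
    print(token, offset)
# This 0
# is 5
# a 8
# string. 10
```

For the bigrams of the same input, ngrams_with_offsets("This is a string.", 2)
gives "This is" at 0, "is a" at 5 and "a string." at 8.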
On Aug 6, 10:49 am, "Gabriel Genellina"
wrote:
> On Fri, 06 Aug 2010 06:07:32 -0300, Muhammad Adeel
> wrote:
>
> > Does any one know how to tokenize a string in python that returns the
> > byte offsets and tokens? Moreover, the sentence splitter that ret
Hi,
Does anyone know of an implementation of the classical Smith-Waterman
local alignment algorithm and its variants for aligning natural
language text?
thanks
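
For reference, a minimal sketch of the classical algorithm applied to
word tokens rather than characters. The scoring values (match=2,
mismatch=-1, and a linear gap penalty of -1) are assumptions picked for
illustration, not recommended settings for natural language text, and
the function name is made up. Variants such as affine gap penalties are
not covered here.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of two token lists.

    Returns (best_score, aligned_pairs); a gap is represented by None.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 is what makes the alignment local
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached
    i, j = best_pos
    aligned = []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + score:
            aligned.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            aligned.append((a[i - 1], None))
            i -= 1
        else:
            aligned.append((None, b[j - 1]))
            j -= 1
    aligned.reverse()
    return best, aligned

score, pairs = smith_waterman("the quick brown fox jumps".split(),
                              "a quick brown dog".split())
print(score, pairs)
# 4 [('quick', 'quick'), ('brown', 'brown')]
```

Comparing tokens with == is the simplest choice; for natural language
text one might instead score near-matches (same stem, same lowercase
form) higher than plain mismatches, which is left out of this sketch.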
--
http://mail.python.org/mailman/listinfo/python-list