> search for 'efghmnop'
> in 'abcdefghijklmnopqrstuvwxyzabcdefghmnop'
>
> Take the last letter of the searched for substring, p.
>
> Pick a possible substring endpoint in the large string.
> This starts out at an offset from the beginning of the
> large string. The offset is the length o
> Like I said, I know I can use the module Similarity. But in
> order to do this I would need both the query and the subject
> string. And to get the subject string I would need to 'slide'
> down the larger string and pull out all combinations 1 by 1.
> This is very slow with a 4.5 million character
Hello again,
I have no background in genetic analysis, but it looks like a lot of
effort is going into the Bio:: modules. There is a module called
Bio::SeqFeature::Similarity that might do just what you want. But
then again, it may not :-)
Hope this helps,
Aziz
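
For readers following the thread, the "slide down the larger string" idea quoted above can be sketched roughly as follows. This is only an illustration in Python, not what any Bio:: module does internally. The function name best_window and the min_percent parameter are made up for this example, and it assumes "80% match" means position-by-position identity with no insertions or deletions. It gives up on a window as soon as too many mismatches have been seen, which avoids pulling out every substring in full, but a plain loop like this would probably still be slow on a 4.5 million character subject.

```python
def best_window(query, subject, min_percent=80):
    """Slide `query` along `subject` and return (offset, matches) for the
    best window whose position-by-position identity is at least
    `min_percent`, or None if no window qualifies.

    Hypothetical helper: assumes mismatches only, no gaps.
    """
    m = len(query)
    if m == 0 or m > len(subject):
        return None

    # Integer ceiling of m * min_percent / 100, avoiding float rounding.
    needed = -(-m * min_percent // 100)
    max_mismatches = m - needed

    best = None
    for start in range(len(subject) - m + 1):
        mismatches = 0
        for i in range(m):
            if subject[start + i] != query[i]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # this window cannot reach the threshold
        else:
            # Window passed; keep it if it beats the best so far.
            matches = m - mismatches
            if best is None or matches > best[1]:
                best = (start, matches)
    return best


if __name__ == "__main__":
    subject = "abcdefghijklmnopqrstuvwxyzabcdefghmnop"
    print(best_window("efghmnop", subject))
```

With the example strings from the quoted message, the only window that reaches 80% is the exact occurrence near the end of the subject string.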
Hello again,
I would suggest you look at the Algorithm::Diff module (available on CPAN).
The function LCS, given 2 sequences (for example, 2 strings split into
characters), gives you the "longest common subsequence" between them. Once
you have the longest common subsequence, you can probably decide whether it
meets the 80% criterion you
set or
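
To make the LCS suggestion concrete, here is a minimal sketch of the textbook dynamic-programming computation of the longest common subsequence length, followed by a threshold check. It is written in Python for illustration and is not the Algorithm::Diff API. The names lcs_length and meets_threshold are hypothetical, and it assumes the 80% criterion is measured against the length of the query.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b,
    using the standard dynamic-programming table, one row at a time."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the subsequence ending just before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop one character from either a or b, whichever is better.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def meets_threshold(query, candidate, min_percent=80):
    """True if the LCS covers at least min_percent of the query.

    Hypothetical criterion: the thread does not say how the 80% is measured.
    """
    if not query:
        return False
    return lcs_length(query, candidate) * 100 >= min_percent * len(query)


if __name__ == "__main__":
    print(lcs_length("efghmnop", "efghijklmnop"))
    print(meets_threshold("efghmnop", "efghmnoz"))
```

Note that a subsequence does not have to be contiguous, so this measure tolerates insertions and deletions as well as mismatches. The table costs time proportional to the product of the two lengths, so it is only practical when the candidate is a short slice of the large string rather than all 4.5 million characters.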
Aziz,
I guess I hadn't thought about it that way, so here is more info.
What I'm basically doing is randomly pulling a string of 500 from one string
and looking for it in another string. So I'm looking for a substring of the
larger string that matches my query string. In terms of how it matches
Hello,
I don't have a direct answer for your question since your question is a
little bit ambiguous; let me explain:
Do you want to search for a substring in a long string, or do you want a
true regexp match?
If you want a true regexp match, then the question is even more
ambiguous. For example, th
Hello,
I'm working on a program where I am searching for a short string within a
longer string. The catch is that the long string is about 4.5 million chars
long and the short string is about 500. Using a regex to do an exact match is
simple, but what if I want just a close match, like 80% or wha