Re: Searching for uniqness in a list of data

Paul McGuire Wed, 01 Mar 2006 10:31:26 -0800

"rh0dium" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Hi all,
>
> I am having a bit of difficulty in figuring out an efficient way to
> split up my data and identify the unique pieces of it.
>
> list=['1p2m_3.3-1.8v_sal_ms','1p2m_3.3-1.8_sal_log']
>
> Now I want to split each item up on the "_" and compare it with all
> others on the list, if there is a difference I want to create a list of
> the possible choices, and ask the user which choice of the list they
> want.
<snip>


Check out difflib.

>>> data=['1p2m_3.3-1.8v_sal_ms','1p2m_3.3-1.8_sal_log']
>>> data[0].split("_")
['1p2m', '3.3-1.8v', 'sal', 'ms']
>>> data[1].split("_")
['1p2m', '3.3-1.8', 'sal', 'log']
>>> from difflib import SequenceMatcher
>>> s = SequenceMatcher(None, data[0].split("_"), data[1].split("_"))
>>> s.get_matching_blocks()
[(0, 0, 1), (2, 2, 1), (4, 4, 0)]

Each tuple in the matching blocks reads as
(seq1index, seq2index, numberOfMatchingItems); newer Python versions print
them as Match(a, b, size) named tuples with the same three fields.

In your case, the sequences match at element 0 and at element 2, each match
being of length 1.  The final (4, 4, 0) tuple is a sentinel: the difflib docs
say the last entry is always a dummy of the form (len(seq1), len(seq2), 0),
so it marks the end of both sequences rather than a real match.

Perhaps from here, you could locate the gaps between the blocks returned by
SequenceMatcher.get_matching_blocks(), and prompt for the user's choice.
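
Here is a rough sketch of that gap-finding idea, in case it helps. The
function name differing_fields and the "_" separator default are my own
choices for illustration, not anything difflib provides; only
SequenceMatcher and get_matching_blocks() come from the library. It walks
the matching blocks in order and collects whatever lies between one block
and the next, which is where the two items differ.

```python
from difflib import SequenceMatcher


def differing_fields(a, b, sep="_"):
    """Return a list of (fields_from_a, fields_from_b) pairs, one per gap
    between consecutive matching blocks. (Hypothetical helper name.)"""
    left, right = a.split(sep), b.split(sep)
    matcher = SequenceMatcher(None, left, right)
    gaps = []
    i = j = 0  # positions just past the previous matching block
    for start_a, start_b, size in matcher.get_matching_blocks():
        # Anything between the previous block and this one did not match.
        if start_a > i or start_b > j:
            gaps.append((left[i:start_a], right[j:start_b]))
        i, j = start_a + size, start_b + size
    return gaps


data = ['1p2m_3.3-1.8v_sal_ms', '1p2m_3.3-1.8_sal_log']
for from_first, from_second in differing_fields(data[0], data[1]):
    print(from_first, from_second)
```

Each pair gives the candidate values for one differing position, so you
could show both to the user and ask which one they want. The trailing
zero-length block is what lets the loop pick up a difference at the very
end of the two items.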

-- Paul


-- 
http://mail.python.org/mailman/listinfo/python-list
