Comparing modified elements in Sets

2007-07-09 Thread ChrisEdgemon
I've got a set with contents like: [filename1.mp3, filename2.mp3,
filename3.mp3, ...]
I've got another set with contents like [filename1.ogg,
filename3.ogg, ...]
And another set with contents like [filename1, filename2, ...]

I'd like to be able to compare set 1 with set 2 and have it match
filename1 and filename3, or compare set 1 with 3 and get back
filename1, filename2.  etc.

Is there a way for me to do this inside the compare function, rather
than having to make duplicate copies of each set?
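
One way to put the comparison inside the elements themselves is to store objects that hash and compare by their extension-less stem, so the ordinary set operators do the matching. This is only a minimal sketch: it assumes Python 3 and that the names differ solely by extension, and `TrackName` is a hypothetical helper class, not something from the standard library.

```python
import os.path


class TrackName:
    """A file name that hashes and compares by its stem (extension removed)."""

    def __init__(self, name):
        self.name = name
        # splitext("filename1.mp3") -> ("filename1", ".mp3");
        # a name without an extension is returned unchanged as the stem.
        self.stem = os.path.splitext(name)[0]

    def __eq__(self, other):
        return isinstance(other, TrackName) and self.stem == other.stem

    def __hash__(self):
        # Must agree with __eq__: equal stems give equal hashes.
        return hash(self.stem)

    def __repr__(self):
        return "TrackName(%r)" % self.name


mp3s = {TrackName(n) for n in ["filename1.mp3", "filename2.mp3", "filename3.mp3"]}
oggs = {TrackName(n) for n in ["filename1.ogg", "filename3.ogg"]}
bare = {TrackName(n) for n in ["filename1", "filename2"]}

# Plain set intersection now matches on the stem alone.
matches_12 = sorted(t.stem for t in mp3s & oggs)
matches_13 = sorted(t.stem for t in mp3s & bare)
print(matches_12)
print(matches_13)
```

Each element keeps its original name in `.name`, so nothing is lost by comparing on the stem. Which of two equal elements an intersection returns is an implementation detail, so read `.stem` from the result rather than `.name`.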

-- 
http://mail.python.org/mailman/listinfo/python-list


Where does str class represent its data?

2007-07-11 Thread ChrisEdgemon
I'd like to implement a subclass of string that works like this:

>>> m = MyString('mail')
>>> m == 'fail'
True
>>> m == 'mail'
False
>>> m in ['fail', 'hail']
True

My best attempt for something like this is:

class MyString(str):
  def __init__(self, seq):
    if self == self.clean(seq): pass
    else: self = MyString(self.clean(seq))

  def clean(self, seq):
    seq = seq.replace("m", "f")

but this doesn't work.  Nothing gets changed.

I understand that I could just remove the clean function from the
class and call it every time, but I use this class in several
locations, and I think it would be much safer to have it do the
cleaning itself.



Re: Where does str class represent its data?

2007-07-12 Thread ChrisEdgemon
On Jul 11, 9:49 pm, James Stroud <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > I'd like to implement a subclass of string that works like this:
>
> > >>> m = MyString('mail')
> > >>> m == 'fail'
> > True
> > >>> m == 'mail'
> > False
> > >>> m in ['fail', 'hail']
> > True
>
> > My best attempt for something like this is:
>
> > class MyString(str):
> >   def __init__(self, seq):
> >     if self == self.clean(seq): pass
> >     else: self = MyString(self.clean(seq))
>
> >   def clean(self, seq):
> >     seq = seq.replace("m", "f")
>
> > but this doesn't work.  Nothing gets changed.
>
> > I understand that I could just remove the clean function from the
> > class and call it every time, but I use this class in several
> > locations, and I think it would be much safer to have it do the
> > cleaning itself.
>
> The "flat is better than nested" philosophy suggests that clean should
> be module level and you should initialize a MyString like such:
>
>    m = MyString(clean(s))
>
> Where clean is
>
>    def clean(astr):
>      return astr.replace('m', 'f')
>
> Although it appears compulsory to call clean each time you instantiate
> MyString, note that you do it anyway when you check in your __init__.
> Here, you are explicit. Such an approach also eliminates the obligation
> to clean the string under conditions where you know it will already be
> clean--such as deserialization.

Initially, I tried simply calling a clean function on a regular
string, without any of this messy subclassing.  However, I would end
up accidentally cleaning it more than once, and transforming the
string was just very messy.  I thought that it would be much easier to
just clean the string once, and then add methods that would give me
the various transformations that I wanted from the cleaned string.
Using __new__ seems to be the solution I was looking for.
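
A minimal sketch of that idea, assuming Python 3: the value is cleaned once in `__new__`, an already-cleaned instance is passed through untouched, and further methods derive transformations from the cleaned value. `CleanName` and `shouted` are hypothetical names used only for illustration, and the replace rule is the placeholder from the example above.

```python
class CleanName(str):
    """A str subclass whose value is cleaned exactly once, at construction."""

    def __new__(cls, raw):
        # An existing CleanName is already clean, so hand it back as-is
        # instead of cleaning a second time.
        if isinstance(raw, cls):
            return raw
        # str is immutable, so the value has to be fixed here in __new__;
        # by the time __init__ runs it is too late to change it.
        return super().__new__(cls, cls.clean(raw))

    @staticmethod
    def clean(text):
        # Placeholder cleaning rule taken from the thread's example.
        return text.replace("m", "f")

    def shouted(self):
        """One example transformation derived from the cleaned value."""
        return self.upper()


m = CleanName("mail")
print(m == "fail")
print(m == "mail")
print(m in ["fail", "hail"])
print(CleanName(m) is m)
```

Because the instance is a real `str`, comparisons and `in` tests need no extra code. Inherited methods such as `upper()` return a plain `str`, not a `CleanName`.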

>
> Also, you don't return anything from clean above, so you assign None to
> self here:
>
> self = MyString(self.clean(seq))
>
> Additionally, it has been suggested that you use __new__. E.g.:
>
> py> class MyString(str):
> ...   def __new__(cls, astr):
> ...     astr = astr.replace('m', 'f')
> ...     return super(MyString, cls).__new__(cls, astr)
> ...
> py> MyString('mail')
> 'fail'
>
> But this is an abuse of the str class if you intend to populate your
> subclasses with self-modifying methods such as your clean method. In
> this case, you might consider composition, wherein you access an
> instance of str as an attribute of class instances. The python standard
> library makes this easy with the UserString class and the ability to add
> custom methods to its subclasses:

What constitutes an abuse of the str class?  Is there some performance
decrement that results from subclassing str like this?  (Unfortunately
my implementation seems to have a pretty large memory footprint, 400 MB
for about 400,000 files.) Or do you just mean from a philosophical
standpoint?  I guess I don't understand what benefits come from using
UserString instead of just str.
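
On the memory question, one thing that can be checked directly is that instances of an ordinary str subclass carry a per-instance `__dict__`, while a subclass declaring `__slots__ = ()` does not. The sketch below assumes Python 3, and `PlainSub` and `SlimSub` are hypothetical names. Whether this explains the footprint mentioned above is not something the thread establishes.

```python
class PlainSub(str):
    """Ordinary subclass: each instance can hold its own attribute dict."""


class SlimSub(str):
    """Empty __slots__ means instances have no per-instance __dict__."""
    __slots__ = ()


p = PlainSub("fail")
s = SlimSub("fail")

print(hasattr(p, "__dict__"))
print(hasattr(s, "__dict__"))

# Both still behave as strings.
print(p == s == "fail")
```

The trade-off is that a `SlimSub` instance cannot have arbitrary attributes assigned to it, so this only fits when the extra behaviour lives in methods and not in per-instance state.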

Thanks for the help,
Chris

>
> py> from UserString import UserString as UserString
> py> class MyClass(UserString):
> ...   def __init__(self, astr):
> ...     self.data = self.clean(astr)
> ...   def clean(self, astr):
> ...     return astr.replace('m', 'f')
> ...
> py> MyClass('mail')
> 'fail'
> py> type(_)
> 
>
> This class is much slower than str, but you can always access an
> instance's data attribute directly if you want fast read-only behavior.
>
> py> astr = MyClass('mail').data
> py> astr
> 'fail'
>
> But now you are back to a built-in type, which is actually the
> point--not everything needs to be in a class. This isn't java.
>
> James
>
> --
> James Stroud
> UCLA-DOE Institute for Genomics and Proteomics
> Box 951570
> Los Angeles, CA 90095
>
> http://www.jamesstroud.com/
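
The composition example quoted above uses the Python 2 module path. A minimal sketch of the same idea for Python 3, where the class lives in `collections`, might look like the following. `MyClass` and `clean` mirror the quoted code, and everything else is an assumption made for illustration.

```python
from collections import UserString


class MyClass(UserString):
    """Wraps a plain str in .data instead of inheriting from str."""

    def __init__(self, astr):
        # UserString stores the wrapped string in self.data.
        super().__init__(self.clean(astr))

    @staticmethod
    def clean(astr):
        return astr.replace("m", "f")


c = MyClass("mail")
print(c == "fail")          # comparison is delegated to the wrapped str
print(isinstance(c, str))   # composition: the wrapper is not itself a str
print(type(c.data) is str)  # the built-in string is reachable via .data
```

The practical difference from subclassing str is the second check: code that tests `isinstance(x, str)` will not accept the wrapper, so `.data` has to be passed along in those places.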

