Albert-Jan Roskam wrote:

> Hi,
>
> First, before I forget, emails from hotmail/yahoo etc appear to end up
> in the spam folder these days, so apologies in advance if I do not
> appear to follow up to your replies.
>
> Ok, now to my question. I want to create a class with read-only
> attribute access to the columns of a .csv file. E.g. when a file has a
> column named 'a', that column should be returned as list by using
> instance.a. At first I thought I could do this with the builtin
> 'property' class, but I am not sure how. I now tried to use descriptors
> (__get__ and __set__), which are also used by 'property' (See also:
> https://docs.python.org/2/howto/descriptor.html).
>
> In the "if __name__ == '__main__'" section, [a] is supposed to be a
> shorthand for == equivalent to [b]. But it's not. I suspect it has to
> do with the way attributes are looked up. So once an attribute has been
> found in self.__dict__ aka "the usual place", the search stops, and
> __get__ is never called. But I may be wrong. I find the
> __getattribute__, __getattr__ and __get__ distinction quite confusing.
> What is the best approach to do this? Ideally, the column values should
> only be retrieved when they are actually requested (the .csv could be
> big). Thanks in advance!
>
> import csv
> from cStringIO import StringIO
>
>
> class AttrAccess(object):
>
>     def __init__(self, fileObj):
>         self.__reader = csv.reader(fileObj, delimiter=";")
>         self.__header = self.__reader.next()
>         #[setattr(self, name, self.__get_column(name)) for name in
>         #[self.header]
>         self.a = range(10)
>
>     @property
>     def header(self):
>         return self.__header
>
>     def __get_column(self, name):
>         # generator expression might be better here.
>         return [record[self.header.index(name)] for record in
>                 self.__reader]
>
>     def __get__(self, obj, objtype=type):
>         print "__get__ called"
>         return self.__get_column(obj)
>         #return getattr(self, obj)
>
>     def __set__(self, obj, val):
>         raise AttributeError("Can't set attribute")
>
> if __name__ == "__main__":
>     f = StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
>     instance = AttrAccess(f)
>     # [a] does not call __get__. Looks, and finds, in self.__dict__?
>     print instance.a
>     # [b] this is supposed to be equivalent to [a]
>     print instance.__get__("a")
>     instance.a = 42  # should throw AttributeError!
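A quick experiment shows where the lookup actually differs. The following is only a sketch with made-up names (Loud and Holder are not part of your code), with print() called as a function so it also runs on Python 3: a descriptor's __get__() fires when the descriptor object lives in the class, while the same kind of object stored on the instance is handed back as is.

```python
class Loud(object):
    """Hypothetical data descriptor, used only for this experiment."""

    def __get__(self, obj, objtype=None):
        if obj is None:          # accessed on the class itself
            return self
        return "from __get__"

    def __set__(self, obj, value):
        raise AttributeError("can't set attribute")


class Holder(object):
    x = Loud()                   # stored on the class: __get__ is used

    def __init__(self):
        self.y = Loud()          # stored on the instance: a plain object


h = Holder()
print(h.x)                       # from __get__
print(isinstance(h.y, Loud))     # True, __get__ was not called for y

# A data descriptor on the class even beats an instance __dict__ entry.
h.__dict__["x"] = "from instance __dict__"
print(h.x)                       # from __get__

try:
    h.x = 42
except AttributeError as err:
    print("refused: %s" % err)   # refused: can't set attribute
```

So an entry in the instance __dict__ is not what keeps __get__() from being called; a data descriptor found on the class wins even over an instance entry of the same name.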
I think the basic misunderstandings concern two points:

(1) the __get__() method has to be implemented by the descriptor class
(2) the descriptor instances should be attributes of the class that is
    supposed to invoke __get__(). E. g.:

    class C(object):
        x = descriptor()

    c = C()
    c.x  # invokes C.__dict__["x"].__get__(c, C) under the hood.

As a consequence you need one class per set of attributes; instantiating
the same AttrAccess for csv files with differing layouts won't work.

Here's how to do it all by yourself:

    import csv
    from cStringIO import StringIO

    class ReadColumn(object):
        def __init__(self, index):
            self._index = index
        def __get__(self, obj, type=None):
            # obj is the Row instance the attribute was looked up on
            return obj._row[self._index]
        def __set__(self, obj, value):
            raise AttributeError("oops")

    def first_row(instream):
        reader = csv.reader(instream, delimiter=";")
        # a fresh Row class per file, so each layout gets its own attributes
        class Row(object):
            def __init__(self, row):
                self._row = row
        for i, header in enumerate(next(reader)):
            setattr(Row, header, ReadColumn(i))
        return Row(next(reader))

    f = StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
    row = first_row(f)
    print row.a  # prints 1
    row.a = 42   # raises AttributeError: oops

Instead of a custom descriptor you can of course use the built-in
property:

    for i, header in enumerate(next(reader)):
        # i=i binds the current index; a bare closure would see the last i
        setattr(Row, header, property(lambda self, i=i: self._row[i]))

In many cases you don't care about the specifics of the row class and
use collections.namedtuple:

    import collections
    import csv
    import itertools
    from cStringIO import StringIO

    def rows(instream):
        reader = csv.reader(instream, delimiter=";")
        Row = collections.namedtuple("Row", next(reader))
        return itertools.imap(Row._make, reader)

    f = StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
    row = next(rows(f))
    print row.a  # prints 1
    row.a = 42   # raises AttributeError, namedtuple fields are read-only

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
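PS: Coming back to the original wish (instance.a returning the whole column as a list, read only when it is requested), the same descriptor idea can be turned by 90 degrees. This is a rough sketch rather than a finished solution: Column, open_columns and the _cache/_read_column helpers are names I made up, it is written for Python 3 (io.StringIO, print()), and it assumes that the stream is seekable and that the header names are valid identifiers which do not clash with the helper names.

```python
import csv
from io import StringIO


class Column(object):
    """Hypothetical read-only descriptor: loads one column on first access."""

    def __init__(self, index):
        self._index = index

    def __get__(self, obj, objtype=None):
        if obj is None:                      # accessed on the class itself
            return self
        if self._index not in obj._cache:    # read the file only when asked
            obj._cache[self._index] = obj._read_column(self._index)
        return obj._cache[self._index]

    def __set__(self, obj, value):
        raise AttributeError("can't set attribute")


def open_columns(instream, delimiter=";"):
    """Return an object with one read-only attribute per csv column."""
    start = instream.tell()                  # remember where the data begins
    header = next(csv.reader(instream, delimiter=delimiter))

    class Columns(object):                   # one class per file layout
        def __init__(self):
            self._cache = {}

        def _read_column(self, index):
            instream.seek(start)             # assumption: stream is seekable
            reader = csv.reader(instream, delimiter=delimiter)
            next(reader)                     # skip the header row
            return [record[index] for record in reader]

    for i, name in enumerate(header):
        setattr(Columns, name, Column(i))
    return Columns()


f = StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
columns = open_columns(f)
print(columns.a)                             # ['1', '4', '7']
print(columns.c)                             # ['3', '6', '9']
try:
    columns.a = 42
except AttributeError as err:
    print("refused: %s" % err)               # refused: can't set attribute
```

Each column costs one pass over the file the first time it is requested and is served from the cache afterwards, so a column that is never requested costs nothing.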