On 6 Feb 2006 09:03:09 -0800, "Ernesto" <[EMAIL PROTECTED]> wrote:
>I'm still fairly new to python, so I need some guidance here...
>
>I have a text file with lots of data. I only need some of the data. I
>want to put the useful data into an [array of] struct-like
>mechanism(s). The text file looks something like this:
>
>[BUNCH OF NOT-USEFUL DATA....]
>
>Name: David
>Age: 108 Birthday: 061095 SocialSecurity: 476892771999
>
>[MORE USELESS DATA....]
>
>Name........

Does the useful data always come in fixed-format pairs of lines as in your example?
If so, you could just iterate through the lines of your text file as in example at end [1]

>
>I would like to have an array of "structs." Each struct has
>
>struct Person{
>    string Name;
>    int Age;
>    int Birhtday;
>    int SS;
>}

You don't normally want to do real structs in python. You probably want to define
a class to contain the data, e.g., class Person in example at end [1]

>
>I want to go through the file, filling up my list of structs.
>
>My problems are:
>
>1. How to search for the keywords "Name:", "Age:", etc. in the file...
>2. How to implement some organized "list of lists" for the data
>structure.
>

It may be very easy, if the format is fixed and space-separated and line-paired
as in your example data, but you will have to tell us more if not.
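If the fields turn out not to be strictly space-separated or in a fixed order, one option for question 1 is a regular expression keyed on the labels. This is only a rough sketch, assuming each label is followed by a colon and a single value with no embedded spaces (so a name like "Mary Ann" would be cut to "Mary"). The names FIELD_RE and find_fields are made up for illustration and are not part of ernesto.py.

```python
import re

# Hypothetical helper, not part of ernesto.py.
# Matches "Label: value" for the four labels in the sample data;
# the value is assumed to contain no whitespace.
FIELD_RE = re.compile(r'(Name|Age|Birthday|SocialSecurity):\s*(\S+)')

def find_fields(line):
    """Return a dict mapping each known label found on this line to its value."""
    # findall() with two groups yields (label, value) tuples
    return dict(FIELD_RE.findall(line))

sample = 'Age: 108   Birthday: 061095   SocialSecurity: 476892771999'
fields = find_fields(sample)
print(sorted(fields.items()))
```

Values come back as strings, so convert with int() where you want numbers. Note that int('061095') drops the leading zero, which may matter for the birthday field.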
[1] example:

----< ernesto.py >---------------------------------------------------------
class Person(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self): return 'Person(%r)'%self.name

def extract_info(lineseq):
    lineiter = iter(lineseq) # normalize access to lines
    personlist = []
    for line in lineiter:
        substrings = line.split()
        if substrings and isinstance(substrings, list) and substrings[0] == 'Name:':
            try:
                name = ' '.join(substrings[1:]) # allow for names with spaces
                line = lineiter.next()
                age_hdr, age, bd_hdr, bd, ss_hdr, ss = line.split()
                assert age_hdr=='Age:' and bd_hdr=='Birthday:' and ss_hdr=='SocialSecurity:', \
                    'Bad second line after "Name: %s" line:\n %r'%(name, line)
                person = Person(name)
                person.age = int(age); person.bd = int(bd); person.ss=int(ss)
                personlist.append(person)
            except Exception,e:
                print '%s: %s'%(e.__class__.__name__, e)
    return personlist

def test():
    lines = """\
[BUNCH OF NOT-USEFUL DATA....]

Name: David
Age: 108 Birthday: 061095 SocialSecurity: 476892771999

[MORE USELESS DATA....]

Name: Ernesto
Age: 25 Birthday: 040181 SocialSecurity: 123456789

Name: Ernesto
Age: 44 Brithdy: 040106 SocialSecurity: 123456789

Name........
"""
    persondata = extract_info(lines.splitlines())
    print persondata
    ssdict = {}
    for person in persondata:
        if person.ss in ssdict:
            print 'Rejecting %r with duplicate ss %s'%(person, person.ss)
        else:
            ssdict[person.ss] = person
    print 'ssdict keys: %s'%ssdict.keys()
    for ss, pers in sorted(ssdict.items(), key=lambda item:item[1].name): #sorted by name
        print 'Name: %s Age: %s SS: %s' % (pers.name, pers.age, pers.ss)

if __name__ == '__main__': test()
---------------------------------------------------------------------------

This produces output:

[10:07] C:\pywk\clp>py24 ernesto.py
AssertionError: Bad second line after "Name: Ernesto" line:
 'Age: 44 Brithdy: 040106 SocialSecurity: 123456789'
[Person('David'), Person('Ernesto')]
ssdict keys: [123456789, 476892771999L]
Name: David Age: 108 SS: 476892771999
Name: Ernesto Age: 25 SS: 123456789

If you want to try this on a file (we'll use the source itself here, since it
includes valid example data lines), do something like:

 >>> import ernesto
 >>> info = ernesto.extract_info(open('ernesto.py'))
 AssertionError: Bad second line after "Name: Ernesto" line:
  'Age: 44 Brithdy: 040106 SocialSecurity: 123456789\n'
 >>> info
 [Person('David'), Person('Ernesto')]

tweak to taste ;-)

Regards,
Bengt Richter