Re: [Tutor] How to select particular lines from a text

Kent Johnson Sat, 04 Dec 2004 07:10:52 -0800

kumar,

Here is a solution for you. The makeSections() function will iterate through blocks in the file and return each one in turn to the caller.

makeSections() is a generator function - the use of yield makes it one. That means that it returns an iterator that can be used in a for loop. Each time yield is executed it returns a new value to the loop. In this case, the values returned are the contents of each section.
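
To see how yield hands values to a for loop in isolation, here is a minimal sketch. It is not part of the solution below, and the name count_up is made up purely for illustration.

```python
def count_up(limit):
    """Generator: yield 1, 2, ..., limit, one value per request."""
    n = 1
    while n <= limit:
        # Execution pauses here and resumes when the caller asks
        # for the next value.
        yield n
        n += 1

# Each pass through the loop receives the next yielded value.
values = []
for n in count_up(3):
    values.append(n)

print(values)
```

Calling count_up(3) does not run the body right away; it returns an iterator, and the body only runs as the loop pulls values from it.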

The loop in makeSections just walks through the lines of the input file. It accumulates the lines into a list and looks for special markers. The markers are a 'Name:' line, which starts a new section, and a blank line, which ends a section. When it finds a marker it outputs the current section, if there is one, and starts a new one.

Kent

PS this question is much better asked than the last - you clearly stated what 
you want in a simple form.


data = '''
Name:
City:
xxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxx
....................
xxxxxxxxxxxxxxxxxxxx


Name:
City:
xxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxx

'''

import cStringIO    # just for test

def makeSections(f):
    ''' This is a generator function. It will return successive sections
        of f until EOF.

        Sections are every line from a 'Name:' line to the first blank line.
        Sections are returned as a list of lines with line endings stripped.
    '''

    currSection = []

    for line in f:
        line = line.strip()
        if line == 'Name:':
            # Start of a new section
            if currSection:
                yield currSection
                currSection = []
            currSection.append(line)

        elif not line:
            # Blank line ends a section
            if currSection:
                yield currSection
                currSection = []

        else:
            # Accumulate into a section
            currSection.append(line)

    # Yield the last section
    if currSection:
        yield currSection


f = cStringIO.StringIO(data)

for section in makeSections(f):
    print 'Section'
    for line in section:
        print '   ', line
    print
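
To pick out only particular blocks, the same idea can be wrapped in a filter. The following is a rough sketch, not a tested solution for the real chip file. It is written for Python 3 (io.StringIO stands in for cStringIO), and it assumes each header looks like 'Name: something', which the sample data above does not show. The names make_sections, select_sections, wanted and the sample values are all made up for illustration.

```python
import io

def make_sections(lines):
    """Yield each section as a list of stripped lines.

    Assumption: a section starts at a line beginning with 'Name:' and
    ends at a blank line or at the next 'Name:' line.
    """
    current = []
    for line in lines:
        line = line.strip()
        if line.startswith('Name:'):
            # A new header closes any section still in progress.
            if current:
                yield current
            current = [line]
        elif not line:
            # A blank line ends the current section.
            if current:
                yield current
            current = []
        else:
            current.append(line)
    # Yield the last section if the input did not end with a blank line.
    if current:
        yield current

def select_sections(lines, wanted):
    """Yield only the sections whose 'Name:' value is in `wanted`."""
    for section in make_sections(lines):
        header = section[0]
        if header.startswith('Name:'):
            name = header[len('Name:'):].strip()
            if name in wanted:
                yield section

# Made-up sample data; the real file's field values will differ.
sample = u"""Name: alpha
City: one
xxxx

Name: beta
City: two
yyyy
Name: gamma
City: three
"""

selected = list(select_sections(io.StringIO(sample), {'beta'}))
for section in selected:
    print(section)
```

Because both functions are generators, the 340,000-line file is never held in memory all at once; only the section currently being built is.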

kumar s wrote:

Dear group, This is a continuation of my previous email with the subject line "Python regular expression". Although my text file looks like an .ini file, it is not. It is a chip definition file from a Gene chip. It is a huge file with over 340,000 lines.
I have a general question that is not
specific to that file:
Example text:
Name:
City:
xxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxx
....................
xxxxxxxxxxxxxxxxxxxx
Name:
City:
xxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxx
Characteristics of this text: 1. This text is divided into blocks and every block starts with 'Name'. The number of lines after this identifier varies.
The logic I can think of to extract some of these
blocks is:
1. Write a reg.exp to identify the Name identifier one
needs.
2. Based on this, ask the program to select all
lines after that until it hits either a new line OR
another name identifier.
My question:
How can I tell my program these 2 conditions:
1. mark the identifier I need and select all the lines after that identifier until it hits a new line or another name identifier.

Please enlighten me with your suggestions.

thank you.
kumar

_______________________________________________
Tutor maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/tutor
