On Tue, 16 Feb 2010 23:48:17 -0000, Imaginationworks <xiaju...@gmail.com> wrote:

Hi,

I am trying to read object information from a text file (approx.
30,000 lines) with the following format; each line corresponds to a
line in the text file.  Currently, the whole file is read into a
list of strings using readlines(), and then a for loop searches for
"= {" and "};" to determine the Object, SubObject, and SubSubObject.
My questions are:

1) Is there an efficient way to search the whole string list to find
the locations of the tokens (such as '= {' or '};')?

The usual idiom is to process a line at a time, which avoids the memory overhead of reading the entire file in, creating the list, and so on. Assuming your input file is laid out as neatly as you said, that's straightforward to do:

for line in myfile:
    if "= {" in line:
        start_a_new_object(line)
    elif "};" in line:
        end_current_object(line)
    else:
        add_stuff_to_current_object(line)

You probably want more robust tests than I used there, but that depends on how well-defined your input file is. If it can be edited by hand, you'll need to be more defensive!
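
As a rough sketch of what "more defensive" might look like: the helpers below are hypothetical (classify and check_balance are names invented here, not anything from your code), and they assume that "= {" always ends its line and that "};" always sits on a line of its own. If your file doesn't follow those rules, the tests will need adjusting.

```python
def classify(line):
    """Return 'open', 'close', or 'other' for one line of input.

    Assumes (hypothetically) that an opening line ends with "= {" and
    that a closing line contains nothing but "};".
    """
    text = line.strip()          # ignore indentation and the trailing newline
    if text.endswith("= {"):
        return "open"
    if text == "};":
        return "close"
    return "other"


def check_balance(lines):
    """Raise ValueError if opening and closing tokens do not pair up."""
    depth = 0
    for number, line in enumerate(lines, start=1):
        kind = classify(line)
        if kind == "open":
            depth += 1
        elif kind == "close":
            depth -= 1
            if depth < 0:
                raise ValueError("line %d: '};' without a matching '= {'" % number)
    if depth != 0:
        raise ValueError("%d object(s) never closed" % depth)
```

Running check_balance over the file first is a cheap way to catch a hand-edited file that has lost a brace before you try to build anything from it.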

2) Is there an efficient way to extract the object information that
you would suggest?

That depends on what you mean by "extract the object information". If you mean "get the object name", just split the line at the "=" and strip off the whitespace you don't want. If you mean "track how objects are connected to one another", have each object keep a list of its immediate sub-objects (which will have lists of their immediate sub-objects, and so on); it's fairly easy to keep track of which objects are current by using a list as a stack. If you mean something else, sorry, but my crystal ball is cloudy tonight.
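
To make the stack idea concrete, here is a minimal sketch. It is only a guess at your format: parse_objects and the dictionary layout ("name", "lines", "children") are made up for illustration, and it assumes an opening line ends with "= {" and a closing line is exactly "};".

```python
def parse_objects(lines):
    """Build nested objects from lines, using a list as a stack.

    Each object is a dict holding its name, its own non-object lines,
    and a list of its immediate sub-objects.
    """
    root = {"name": None, "lines": [], "children": []}
    stack = [root]                       # stack[-1] is always the current object
    for line in lines:
        text = line.strip()
        if not text:
            continue                     # skip blank lines
        if text.endswith("= {"):
            # The object name is whatever sits to the left of the "=".
            name = text.split("=", 1)[0].strip()
            obj = {"name": name, "lines": [], "children": []}
            stack[-1]["children"].append(obj)
            stack.append(obj)            # the new object becomes current
        elif text == "};":
            if len(stack) == 1:
                raise ValueError("'};' without a matching '= {'")
            stack.pop()                  # back to the enclosing object
        else:
            stack[-1]["lines"].append(text)
    if len(stack) != 1:
        raise ValueError("unclosed object: %s" % stack[-1]["name"])
    return root["children"]
```

Since the function only iterates over its argument, you can hand it an open file object directly rather than the list from readlines(). The result is a list of top-level objects, each carrying its sub-objects in "children".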

--
Rhodri James *-* Wildebeeste Herder to the Masses
--
http://mail.python.org/mailman/listinfo/python-list
