On 22/01/2012 14:32, Yigit Turgut wrote:
Hi all,

I have a text file, approximately 20 MB in size, that contains about one
million lines. I was doing some processing on the data, but then the
data rate increased and it now takes a very long time to process. I import
it using numpy.loadtxt; here is a fragment of the data:

0.000006         -0.0004
0.000071         0.0028
0.000079         0.0044
0.000086         0.0104
.
.
.

The first column is the timestamp in seconds and the second column is the
data. The file contains 8 seconds of measurements, and I would like to be
able to split it into 3 parts separated at specific time
locations. For example, I want to divide the file into 3 parts: the first
part containing 3 seconds of data, the second containing 2 seconds of data
and the third containing 3 seconds. Splitting based on file size doesn't
work accurately for this data; some columns go
missing, etc. I need to split depending on the column content:

1 - read the file until the first character of column1 is 3 (3 seconds)
2 - save this region to another file
3 - read the file where the first character of column1 is between 3 and
5 (2 seconds)
4 - save this region to another file
5 - read the file where the first character of column1 is between 5 and
8 (3 seconds)
6 - save this region to another file

I need to do this exactly because numpy.loadtxt and genfromtxt don't
cope well with missing columns / rows. I even tried the invalid_raise
parameter of genfromtxt, but no luck.

I am sure it's a few lines of code for experienced users and I would
appreciate some guidance.

Here's a solution in Python 3:

input_path = "..."
section_1_path = "..."
section_2_path = "..."
section_3_path = "..."

with open(input_path) as input_file:
    try:
        line = next(input_file)

        # Copy section 1.
        with open(section_1_path, "w") as output_file:
            while line[0] < "3":
                output_file.write(line)
                line = next(input_file)

        # Copy section 2.
        with open(section_2_path, "w") as output_file:
            while line[0] < "5":
                output_file.write(line)
                line = next(input_file)

        # Copy section 3.
        with open(section_3_path, "w") as output_file:
            while True:
                output_file.write(line)
                line = next(input_file)
    except StopIteration:
        pass
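
If comparing the first character turns out to be too fragile (for
example when a line starts with whitespace, or a timestamp reaches 10
seconds), one alternative is to parse column 1 as a float. This is only
a sketch: split_by_time and its boundaries argument are names made up
for illustration, it assumes the timestamps never decrease, and it
skips lines whose first field isn't a number, which may or may not be
what you want.

```python
def split_by_time(lines, boundaries):
    """Group lines into len(boundaries) + 1 sections by the timestamp in column 1.

    Hypothetical helper: `boundaries` is an ascending list of split times
    in seconds, e.g. [3.0, 5.0] for the 3 s / 2 s / 3 s split.
    Assumes the timestamps in `lines` never decrease.
    """
    sections = [[] for _ in range(len(boundaries) + 1)]
    index = 0
    for line in lines:
        fields = line.split()
        try:
            timestamp = float(fields[0])
        except (IndexError, ValueError):
            # Assumption: blank or malformed lines are simply dropped.
            continue
        # Move on to the section this timestamp belongs to.
        while index < len(boundaries) and timestamp >= boundaries[index]:
            index += 1
        sections[index].append(line)
    return sections
```

You could pass the open input file as lines and then write each
returned list to its own file with writelines(). Note that this holds
all the lines in memory at once, which should be fine for a 20 MB file.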
--
http://mail.python.org/mailman/listinfo/python-list
