On 22/01/2012 14:32, Yigit Turgut wrote:
Hi all,
I have a text file, approximately 20 MB in size, that contains about one
million lines. I was doing some processing on the data, but then the
data rate increased and it now takes a very long time to process. I import
it using numpy.loadtxt; here is a fragment of the data:
0.000006 -0.0004
0.000071 0.0028
0.000079 0.0044
0.000086 0.0104
.
.
.
The first column is the timestamp in seconds and the second column is the
data. The file contains 8 seconds of measurement, and I would like to be
able to split the file into 3 parts separated at specific time
locations. For example, I want to divide the file into 3 parts, the first
part containing 3 seconds of data, the second containing 2 seconds of data
and the third containing 3 seconds. Splitting based on file size doesn't
work accurately for this data; some columns go missing, etc.
I need to split depending on the column content:
1 - read file until first character of column1 is 3 (3 seconds)
2 - save this region to another file
3 - read the file where first characters of column1 are between 3 and
    5 (2 seconds)
4 - save this region to another file
5 - read the file where first characters of column1 are between 5 and
    8 (3 seconds)
6 - save this region to another file
I need to do this precisely because numpy.loadtxt and genfromtxt don't
cope well with missing columns / rows. I even tried the invalid_raise
parameter of genfromtxt, but no luck.
I am sure it's a few lines of code for experienced users and I would
appreciate some guidance.
Here's a solution in Python 3:
input_path = "..."
section_1_path = "..."
section_2_path = "..."
section_3_path = "..."

with open(input_path) as input_file:
    try:
        line = next(input_file)

        # Copy section 1: lines whose timestamp starts with 0, 1 or 2.
        with open(section_1_path, "w") as output_file:
            while line[0] < "3":
                output_file.write(line)
                line = next(input_file)

        # Copy section 2: lines whose timestamp starts with 3 or 4.
        with open(section_2_path, "w") as output_file:
            while line[0] < "5":
                output_file.write(line)
                line = next(input_file)

        # Copy section 3: everything that remains.
        with open(section_3_path, "w") as output_file:
            while True:
                output_file.write(line)
                line = next(input_file)
    except StopIteration:
        # next() raises StopIteration at end of file; nothing more to copy.
        pass
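
Comparing the first character only works while every timestamp is below
10 seconds and every line is well formed. If that is not guaranteed, a
variant that parses the timestamp as a number may be safer. The following
is only a sketch under some assumptions: the function name split_by_time
and its parameters are made up for illustration, timestamps are assumed to
be non-decreasing, and any line that does not consist of exactly two
numeric columns is silently skipped (which may or may not be what you want
for the damaged rows).

```python
def split_by_time(lines, outputs, boundaries):
    """Copy each valid line to the output for its time section.

    lines      -- iterable of text lines, e.g. an open file
    outputs    -- writable file objects, one per section
    boundaries -- ascending split times in seconds; needs
                  len(outputs) == len(boundaries) + 1
    Returns the number of lines written to each output.
    """
    if len(outputs) != len(boundaries) + 1:
        raise ValueError("need exactly one more output than boundaries")

    counts = [0] * len(outputs)
    index = 0  # section currently being written
    for line in lines:
        fields = line.split()
        # Assumption: a valid row has exactly two numeric columns.
        if len(fields) != 2:
            continue
        try:
            timestamp = float(fields[0])
            float(fields[1])
        except ValueError:
            continue
        # Move on to the next section once a boundary has been reached.
        while index < len(boundaries) and timestamp >= boundaries[index]:
            index += 1
        outputs[index].write(line)
        counts[index] += 1
    return counts
```

For the file described above, you would open the input file and the three
output files and call split_by_time(input_file, [out_1, out_2, out_3],
[3.0, 5.0]).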